Re: [R] problem with labeling plots, possibly in font defaults

2014-08-07 Thread Prof Brian Ripley
See 
http://cran.r-project.org/bin/macosx/RMacOSX-FAQ.html#I-see-no-text-in-a-Quartz-plot_0021


And the default font is not serif ... that FAQ says it is Arial, but I 
do not know if that is current.


Mac-specific questions to R-sig-mac please.  (This must be Mac-specific 
as the default device, quartz(), is.)


On 07/08/2014 23:24, David Winsemius wrote:


On Aug 7, 2014, at 12:59 PM, Tim Blass wrote:


Hello,

I am using R 3.1.1 on a (four year old) MacBook, running OSX 10.9.4.

I just tried making and labeling a plot as follows:


x<-rnorm(10)
y<-rnorm(10)
plot(x,y)
title(main="random points")


which produces a scatter plot of the random points, but without the title
and without any numbers along the axes. If I then run


par(family="sans")
plot(x,y,main="plot title")


the plot has the title and the numbers on the axes ('x' and 'y' also
appear as default labels for the axes).

I do not know what is going on, but maybe there is some problem in the
default font settings (I don't know if that could be an R issue or an issue
specific to my Mac)?


It hasn't happened to me recently (since updating from Leopard and Snow Leopard 
to Lion), but it used to happen pretty frequently that I would get a duplicate 
font that printed empty square boxes. (I wasn't the only one. It got reported 
several times on R-SIG-Mac.)  One could fix that sort of problem by deleting 
the offending duplicate entry using Font Book.app.

Since this is happening with the default serif font,  you would probably find 
the duplicate in Times. (It used to be happening to me with Symbol.)


quartzFonts()$serif

[1] "Times-Roman"  "Times-Bold"   "Times-Italic" "Times-BoldItalic"







This is clearly not a big problem (at least for now) since I can put labels
on plots by running par(), but if it is indicative of a larger underlying
problem (or if there is a simple fix) I would like to know.

Thank you!


David Winsemius
Alameda, CA, USA



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK        Fax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Applying Different Predictive Models over Different Geographic subsets of a RasterStack

2014-08-07 Thread Craig Aumann
Hi Mitchell,

It's a good suggestion, and I had thought of this, but one concern is that
the model fit at the terminal nodes would need to be a generalized linear
model (the predicted outcomes are factors).  Given that the models need to
be fit automatically, without human guidance, I'm not keen on using a GLM
with the types of predictor variables I have...

mobForest is a take-off on mob, but my initial testing indicated that it
took a VERY long time to fit the models.

Thanks!
Craig



On Thu, Aug 7, 2014 at 8:10 PM, Mitchell Maltenfort 
wrote:

> I don't know this particular package well, but I believe "party" contains
> something called "mob" which creates a regression tree terminating in
> different models at each node.  Could that be adapted to your project?
>
>
> On Thursday, August 7, 2014, Craig Aumann  wrote:
>
>> I'm struggling with the best way to apply different predictive models over
>> different geographical areas of a raster stack.
>>
>> The context of the problem is that different predictive models are
>> developed within different polygonal regions of the overall study area.
>>  Each model needs to be used to predict an outcome for just the geographic
>> area for which it was developed.  Every pixel has one and only one
>> predictive model, but the model changes across different regions of the
>> landscape.  The models come from a "random forest" fit.
>>
>> The problem is that the rasterstack is rather large both in terms of
>> number
>> of pixels and also the number of layers which the predictive model needs
>> to
>> use.  If the problem were smaller, there are a number of things I could
>> "get away with" in terms of how I would do this, but given the problem
>> size, I need a more cunning solution.
>>
>> Ideally, I would like to only call predict from the package Raster just
>> once, and have the predict function call the right model based on the
>> geographical location of the pixel.  However, it is not clear that this is
>> possible with the Raster Package, or if it is possible how to implement it
>> efficiently.
>>
>> Any ideas or suggestions greatly appreciated.
>>
>> Cheers!
>> Craig
>>
>>
>
>
> --
> 
> Ersatzistician and Chutzpahthologist
>
> I can answer any question.  "I don't know" is an answer. "I don't know
> yet" is a better answer.
>
> "I can write better than anybody who can write faster, and I can write
> faster than anybody who can write better" AJ Leibling
>
>



Re: [R] Applying Different Predictive Models over Different Geographic subsets of a RasterStack

2014-08-07 Thread Mitchell Maltenfort
I don't know this particular package well, but I believe "party" contains
something called "mob" which creates a regression tree terminating in
different models at each node.  Could that be adapted to your project?

On Thursday, August 7, 2014, Craig Aumann  wrote:

> I'm struggling with the best way to apply different predictive models over
> different geographical areas of a raster stack.
>
> The context of the problem is that different predictive models are
> developed within different polygonal regions of the overall study area.
>  Each model needs to be used to predict an outcome for just the geographic
> area for which it was developed.  Every pixel has one and only one
> predictive model, but the model changes across different regions of the
> landscape.  The models come from a "random forest" fit.
>
> The problem is that the rasterstack is rather large both in terms of number
> of pixels and also the number of layers which the predictive model needs to
> use.  If the problem were smaller, there are a number of things I could
> "get away with" in terms of how I would do this, but given the problem
> size, I need a more cunning solution.
>
> Ideally, I would like to only call predict from the package Raster just
> once, and have the predict function call the right model based on the
> geographical location of the pixel.  However, it is not clear that this is
> possible with the Raster Package, or if it is possible how to implement it
> efficiently.
>
> Any ideas or suggestions greatly appreciated.
>
> Cheers!
> Craig
>
>


-- 

Ersatzistician and Chutzpahthologist

I can answer any question.  "I don't know" is an answer. "I don't know yet"
is a better answer.

"I can write better than anybody who can write faster, and I can write
faster than anybody who can write better" AJ Leibling



[R] Applying Different Predictive Models over Different Geographic subsets of a RasterStack

2014-08-07 Thread Craig Aumann
I'm struggling with the best way to apply different predictive models over
different geographical areas of a raster stack.

The context of the problem is that different predictive models are
developed within different polygonal regions of the overall study area.
 Each model needs to be used to predict an outcome for just the geographic
area for which it was developed.  Every pixel has one and only one
predictive model, but the model changes across different regions of the
landscape.  The models come from a "random forest" fit.

The problem is that the rasterstack is rather large both in terms of number
of pixels and also the number of layers which the predictive model needs to
use.  If the problem were smaller, there are a number of things I could
"get away with" in terms of how I would do this, but given the problem
size, I need a more cunning solution.

Ideally, I would like to only call predict from the package Raster just
once, and have the predict function call the right model based on the
geographical location of the pixel.  However, it is not clear that this is
possible with the Raster Package, or if it is possible how to implement it
efficiently.
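[Editorial note: one possible structure for this, a sketch only with hypothetical object names ('covariates' for the RasterStack, 'regions' for a raster coding which model applies to each pixel, and 'models' for a list of fitted models keyed by region id), is to mask out each region, predict on it, and merge the pieces:]

```r
library(raster)

## Hypothetical inputs: 'covariates' is the RasterStack of predictors,
## 'regions' codes which model applies to each pixel, and 'models' is a
## list of fitted models named by region id.
predict_by_region <- function(covariates, regions, models) {
  out <- raster(covariates)                 # empty result, same geometry
  for (id in na.omit(unique(values(regions)))) {
    ## keep only the cells belonging to this region
    masked <- mask(covariates, regions, maskvalue = id, inverse = TRUE)
    p <- predict(masked, models[[as.character(id)]])
    out <- cover(out, p)                    # paste this region's predictions in
  }
  out
}
```

This still calls predict() once per region rather than once overall, but no pixel is ever predicted with the wrong model.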

Any ideas or suggestions greatly appreciated.

Cheers!
Craig



Re: [R] Output from file.info()$mtime

2014-08-07 Thread Sarah Goslee
Are you aware you have so-called smart quotes in your R code? That can
cause all sorts of interesting errors, though I don't get the exact
one you report.

Sarah

On Thu, Aug 7, 2014 at 6:37 PM, Fisher Dennis  wrote:
> R 3.1.1
> OS X (and Windows)
>
> Colleagues
>
> I have some code that manages files.  Previously (as late as 3.1.0), the 
> command:
> file.info(FILENAME)$mtime == “”
> yielded T/F
>
> Now, it triggers an error:
> Error in as.POSIXlt.character(x, tz, ...) :
>   character string is not in a standard unambiguous format
>
> I looked through Peter Dalgaard’s list of changes in 3.1.1 and I cannot find 
> anything that would explain the change between versions.  I have fixed the 
> problem.  However, I am concerned that other problems may be lurking (i.e., 
> the changes might affect other commands).
>
> Of note, I ran:
> str(file.info(FILENAME)$mtime)
> in both versions of R and the results did not differ
>
> Can anyone explain what changed so that I can search my code efficiently?
>
> Thanks in advance.
>
> Dennis
>
>
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone: 1-866-PLessThan (1-866-753-7784)
> Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com
>

-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] Output from file.info()$mtime

2014-08-07 Thread Fisher Dennis
R 3.1.1
OS X (and Windows)

Colleagues

I have some code that manages files.  Previously (as late as 3.1.0), the 
command:
file.info(FILENAME)$mtime == “”
yielded T/F

Now, it triggers an error:
Error in as.POSIXlt.character(x, tz, ...) : 
  character string is not in a standard unambiguous format

I looked through Peter Dalgaard’s list of changes in 3.1.1 and I cannot find 
anything that would explain the change between versions.  I have fixed the 
problem.  However, I am concerned that other problems may be lurking (i.e., the 
changes might affect other commands).

Of note, I ran:
str(file.info(FILENAME)$mtime)
in both versions of R and the results did not differ

Can anyone explain what changed so that I can search my code efficiently?  
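[Editorial note: the error is reproducible in base R by comparing a POSIXct value to a character string that as.POSIXlt() cannot parse, which is what `== ""` triggers; testing for NA avoids the coercion entirely. tempfile() below is just a path that does not exist on disk:]

```r
m <- file.info(tempfile())$mtime   # NA POSIXct: the file does not exist
class(m)                           # "POSIXct" "POSIXt"

## m == "" would coerce "" via as.POSIXlt.character() and fail with
##   "character string is not in a standard unambiguous format"

## Testing for a missing timestamp directly avoids the coercion:
is.na(m)                           # TRUE
```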

Thanks in advance.

Dennis


Dennis Fisher MD
P < (The "P Less Than" Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-866-PLessThan (1-866-753-7784)
www.PLessThan.com



Re: [R] problem with labeling plots, possibly in font defaults

2014-08-07 Thread David Winsemius

On Aug 7, 2014, at 12:59 PM, Tim Blass wrote:

> Hello,
> 
> I am using R 3.1.1 on a (four year old) MacBook, running OSX 10.9.4.
> 
> I just tried making and labeling a plot as follows:
> 
>> x<-rnorm(10)
>> y<-rnorm(10)
>> plot(x,y)
>> title(main="random points")
> 
> which produces a scatter plot of the random points, but without the title
> and without any numbers along the axes. If I then run
> 
>> par(family="sans")
>> plot(x,y,main="plot title")
> 
> the plot has the title and the numbers on the axes ('x' and 'y' also
> appear as default labels for the axes).
> 
> I do not know what is going on, but maybe there is some problem in the
> default font settings (I don't know if that could be an R issue or an issue
> specific to my Mac)?

It hasn't happened to me recently (since updating from Leopard and Snow Leopard 
to Lion), but it used to happen pretty frequently that I would get a duplicate 
font that printed empty square boxes. (I wasn't the only one. It got reported 
several times on R-SIG-Mac.)  One could fix that sort of problem by deleting 
the offending duplicate entry using Font Book.app.

Since this is happening with the default serif font,  you would probably find 
the duplicate in Times. (It used to be happening to me with Symbol.)

> quartzFonts()$serif
[1] "Times-Roman"  "Times-Bold"   "Times-Italic" "Times-BoldItalic"
>
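[Editorial note: if a damaged Times entry is the cause, remapping the quartz serif family to another face is a possible stopgap (a sketch, macOS only; the PostScript-style face names below are assumptions):]

```r
## Inspect the current mapping (macOS quartz device only)
quartzFonts()$serif

## Remap serif to Helvetica until the duplicate Times entry is removed
## in Font Book.app; the four names are regular/bold/italic/bold-italic
quartzFonts(serif = quartzFont(c("Helvetica", "Helvetica-Bold",
                                 "Helvetica-Oblique",
                                 "Helvetica-BoldOblique")))
```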


> 
> This is clearly not a big problem (at least for now) since I can put labels
> on plots by running par(), but if it is indicative of a larger underlying
> problem (or if there is a simple fix) I would like to know.
> 
> Thank you!

David Winsemius
Alameda, CA, USA



Re: [R] Legend in ggplot2

2014-08-07 Thread Federico Lasa
The problem is that you are not actually 'mapping' any variables to
the fill and colour aesthetics, so ggplot won't produce legends for
them. I'm not sure ggplots are appropriate for what you're trying to
do here, but you can certainly hack around it a bit; for instance, try:

ggplot(tabu, aes(x=weeks, y=T))+
  scale_y_continuous(expand=c(0,0),
    minor_breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)),
                     round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)),
                     1),
    breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)),
               round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)),
               2))+
  scale_x_discrete(expand=c(0,0),
    breaks=seq(min(tabu$weeks), max(tabu$weeks)))+
  geom_bar(data=tabu, aes(y=Tupper, fill="Tupper"), stat="identity")+
  geom_point(aes(y=cusums, colour="Cusum"), size=4, pch=15)+
  geom_bar(data=tabu, aes(y=Tlower, fill="Tlower"), stat="identity")+
  geom_hline(aes(yintercept=0), colour="gray20", size=1)+
  geom_hline(aes(yintercept=5), colour="darkorchid4", size=2, alpha=1/2)+
  geom_hline(aes(yintercept=-5), colour="darkorchid4", size=2, alpha=1/2)+
  geom_hline(aes(yintercept=0.5), colour="gold2", size=2, alpha=1/1.3)+
  geom_hline(aes(yintercept=-0.5), colour="gold2", size=2, alpha=1/1.3)+
  scale_fill_manual(name="Legend",
                    breaks=c("Tupper","Tlower"),
                    values=c("brown3","darkolivegreen4"),
                    labels=c("T","L"))+
  scale_colour_manual(name="Legend",
                      breaks=c("Cusum"),
                      values=c("dodgerblue1"),
                      labels=c("Cusum"))

and fill in the blanks.

P.S. your plot is very strange.


On Thu, Aug 7, 2014 at 9:07 AM, Pavneet Arora
 wrote:
> Hi All
>
> Following is my dataset.
> dput(tabu)
> structure(list(weeks = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
> 29, 30), values = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04,
> 11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62,
> 10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62,
> 11.31, 10.52), deviation = c(-0.551, -2.01,
> -0.711,
> 1.66, 2.16, 0.18, -1.96, 1.46, -0.801, 0.34,
> -0.971,
> 1.47, 0.51, -0.6, 0.0801, -0.631,
> 0.619,
> 0.31, 0, 3, 0.9, -0.67, 2.29, 1.5, 0.6, 1.08, 0.381,
> 1.62, 1.31, 0.52), cusums = c(-0.551, -2.56, -3.27,
> -1.61, 0.549, 0.729, -1.23, 0.229,
> -0.572, -0.232, -1.2, 0.268,
> 0.778, 0.178, 0.258,
> -0.373,
> 0.246, 0.557, 0.557, 3.56,
> 4.46, 3.79, 6.08, 7.58, 8.18, 9.26, 9.64, 11.26, 12.57, 13.09
> ), Tupper = c(0, 0, 0, 1.16, 2.82, 2.5, 0.0391, 1,
> 0, 0, 0, 0.971, 0.98, 0, 0, 0, 0.119,
> 0, 0, 2.5, 2.9, 1.73, 3.52, 4.52, 4.62, 5.2, 5.08, 6.2, 7.01,
> 7.03), Tlower = c(-0.0507, -1.56, -1.77, 0, 0, 0,
> -1.46, 0, -0.301, 0, -0.471, 0, 0,
> -0.0996,
> 0, -0.131, 0, 0, 0, 0, 0, -0.17, 0, 0, 0, 0, 0, 0,
> 0, 0)), .Names = c("weeks", "values", "deviation", "cusums",
> "Tupper", "Tlower"), row.names = c(NA, -30L), class = "data.frame")
>
> I have created a plot using ggplot, whereby it makes both barchart and
> plots points, like following:
> ggplot(tabu,aes(x=week,
>  ymin=min(tabu$cusums,tabu$Tupper,tabu$Tlower),
>  ymax=max(tabu$cusums,tabu$Tupper,tabu$Tlower)))+
>   labs(x=NULL,y=NULL)+
>   scale_y_continuous(expand=c(0,0),
>  minor_breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)),
>  round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)),
>   1),
>  breaks=seq(round(min(tabu$cusums,tabu$Tupper,tabu$Tlower)),
>  round(max(tabu$cusums,tabu$Tupper,tabu$Tlower)),
> 2))+
>   scale_x_discrete(expand=c(0,0),
>breaks=seq(min(tabu$week),
>   max(tabu$week)))+
>   geom_bar(aes(y=tabu$Tupper),stat="identity",fill="brown3")+
>   geom_bar(aes(y=tabu$Tlower),stat="identity",fill="darkolivegreen4")+
>   geom_point(aes(y=tabu$cusums),size=4,pch=15,colour="dodgerblue1")+
>   geom_hline(aes(yintercept=0),colour="gray20",size=1)+ #geom_hline -
> draws a reference line at 0
>   geom_hline(aes(yintercept=5),colour="darkorchid4",size=2,alpha=1/2)+
> #Out-Of-Signal Lines
>   geom_hline(aes(yintercept=-5),colour="darkorchid4",size=2,alpha=1/2)+
> #Out-Of-Signal Lines
>   geom_hline(aes(yintercept=0.5),colour="gold2",size=2,alpha=1/1.3)+ #K
>   geom_hline(aes(yintercept=-0.5),colour="gold2",size=2,alpha=1/1.3)+ #K
>   scale_color_manual(name="Legend",
>  breaks=c("Tupper","Tlowe

[R] problem with labeling plots, possibly in font defaults

2014-08-07 Thread Tim Blass
Hello,

I am using R 3.1.1 on a (four year old) MacBook, running OSX 10.9.4.

I just tried making and labeling a plot as follows:

> x<-rnorm(10)
> y<-rnorm(10)
> plot(x,y)
> title(main="random points")

which produces a scatter plot of the random points, but without the title
and without any numbers along the axes. If I then run

> par(family="sans")
> plot(x,y,main="plot title")

the plot has the title and the numbers on the axes ('x' and 'y' also
appear as default labels for the axes).

I do not know what is going on, but maybe there is some problem in the
default font settings (I don't know if that could be an R issue or an issue
specific to my Mac)?

This is clearly not a big problem (at least for now) since I can put labels
on plots by running par(), but if it is indicative of a larger underlying
problem (or if there is a simple fix) I would like to know.

Thank you!

tb



[R] Issues in using GP_fit() of GPfit package

2014-08-07 Thread Jason Donnald
Hi,

I am having some issues using GP_fit() from the GPfit package: each time I
call the function, it hangs. I have a data frame (matrix) with 2 columns
and about 2000 rows, whose values I first normalize to the range [0,1]. I
also have an output status vector with 1 column and the same 2000 rows,
taking values of either 0 or 1. Whenever I pass these two to GP_fit(), as
in GP_fit(mat_norm, mat_status), the R console hangs. Is it because of the
2000 rows? I had assumed that 2000 is a very small number and that it
should not hang on so few rows when real data can have millions of rows.
Any idea what might be wrong here?
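[Editorial note: a likely factor, though not a certainty: GP_fit() repeatedly factorizes an n-by-n correlation matrix during optimization, so its cost grows roughly like n^3, and 2000 rows can already take a very long time. Timing increasing subsets (the sizes below are illustrative) shows the trend before committing to the full data. Note also that a GP regression assumes a continuous response, so a 0/1 status vector may be better served by a classifier.]

```r
library(GPfit)

## Time GP_fit() on growing subsets to see how cost scales with n
set.seed(1)
for (n in c(50, 100, 200)) {
  X <- matrix(runif(2 * n), ncol = 2)            # inputs already in [0,1]
  y <- sin(2 * pi * X[, 1]) + rnorm(n, sd = 0.1) # toy continuous response
  print(system.time(GP_fit(X, y)))
}
```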

Thanks in advance.

Regards,
Jason



[R] Dose response glmer

2014-08-07 Thread Marcelo Laia
I am trying to fit a dose-response model to my dataset, but I am not
getting anywhere. I am adapting a script shared on the web, but I am unable
to make it work for my dataset. I would like to get the LC50 for each
Isolado and to know whether there are differences between them.
 
My data is https://dl.dropboxusercontent.com/u/34009642/R/dead_alive.csv

Here what I copy and try to modifying:

library(plyr)
library(lattice)
library(lme4)
library(arm)
library(lmerTest)
library(faraway)
library(car)

## Conc is concentration. I enter only the exponent; all values
## except 0 (my control, without Isolado) are base
## 10, i.e. 10^4, 10^6 and 10^8.

data <- read.table("dead_alive.csv", sep = "\t", dec=",", header = TRUE)

data$Rep <- factor(data$Rep)

mean_data <- ddply(data, c("Isolado", "Conc", "Day"), numcolwise(mean))

xyplot(Dead/(Dead + Live) ~ Conc|Isolado, groups = Day, type = "l",
ylab='Probability', xlab='Dose', data = mean_data)

xyplot(Dead/(Dead + Live) ~ Day|Isolado, groups = Conc, type = "l",
ylab='Probability', xlab='Dose', data = mean_data)

model.logit <- glmer(cbind(Dead, Live) ~ -1 + Isolado + Isolado:Conc +
(0 + Conc|Day), family=binomial, data = data)

Anova(model.logit)
summary(model.logit)

model.probit <- glmer(cbind(Dead, Live) ~  Isolado + Isolado:Conc + (0
+ Conc|Day), family=binomial(link=probit), data=data)

model.cloglog <- glm(cbind(Dead, Live) ~ Isolado + Isolado:Conc + (1 +
Conc|Day), family=binomial(link=cloglog), data=data)

x <- seq(0,8, by=0.2)

prob.logit <- ilogit(model.logit$coef[1] + model.logit$coef[2]*x)
prob.probit <- pnorm(model.probit$coef[1] + model.probit$coef[2]*x)
prob.cloglog <-  1-exp(-exp((model.cloglog$coef[1] +
model.cloglog$coef[2]*x)))

with(subdata, plot(Dead/(Dead + Live) ~ Conc, group = Day))

lines(x, prob.logit) # solid curve = logit
lines(x, prob.probit, lty=2) # dashed = probit
lines(x, prob.cloglog, lty=5) # longdash = c-log-log

plot(x, prob.logit, type='l', ylab='Probability', xlab='Dose') # solid
curve = logit
lines(x, prob.probit, lty=2) # dashed = probit
lines(x, prob.cloglog, lty=5) # longdash = c-log-log
matplot(x, cbind(prob.probit/prob.logit,
(1-prob.probit)/(1-prob.logit)), type='l', xlab='Dose', ylab='Ratio')
matplot(x, cbind(prob.cloglog/prob.logit,
(1-prob.cloglog)/(1-prob.logit)), type='l', xlab='Dose', ylab='Ratio')

model.logit.data <- glm(cbind(Dead,Live) ~ Conc, family=binomial,
data=data)
pred2.5 <- predict(model.logit.data, newdata=data.frame(Conc=2.5), se=T)
ilogit(pred2.5$fit)
ilogit(c(pred2.5$fit - 1.96*pred2.5$se.fit, pred2.5$fit +
1.96*pred2.5$se.fit))
## What is this 1.96? Where does it come from?

### If there are several predictors, just put in the code
### above something like:
### newdata=data.frame(conc=2.5,x2=4.6,x3=5.8)
### or whatever is the desired set of predictor values...


### Effective Dose calculation:
# What is the concentration that yields a probability of 0.5 of an
# insect dying?

library(MASS)
dose.p(model.logit.data, p=0.5)

# A 95% CI for the ED50:

c(2 - 1.96*0.1466921, 2 + 1.96*0.1466921)

# What is the concentration that yields a probability of 0.8 of an
# insect dying?

dose.p(model.logit.data, p=0.8)
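[Editorial note on the 1.96 in the script: it is the standard normal 97.5% quantile, qnorm(0.975), so estimate plus or minus 1.96 times the standard error gives an approximate 95% confidence interval. A self-contained sketch on simulated data (not the poster's dataset):]

```r
library(MASS)

qnorm(0.975)   # approximately 1.96: the source of the magic number

## Simulated dose-response data, then an LD50 with a 95% CI
set.seed(1)
conc <- rep(0:4, each = 20)
dead <- rbinom(length(conc), size = 10, prob = plogis(-2 + conc))
fit  <- glm(cbind(dead, 10 - dead) ~ conc, family = binomial)
ld50 <- dose.p(fit, p = 0.5)
se   <- as.vector(attr(ld50, "SE"))
as.vector(ld50) + c(-1, 1) * qnorm(0.975) * se   # 95% CI for the LD50
```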
 
-- 
Laia, ML



Re: [R] big data?

2014-08-07 Thread Spencer Graves
correcting a typo (400 MB, not GB; thanks to David Winsemius for 
reporting it).  Spencer



###


  Thanks to all who replied.  For the record, I will summarize here 
what I tried and what I learned:



  Mike Harwood suggested the ff package.  David Winsemius suggested 
data.table and colbycol.  Peter Langfelder suggested sqldf.



  sqldf::read.csv.sql allowed me to create an SQL command to read a 
column or a subset of the rows of a 400 MB tab-delimited file in roughly 
a minute on a 2.3 GHz dual core machine running Windows 7 with 8 GB RAM. 
 It also read a column of a 1.3 GB file in 4 minutes.  The 
documentation was sufficient to allow me to easily get what I wanted 
with a minimum of effort.
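[Editorial note: the pattern described above looks like the following sketch (file and column names here are hypothetical); the SQL runs in SQLite, so only the selected rows and columns ever reach R:]

```r
library(sqldf)

## Pull two columns and a row subset from a large tab-delimited file;
## the table is referred to as 'file' inside the SQL statement
dat <- read.csv.sql("big_file.tsv",
                    sql = "select col1, col2 from file where col3 = 'A'",
                    header = TRUE, sep = "\t")
```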



  If I needed to work with these data regularly, I might experiment 
with colbycol and ff:  The documentation suggested to me that these 
packages might allow me to get quicker answers from routine tasks after 
some preprocessing.  Of course, I could also do the preprocessing 
manually with sqldf.



  Thanks, again.
  Spencer


On 8/6/2014 9:39 AM, Mike Harwood wrote:

The read.table.ffdf function in the ff package can read in delimited files
and store them to disk as individual columns.  The ffbase package provides
additional data management and analytic functionality.  I have used these
packages on 15 Gb files of 18 million rows and 250 columns.
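[Editorial note: as a sketch of that workflow (the file name is hypothetical), the call mirrors read.table() but keeps the columns on disk as memory-mapped ff vectors:]

```r
library(ff)

## Columns are stored on disk rather than in RAM, so files larger
## than memory can still be read and summarized
fdf <- read.table.ffdf(file = "big_file.tsv", header = TRUE, sep = "\t")
dim(fdf)
```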


On Tuesday, August 5, 2014 1:39:03 PM UTC-5, David Winsemius wrote:


On Aug 5, 2014, at 10:20 AM, Spencer Graves wrote:


  What tools do you like for working with tab delimited text files up
to 1.5 GB (under Windows 7 with 8 GB RAM)?

?data.table::fread

  Standard tools for smaller data sometimes grab all the available
RAM, after which CPU usage drops to 3% ;-)

  The "bigmemory" project won the 2010 John Chambers Award but "is
not available (for R version 3.1.0)".

  findFn("big data", 999) downloaded 961 links in 437 packages. That
contains tools for data in PostgreSQL and other formats, but I couldn't find
anything for large tab delimited text files.

  Absent a better idea, I plan to write a function getField to
extract a specific field from the data, then use that to split the data
into 4 smaller files, which I think should be small enough that I can do
what I want.

There is the colbycol package with which I have no experience, but I
understand it is designed to partition data into column sized objects.
#--- from its help file ---
cbc.get.col {colbycol}    R Documentation
Reads a single column from the original file into memory

Description

Function cbc.read.table reads a file, stores it column by column in a disk
file and creates a colbycol object. Function cbc.get.col queries this object
and returns a single column.


  Thanks,
  Spencer


David Winsemius
Alameda, CA, USA






Re: [R] big data?

2014-08-07 Thread Spencer Graves
  Thanks to all who replied.  For the record, I will summarize here 
what I tried and what I learned:



  Mike Harwood suggested the ff package.  David Winsemius suggested 
data.table and colbycol.  Peter Langfelder suggested sqldf.



  sqldf::read.csv.sql allowed me to create an SQL command to read a 
column or a subset of the rows of a 400 GB tab-delimited file in roughly 
a minute on a 2.3 GHz dual core machine running Windows 7 with 8 GB 
RAM.  It also read a column of a 1.3 GB file in 4 minutes.  The 
documentation was sufficient to allow me to easily get what I wanted 
with a minimum of effort.



  If I needed to work with these data regularly, I might experiment 
with colbycol and ff:  The documentation suggested to me that these 
packages might allow me to get quicker answers from routine tasks after 
some preprocessing.  Of course, I could also do the preprocessing 
manually with sqldf.



  Thanks, again.
  Spencer


On 8/6/2014 9:39 AM, Mike Harwood wrote:

The read.table.ffdf function in the ff package can read in delimited files
and store them to disk as individual columns.  The ffbase package provides
additional data management and analytic functionality.  I have used these
packages on 15 Gb files of 18 million rows and 250 columns.


On Tuesday, August 5, 2014 1:39:03 PM UTC-5, David Winsemius wrote:


On Aug 5, 2014, at 10:20 AM, Spencer Graves wrote:


  What tools do you like for working with tab delimited text files up
to 1.5 GB (under Windows 7 with 8 GB RAM)?

?data.table::fread

  Standard tools for smaller data sometimes grab all the available
RAM, after which CPU usage drops to 3% ;-)

  The "bigmemory" project won the 2010 John Chambers Award but "is
not available (for R version 3.1.0)".

  findFn("big data", 999) downloaded 961 links in 437 packages. That
contains tools for data in PostgreSQL and other formats, but I couldn't find
anything for large tab delimited text files.

  Absent a better idea, I plan to write a function getField to
extract a specific field from the data, then use that to split the data
into 4 smaller files, which I think should be small enough that I can do
what I want.

There is the colbycol package with which I have no experience, but I
understand it is designed to partition data into column sized objects.
#--- from its help file ---
cbc.get.col {colbycol}    R Documentation
Reads a single column from the original file into memory

Description

Function cbc.read.table reads a file, stores it column by column in a disk
file and creates a colbycol object. Function cbc.get.col queries this object
and returns a single column.


  Thanks,
  Spencer


David Winsemius
Alameda, CA, USA





--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com



[R] K-nearest neighbor

2014-08-07 Thread Robert U
Dear R-users,

I am looking for a weighted knn-search function, but I cannot find one.
There are several weighted knn classifiers, but I would rather use a simple
'search function' (such as get.knnx). Does anyone know of a search function
with a "weight" option?

Thanks
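[Editorial note: one workaround, if "weighted" means weighting the input dimensions: FNN::get.knnx() has no weight argument, but scaling each column by the square root of its weight makes plain Euclidean distance equal the weighted distance, so an ordinary search then returns the weighted neighbours (a sketch; the weights below are illustrative):]

```r
library(FNN)

set.seed(1)
train <- matrix(rnorm(200), ncol = 2)
query <- matrix(rnorm(10),  ncol = 2)
w  <- c(2, 0.5)                 # per-dimension weights (illustrative)
sw <- sqrt(w)

## sum(w * (x - y)^2) == sum((sqrt(w)*x - sqrt(w)*y)^2), so scaling the
## columns reduces a weighted knn search to an ordinary one
nn <- get.knnx(sweep(train, 2, sw, `*`),
               sweep(query, 2, sw, `*`), k = 3)
nn$nn.index                     # indices of the 3 nearest rows of train
```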


Re: [R] ask for help

2014-08-07 Thread William Dunlap
I prefer the idiom
  c(TRUE, a[-1] != a[-length(a)])
because it works for character and other data types as well.

I also find that thinking in terms of runs instead of subscripting
tricks is easier.
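For example, the same idiom applied to a character vector (a quick sketch):

```r
a <- c("b", "b", "a", "a", "c")
firstInRun <- c(TRUE, a[-1] != a[-length(a)])  # TRUE at the start of each run
which(firstInRun)
# [1] 1 3 5
```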



Re: [R] ask for help

2014-08-07 Thread Bart Kastermans
Better:
b <- c(a[1]-1,a[-length(a)])

On 07 Aug 2014, at 17:28, Bart Kastermans  wrote:

> For readability I like:
> 
>> b <- c(0,a[-length(a)])
>> which(a != b & a == 0)
> [1]  4 12 18
>> which(a != b & a == 1)
> [1]  1  6 16 23
> 
> 
> On 07 Aug 2014, at 17:23, William Dunlap  wrote:
> 
>> My solution may be a bit clearer if you define the function isFirstInRun
>> isFirstInRun <- function(x) {
>>  if (length(x) == 0) {
>> logical(0)
>>  } else {
>> c(TRUE, x[-1] != x[-length(x)])
>>  }
>> }
>> 
>> Then that solution is equivalent to
>>  which(isFirstInRun(a) & a==1)
>> 
>> If 'a' contains NA's then you have to decide how to deal with them.
>> 
>> (The call to 'which' is not needed if you are going to be using the
>> result as a subscript.)
>> 
>> You may also want isLastInRun
>> isLastInRun <- function(x) {
>>  if (length(x) == 0) {
>> logical(0)
>>  } else {
>> c(x[-1] != x[-length(x)], TRUE)
>>  }
>> }
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>> 
>> 
>> On Thu, Aug 7, 2014 at 7:36 AM, William Dunlap  wrote:
>>>> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1)
>>>> which( a==1 & c(TRUE, a[-length(a)]!=1) )
>>> [1]  1  6 16 23
>>>> which( a==0 & c(TRUE, a[-length(a)]!=0) )
>>> [1]  4 12 18
>>> 
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>> 
>>> 
>>> On Wed, Aug 6, 2014 at 7:12 PM, Johnnycz  wrote:
>>>> Hello,everybody,
>>>> I have a sequence,like 
>>>> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how
>>>> to get the position of each first 1 and 0, that's to say, how to get 
>>>> b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0.
>>>> Many thanks!
>>>> Johnny
>>>> [[alternative HTML version deleted]]
>>>>
>>>> __
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 



Re: [R] ask for help

2014-08-07 Thread Bart Kastermans
For readability I like:

> b <- c(0,a[-length(a)])
> which(a != b & a == 0)
[1]  4 12 18
> which(a != b & a == 1)
[1]  1  6 16 23


On 07 Aug 2014, at 17:23, William Dunlap  wrote:

> My solution may be a bit clearer if you define the function isFirstInRun
> isFirstInRun <- function(x) {
>   if (length(x) == 0) {
>  logical(0)
>   } else {
>  c(TRUE, x[-1] != x[-length(x)])
>   }
> }
> 
> Then that solution is equivalent to
>   which(isFirstInRun(a) & a==1)
> 
> If 'a' contains NA's then you have to decide how to deal with them.
> 
> (The call to 'which' is not needed if you are going to be using the
> result as a subscript.)
> 
> You may also want isLastInRun
> isLastInRun <- function(x) {
>   if (length(x) == 0) {
>  logical(0)
>   } else {
>  c(x[-1] != x[-length(x)], TRUE)
>   }
> }
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> 
> On Thu, Aug 7, 2014 at 7:36 AM, William Dunlap  wrote:
>>> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1)
>>> which( a==1 & c(TRUE, a[-length(a)]!=1) )
>> [1]  1  6 16 23
>>> which( a==0 & c(TRUE, a[-length(a)]!=0) )
>> [1]  4 12 18
>> 
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>> 
>> 
>> On Wed, Aug 6, 2014 at 7:12 PM, Johnnycz  wrote:
>>> Hello,everybody,
>>> I have a sequence,like 
>>> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how 
>>> to get the position of each first 1 and 0, that's to say, how to get 
>>> b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0.
>>>Many thanks!
>>> Johnny
>>>[[alternative HTML version deleted]]
>>> 
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] ask for help

2014-08-07 Thread William Dunlap
My solution may be a bit clearer if you define the function isFirstInRun
isFirstInRun <- function(x) {
   if (length(x) == 0) {
  logical(0)
   } else {
  c(TRUE, x[-1] != x[-length(x)])
   }
}

Then that solution is equivalent to
   which(isFirstInRun(a) & a==1)

If 'a' contains NA's then you have to decide how to deal with them.

(The call to 'which' is not needed if you are going to be using the
result as a subscript.)

You may also want isLastInRun
isLastInRun <- function(x) {
   if (length(x) == 0) {
  logical(0)
   } else {
  c(x[-1] != x[-length(x)], TRUE)
   }
}
Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, Aug 7, 2014 at 7:36 AM, William Dunlap  wrote:
>> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1)
>> which( a==1 & c(TRUE, a[-length(a)]!=1) )
> [1]  1  6 16 23
>> which( a==0 & c(TRUE, a[-length(a)]!=0) )
> [1]  4 12 18
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Wed, Aug 6, 2014 at 7:12 PM, Johnnycz  wrote:
>> Hello,everybody,
>>  I have a sequence,like 
>> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how 
>> to get the position of each first 1 and 0, that's to say, how to get 
>> b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0.
>> Many thanks!
>> Johnny
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



Re: [R] ask for help

2014-08-07 Thread peter dalgaard

On 07 Aug 2014, at 11:16 , jim holtman  wrote:

> rle

...with a little tinkering, like

> m <- c(1,cumsum(rle(a)$lengths)+1)
> m
[1]  1  4  6 12 16 18 23 34

then look at every 2nd element, discarding the last.
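Spelled out on the example vector from the original post (a sketch; since a
starts with 1, the odd positions of m are the 1-run starts):

```r
a <- c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1)
m <- c(1, cumsum(rle(a)$lengths) + 1)
m <- m[-length(m)]                # discard the last (one past the end)
m[seq(1, length(m), by = 2)]      # starts of the 1-runs: 1 6 16 23
m[seq(2, length(m), by = 2)]      # starts of the 0-runs: 4 12 18
```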

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] ask for help

2014-08-07 Thread William Dunlap
> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1)
> which( a==1 & c(TRUE, a[-length(a)]!=1) )
[1]  1  6 16 23
> which( a==0 & c(TRUE, a[-length(a)]!=0) )
[1]  4 12 18

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Aug 6, 2014 at 7:12 PM, Johnnycz  wrote:
> Hello,everybody,
>  I have a sequence,like 
> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how 
> to get the position of each first 1 and 0, that's to say, how to get 
> b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0.
> Many thanks!
> Johnny
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] Legend in ggplot2

2014-08-07 Thread Pavneet Arora
Hi All

Following is my dataset.
dput(tabu)
structure(list(weeks = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30), values = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04,
11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62,
10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62,
11.31, 10.52), deviation = c(-0.551, -2.01, -0.711, 1.66, 2.16,
0.18, -1.96, 1.46, -0.801, 0.34, -0.971, 1.47, 0.51, -0.6, 0.0801,
-0.631, 0.619, 0.31, 0, 3, 0.9, -0.67, 2.29, 1.5, 0.6, 1.08,
0.381, 1.62, 1.31, 0.52), cusums = c(-0.551, -2.56, -3.27, -1.61,
0.549, 0.729, -1.23, 0.229, -0.572, -0.232, -1.2, 0.268, 0.778,
0.178, 0.258, -0.373, 0.246, 0.557, 0.557, 3.56, 4.46, 3.79,
6.08, 7.58, 8.18, 9.26, 9.64, 11.26, 12.57, 13.09), Tupper = c(0,
0, 0, 1.16, 2.82, 2.5, 0.0391, 1, 0, 0, 0, 0.971, 0.98, 0, 0,
0, 0.119, 0, 0, 2.5, 2.9, 1.73, 3.52, 4.52, 4.62, 5.2, 5.08,
6.2, 7.01, 7.03), Tlower = c(-0.0507, -1.56, -1.77, 0, 0, 0,
-1.46, 0, -0.301, 0, -0.471, 0, 0, -0.0996, 0, -0.131, 0, 0,
0, 0, 0, -0.17, 0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("weeks",
"values", "deviation", "cusums", "Tupper", "Tlower"), row.names = c(NA,
-30L), class = "data.frame")

I have created a plot using ggplot that draws both a bar chart and 
points, like the following:
ggplot(tabu, aes(x=weeks,
                 ymin=min(tabu$cusums, tabu$Tupper, tabu$Tlower),
                 ymax=max(tabu$cusums, tabu$Tupper, tabu$Tlower))) +
  labs(x=NULL, y=NULL) +
  scale_y_continuous(expand=c(0,0),
                     minor_breaks=seq(round(min(tabu$cusums, tabu$Tupper, tabu$Tlower)),
                                      round(max(tabu$cusums, tabu$Tupper, tabu$Tlower)),
                                      1),
                     breaks=seq(round(min(tabu$cusums, tabu$Tupper, tabu$Tlower)),
                                round(max(tabu$cusums, tabu$Tupper, tabu$Tlower)),
                                2)) +
  scale_x_discrete(expand=c(0,0),
                   breaks=seq(min(tabu$weeks), max(tabu$weeks))) +
  geom_bar(aes(y=Tupper), stat="identity", fill="brown3") +
  geom_bar(aes(y=Tlower), stat="identity", fill="darkolivegreen4") +
  geom_point(aes(y=cusums), size=4, pch=15, colour="dodgerblue1") +
  geom_hline(aes(yintercept=0), colour="gray20", size=1) +                  # reference line at 0
  geom_hline(aes(yintercept=5), colour="darkorchid4", size=2, alpha=1/2) +  # out-of-signal line
  geom_hline(aes(yintercept=-5), colour="darkorchid4", size=2, alpha=1/2) + # out-of-signal line
  geom_hline(aes(yintercept=0.5), colour="gold2", size=2, alpha=1/1.3) +    # K
  geom_hline(aes(yintercept=-0.5), colour="gold2", size=2, alpha=1/1.3) +   # K
  scale_color_manual(name="Legend",
                     breaks=c("Tupper","Tlower","CuSum","±K","±H"),
                     values=c("brown3","darkolivegreen4","dodgerblue1","gold2","darkorchid4"),
                     labels=c("T","L","C","±K","±H"))

However, I am having trouble getting a legend. I know it's supposed to be 
done with the melt function in the reshape package, but I can't get it to 
work! Can someone please help me with this.
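As a sketch of the melt-based approach (using reshape2, the newer sibling of
reshape; the column choices follow the data above, but the exact aesthetics
are my guess at what is wanted): the key is to map fill/colour to a variable
inside aes() rather than setting them per layer, so ggplot2 builds the
legend automatically.

```r
library(ggplot2)
library(reshape2)

# long format: one row per (week, series) combination
long <- melt(tabu[, c("weeks", "cusums", "Tupper", "Tlower")],
             id.vars = "weeks")

ggplot() +
  geom_bar(data = subset(long, variable %in% c("Tupper", "Tlower")),
           aes(x = weeks, y = value, fill = variable), stat = "identity") +
  geom_point(data = subset(long, variable == "cusums"),
             aes(x = weeks, y = value, colour = variable),
             size = 4, pch = 15) +
  scale_fill_manual(name = "Legend",
                    values = c(Tupper = "brown3",
                               Tlower = "darkolivegreen4")) +
  scale_colour_manual(name = "Legend", values = c(cusums = "dodgerblue1"))
```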


[[alternative HTML version deleted]]



[R] dynamic runSum

2014-08-07 Thread amarjit chandhial
Hello,
runSum calculates a running sum looking back a fixed distance n, e.g. 20.
How do I calculate a dynamic runSum for an xts object? In other words, I
want to calculate a running sum at each point in time looking back a
variable distance, in this example governed by the vector VL.
Here's a minimal reproducible example:
 
library(quantstrat)

symbols <- c('^GSPC')

start.date <- as.Date("2010-01-01")
end.date   <- as.Date("2013-12-31")

getSymbols(symbols, from=as.character(start.date),
           to=as.character(end.date), adjust=T)

"acF1" <- function(x, n1=5, n2=10, n3=20, nacF1=25, n0=20, ...) {
  var1 <- x - lag(x, 1, na.pad=T)
  var2 <- runSD(x, n=n1, sample=TRUE, cumulative=FALSE)
  var3 <- runMean(var2, n=n2, cumulative=FALSE)
  VL   <- ifelse(trunc(n3/(var2/var3)) > nacF1, nacF1, trunc(n3/(var2/var3)))
  p_pos <- ifelse(var1 >= 0, var1, 0)
  out1 <- runSum(p_pos, n=n0, cumulative=FALSE)

  res <- cbind(var1, var2, var3, VL, p_pos, out1)
  colnames(res) <- c("var1", "var2", "var3", "VL", "p_pos", "out1")

  reclass(res)
}

acf1 <- acF1(GSPC[, c("GSPC.Close")], n1=5, n2=10, n3=20, nacF1=25, n0=20)
acf1
 
 
So on
2010-02-02, I want runSum to look back 23 points, as governed by VL, not 20 points;
2010-02-03, I want runSum to look back 24 points, as governed by VL, not 20 points;
etc. etc.
2013-12-31, I want runSum to look back 25 points, as governed by VL, not 20 points.
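For what it's worth, a variable-lookback running sum can be sketched in base
R like this (my own helper, not a TTR/xts function; positions whose window
does not fit, or whose lookback is NA, give NA):

```r
runSumVar <- function(x, n) {
  # x: numeric vector; n: lookback distance at each position
  sapply(seq_along(x), function(i) {
    if (is.na(n[i]) || i < n[i]) NA_real_
    else sum(x[(i - n[i] + 1):i])
  })
}

runSumVar(1:5, c(1, 2, 2, 3, 5))  # 1 3 5 9 15
```

To use it on an xts series x, one could apply it to coredata(x) and reattach
the index, e.g. xts(runSumVar(coredata(x), VL), index(x)).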
 
 
Amarjit
 
 


[R] Some hpc problems with doMPI and runmpi (I know there is R-SIG-HPC)

2014-08-07 Thread André Ziervogel
Dear R people,

I’ve been doing some HPC using R and Open MPI. Unfortunately I’ve encountered a
major problem whose nature is hard to pin down:

Essentially I call mpirun Rscript … and as soon as the script reaches a
foreach() %dopar% it hangs indefinitely. I’ve attached the qsub script:

#!/bin/bash
#$ -S /bin/bash
#$ -N test_14
#$ -cwd

#$ -V  
#$ -o "/fhgfs/g61570/Spectral Databases/log/test_14_"$JOB_ID 
#$ -j y
#$ -q regular
#$ -pe openmpi 8
#$ -l h_rt=00:15:00
#$ -l h_vmem=1.9G  
#$ -m eas  
#$ -M andre.ziervo...@psychol.uni-giessen.de

module add gcc 
module add openmpi/gcc/64/1.6.5
module add R/gcc/3.0.1

date   #log start time

echo "Number of slots " . $NSLOTS

mpirun Rscript /fhgfs/g61570/Spectral\ Databases/test_10.r > 
/fhgfs/g61570/Spectral\ Databases/log/test_14.Rout

date

exit

and the R file:

suppressMessages(library('doMPI'))

skylla.cluster <- startMPIcluster()
registerDoMPI(skylla.cluster)

cat(paste("COMM SIZE: ", mpi.comm.size(0), " cluster size: ", 
clusterSize(skylla.cluster), "\n",sep = ""))

tmp.time <- proc.time()
sample <- foreach(i=seq(from=0, to=1000, by =1),.combine='c',.inorder=TRUE) %do%
{
r <- sqrt(i^2 + i^2) + .Machine$double.eps  * factorial(i)
sin(r) / r
}
cat(paste("Processing seriell time: ", "\n", sep = "  "))
print(proc.time() - tmp.time)
#print(sample)

tmp.time <- proc.time()
sample <- foreach(i=seq(from=0, to=1000, by =1),.combine='c',.inorder=TRUE) 
%dopar%
{
r <- sqrt(i^2 + i^2) + .Machine$double.eps  * factorial(i)
sin(r) / r
}
cat(paste("Processing parallel time: ", "\n", sep = "  "))
print(proc.time() - tmp.time)
#print(sample)

closeCluster(skylla.cluster)
# mpi.close.Rslaves()
# mpi.exit()
mpi.quit(save='no')

Any suggestions would be highly appreciated! Thanks!

Best

André

--
Dipl. Psych André Ziervogel
andre.ziervo...@psychol.uni-giessen.de
--





Re: [R] weighted network centrality measures by network size

2014-08-07 Thread Suzen, Mehmet
Hi Jenny,

Have you tried igraph before? See http://igraph.org/r/doc/
There are a couple of centrality measures there.
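For example, the four measures mentioned can all be computed on a toy graph
(function names as in recent igraph; older versions used graph.formula and
evcent instead):

```r
library(igraph)
g <- graph_from_literal(A - B, A - C, B - C, C - D, D - E)
degree(g)                   # number of ties per vertex
closeness(g)                # inverse of total distance to all others
betweenness(g)              # shortest paths passing through each vertex
eigen_centrality(g)$vector  # eigenvector centrality
```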

Best,
-m



On 6 August 2014 02:50, Jenny Jiang  wrote:
> Dear R-help,
>
> My name is Jenny Jiang and I am a Finance Honours research
> student from the University of New South Wales, Australia. Currently my
> research project involves calculating some network centrality
> measures in R: degree, closeness, betweenness and eigenvector.
> However, I am having an issue with the calculation of
> centrality measures weighted by network size. For example, currently
> my code allows me to calculate centrality measures for each firm-year,
> and now I would like to calculate centrality measures weighted by the
> firm network size for each firm-year.
>
> My current code is like the following:
>
> install.packages("statnet")
>
> library(statnet)
>
> #read csv
> data <- read.csv("D:\\Users\\z3377013\\Desktop\\networknew1.csv",header=TRUE)
> #companies <- unique(data$CompanyID_)
> #years <- unique(data$Year)
> pairs <- unique(data[,c(1,3)])
> #directors <- unique(c(data$DirectorID_,data$DirectorID_Connected))
> #director_map <- 1:length(directors)
> #names(director_map) <- c(as.character(directors))
>
> #for (i in 1:nrow(data)) {
> #  data[i,2] = director_map[as.character(data[i,2])]
> #  data[i,4] = director_map[as.character(data[i,4])]
> #}
>
> sink("D:\\Users\\z3377013\\Desktop\\measure1.csv")
> for (i in 1:nrow(pairs)) {
>   d <- subset(data, CompanyID_==pairs[i,1]&Year==pairs[i,2])
>   directors <- unique(c(d$DirectorID_,d$DirectorID_Connected))
>   director_map <- 1:length(directors)
>   names(director_map) <- c(as.character(directors))
>   for (j in 1:nrow(d)) {
> d[j,2] = director_map[as.character(d[j,2])]
> d[j,4] = director_map[as.character(d[j,4])]
>   }
>
>   net<-network(d[,c(2,4)],directed=F,loops=F,matrix.type="edgelist")
>
>   degree <- degree(net, cmode="freeman", gmode="graph")
>   closeness <- closeness(net,gmode="graph",cmode="undirected")
>   betweenness <- betweenness(net,gmode="graph",cmode="undirected")
>   evcent <- evcent(net,gmode="graph",use.eigen=TRUE)
>
>   write.csv(cbind(pairs[i,], directors, degree, closeness, betweenness, 
> evcent), row.names=FALSE)
> }
> sink()
>
> And an example of my data structure is like the following:
>
> CompanyID_DirectorID_YearDirectorID_Connected
> 900370006802120033699838021
> 900370041803220033699838021
> 900370059803220033699838021
> 900370089803220033699838021
> 900370347806320033699838021
> 900370362806320033699838021
> 900370383806320033699838021
> 900370399806320033699838021
> 900369983802120033700068021
> 900370041803220033700068021
> 900370059803220033700068021
> 900370089803220033700068021
> 900370347806320033700068021
> 900370362806320033700068021
> 900370383806320033700068021
> 900370399806320033700068021
> 900369983802120033700418032
> 900370006802120033700418032
> 900370059803220033700418032
> 900370006802120043699838021
> 900370041803220043699838021
> 900370059803220043699838021
> 900370089803220043699838021
> 900370347806320043699838021
> 129016045381142003427207466
> 129035569064722003427207466
> 129037011080322003427207466
> 129037084581042003427207466
> 129037084781042003427207466
> 129037112481352003427207466
> 1290101671106122003427207466
> 1290102718113832003427207466
>
> where for each firm-year I have a list of directors and their corresponding 
> connected directors within that firm-year.
>
> If you could provide the R code for calculating the weighted measures by
> network size, that would be really helpful.
>
> I could not be more appreciative.
>
> Best regards
>
> Jenny
>
> [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] Using axis limits with plot3D

2014-08-07 Thread Karline Soetaert
Hi Scott,

The trick is to postpone plotting (plot = FALSE) and then do plotdev() with the 
required limits:

library(plot3D)
x <- z <- seq(-4, 4, by=0.2)
y <- seq(-6, 6, by=0.2)
M <- mesh(x,y,z)
R <- with(M, sqrt(x^2 + y^2 +z^2))
p <- sin(2*R)/(R+1e-3)
x.limits <- c(-2, 2)
y.limits <- c(-2, 2)
slice3D(x,y,z, colvar=p, xs=0, ys=c(0, 4), zs=NULL, scale=F, 
ticktype="detailed", plot = FALSE) 

plotdev(xlim=x.limits, ylim=y.limits)


This should work.

Karline

Message: 25
Date: Wed, 6 Aug 2014 23:21:04 +
From: "Waichler, Scott R" 
To: "R. Help" 
Cc: "karline.soeta...@nioz.nl" 
Subject: [R] Using axis limits with plot3D
Message-ID:
<074c83dad4825242a20b2d83fdbcb888765...@ex10mbox03.pnnl.gov>
Content-Type: text/plain; charset="us-ascii"

I would like to use the functions in the plot3D package but I am having trouble 
getting the axis limits to work correctly.  The slices plotted by the code 
below go beyond the bounds of the persp box and obscure the axis information.  
How can I show just the part of the domain within x.limits and y.limits?

library(plot3D)
x <- z <- seq(-4, 4, by=0.2)
y <- seq(-6, 6, by=0.2)
M <- mesh(x,y,z)
R <- with(M, sqrt(x^2 + y^2 +z^2))
p <- sin(2*R)/(R+1e-3)
x.limits <- c(-2, 2)
y.limits <- c(-2, 2)
slice3D(x,y,z, colvar=p, xs=0, ys=c(0, 4), zs=NULL, xlim=x.limits, 
ylim=y.limits, scale=F, ticktype="detailed")

Thanks,
Scott Waichler
Pacific Northwest National Laboratory
Richland, WA, USA




Re: [R] Logical operators and named arguments

2014-08-07 Thread Ryan
Josh,

Thank you for your detailed answer.

Best,
Ryan

On 7 Aug 2014, at 16:21, Joshua Wiley wrote:

> Hi Ryan,
>
> It does work, but the *apply family of functions always passes elements to
> the first argument, so you can specify e2 = , but not e1 =.  For example:
>
>> sapply(1:3, `>`, e2 = 2)
> [1] FALSE FALSE  TRUE
>
> From ?sapply
>
>'lapply' returns a list of the same length as 'X', each element of
>which is the result of applying 'FUN' to the corresponding element
>of 'X'.
>
> so `>` is applied to each element of 1:3
>
> `>`(1, ...)
> `>`(2, ...)
> `>`(3, ...)
>
> and if e2 is specified then that is passed
>
> `>`(1, 2)
> `>`(2, 2)
> `>`(3, 2)
>
> Further, see ?Ops
>
>  If the members of this group are called as functions, any
> argument names are removed to ensure that positional matching
> is always used.
>
> and you can see this at work:
>
>> `>`(e1 = 1, e2 = 2)
> [1] FALSE
>> `>`(e2 = 1, e1 = 2)
> [1] FALSE
>
> If you want the flexibility to specify which argument the elements of X
> should be *applied to, use a wrapper:
>
>> sapply(1:3, function(x) `>`(x, 2))
> [1] FALSE FALSE  TRUE
>> sapply(1:3, function(x) `>`(2, x))
> [1]  TRUE FALSE FALSE
>
>
> HTH,
>
> Josh
>
>
>
> On Thu, Aug 7, 2014 at 2:20 PM, Ryan  wrote:
>
>> Hi,
>>
>> I'm wondering why calling ">" with named arguments doesn't work as
>> expected:
>>
>>> args(">")
>> function (e1, e2)
>> NULL
>>
>>> sapply(c(1,2,3), `>`, e2=0)
>> [1] TRUE TRUE TRUE
>>
>>> sapply(c(1,2,3), `>`, e1=0)
>> [1] TRUE TRUE TRUE
>>
>> Shouldn't the latter be FALSE?
>>
>> Thanks for any help,
>> Ryan
>>
>>
>> The information in this e-mail is intended only for t...{{dropped:28}}



Re: [R] ask for help

2014-08-07 Thread jim holtman
rle

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Wed, Aug 6, 2014 at 10:12 PM, Johnnycz  wrote:
> Hello,everybody,
>  I have a sequence,like 
> a<-c(1,1,1,0,0,1,1,1,1,1,1,0,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1),how 
> to get the position of each first 1 and 0, that's to say, how to get 
> b<-c(1,6,16,23) for first 1 and d<-c(4,12,18) for first 0.
> Many thanks!
> Johnny
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] citation() command doesn't work in R.3.1.1

2014-08-07 Thread Martin Maechler
> Igor Sosa Mayor 
> on Wed, 6 Aug 2014 18:13:56 +0200 writes:

> Sverre Stausland  writes:
>>> citation()
>> Error: $ operator is invalid for atomic vectors In
>> addition: Warning message: In packageDescription(pkg =
>> package, lib.loc = dirname(dir)) : no package 'base' was
>> found

> strange... I think something is wrong with the
> compilation, since as far as I know, the package `base'
> should be loaded automatically by starting R...

Even more:
R cannot be working at all without the base package. It's not a
package you can be without, ever.
So indeed, the error message may be a teeny tiny tad misleading.
But to reiterate: it's the broken state of your R that has this
problem, not R 3.1.1 itself.

Please do note that we have tens of thousands of tests that must
have run without fault before R is released.  Calling citation()
and >= 99% of all other base R functions is among these quality
assurance tests of an R release.

Martin Maechler (R core team)
