Re: [R] Plotting sigma symbol with unicode and turning into pdf

2009-08-12 Thread Dieter Menne



Jonathan R. Blaufuss wrote:
> 
> 
> set.seed(1) 
> Data=rnorm(100,sd=1) 
> plot(density(Data)) 
> text(25000,0.4,
>   paste("\u03c3 = ",
>   format(round(sd(Data),digits=3),big.mark=",")),
>   font=2, col="blue")
> 

That example gives a Latin "s" on my Windows system. Using font=5 it works.
The "standard references" are the following:

Ben Bolker's digest of Brian Ripley:
http://wiki.r-project.org/rwiki/doku.php?id=tips:graphics-misc:symbols&s=unicode

Brian Ripley
http://markmail.org/message/kzjts7zbxmluhuqy
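
An alternative that sidesteps the font question is plotmath, which draws the
symbol itself rather than relying on the text font. The following is only a
minimal sketch, assuming the same simulated data as the quoted example; the
file name, the label position and the object names are illustrative choices,
not anything from the original post.

```r
set.seed(1)
Data <- rnorm(100, sd = 1)

# bquote() builds a plotmath expression; .() splices in the computed value
sigma_label <- bquote(sigma == .(round(sd(Data), digits = 3)))

# Illustrative output location; any writable path would do
out_file <- file.path(tempdir(), "sigma-demo.pdf")
pdf(out_file)
plot(density(Data))
text(2, 0.3, sigma_label, col = "blue")   # position chosen to fall inside the plot
dev.off()
```

Because the label is a plotmath expression rather than a Unicode string, it
should not depend on the locale or on the font's Unicode coverage.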


Dieter

-- 
View this message in context: 
http://www.nabble.com/Plotting-sigma-symbol-with-unicode-and-turning-into-pdf-tp24945989p24949583.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Slicing cra**y csv files

2009-08-12 Thread Patrick Connolly
On Wed, 12-Aug-2009 at 03:36AM -0700, jorgusch wrote:

|> 
|> First of all, sorry for not giving all information.
|> 
|> Secondly, thanks a lot. This is a real help!! I did not know, that you can
|> use names...
|> This is really simple and works great!!!
|> 
|> If anyone is close enough to the people writing the help in R, please tell
|> them that they should write a tutorial for such scenarios. 

Well, do you understand how it works?  It's not much like the standard
use of dataframes.  It just so happens that a feature of dataframes
and the way they're usually read into R could be used for your task.
Did you notice that the method creates a dataframe with no rows?
Generally not much use, but it was in your case.  It would be an
astounding tutorial writer who managed to think that such a use would
be useful for someone in, as the airlines say, such a rare event.
I've never had use of it myself.
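
To make the mechanism concrete, here is a small sketch. It assumes a
one-line file like the junk.csv mentioned below (recreated here in a
temporary file), and the object names are made up for illustration. It also
shows why the duplicated "22 Results" came back as 221 in the quoted output:
read.csv() makes the header names unique by appending ".1". Reading the
fields directly with scan() is one way around that.

```r
# Stand-in for the poster's file: a single line of mostly quoted fields
csv_line <- paste0('22 Results,"35 Results","39 Results","2 Results",',
                   '"7 Results","23 Results","42 Results","36 Results",',
                   '"22 Results","28 Results"')
junk <- tempfile(fileext = ".csv")
writeLines(csv_line, junk)

# The single line is consumed as the header, so the data frame has no rows
df <- read.csv(junk)
nrow(df)
names(df)[9]    # the second "22 Results" has been made unique

# Reading the fields as plain strings avoids the header mangling
fields <- scan(junk, what = "", sep = ",", quiet = TRUE)
nums <- as.numeric(sub(" Results$", "", fields))
nums
```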

If I devised a way of using, say pliers, to open a jar, would it be
fair to expect the manufacturers of the pliers to put that into the
accompanying instructions?


|> I mean, R can be as fancy and amazing a statistical program as it
|> is, but not getting the data in properly in the first place makes it
|> kind of useless.

It has loads of methods of getting the data into the form for
analysis, and, in any case, you can always get your money back if
you're not satisfied.




|> 
|> jorgusch
|> 
|> 
|> 
|> Patrick Connolly-4 wrote:
|> > 
|> > On Tue, 11-Aug-2009 at 01:39AM -0700, jorgusch wrote:
|> > 
|> > |> 
|> > |> Hello,
|> > |> 
|> > |> For not too regular users of R, preparing the data is somehow a burden.
|> > |> 
|> > |> Coming from iMacro in Firefox I get a badly designed csv, which I need to
|> > |> put into a daily R script. 
|> > |> The data looks like that (e.g.):
|> > 
|> > 
|> > How did you get from here
|> > 
|> > |> 22 Results,"35 Results","39 Results","2 Results","7 Results","23
|> > |> Results","42 Results","36 Results","22 Results","28 Results"
|> > |> 
|> > |> and R does this to it:
|> > 
|> > 
|> > to here?
|> > 
|> > |>           V1         V2         V3        V4        V5         V6         V7
|> > |> 1 22 Results 35 Results 39 Results 2 Results 7 Results 23 Results 42 Results
|> > |>           V8         V9        V10
|> > |> 1 36 Results 22 Results 28 Results
|> > 
|> > It's probably easy enough to do but we don't have anything repeatable to
|> > use.
|> > 
|> > If I make a csv file from the text string and call it junk.csv, I can
|> > get a vector of numbers like this:
|> > 
|> >> as.numeric(gsub("[A-z.]", "", names(read.csv("junk.csv"))))
|> >  [1]  22  35  39   2   7  23  42  36 221  28
|> >> 
|> > 
|> > But there's probably more general ways if we knew more about your
|> > position.  It's likely you could use the clipboard instead of the
|> > junk.csv text file.
|> > 
|> > HTH
|> > 
|> > 
|> > 
|> > |> 
|> > |> I just need the numbers as a vector.
|> > |> 
|> > |> Excel can do it with a few lines of VBA, but there must be a way to do
|> > it
|> > |> directly in R, would make things easier.
|> > |> 
|> > |> Thanks a lot!
|> > |>  jorgusch
|> > |> -- 
|> > |> View this message in context:
|> > http://www.nabble.com/Slicing-cra**y-csv-files-tp24913849p24913849.html
|> > |> Sent from the R help mailing list archive at Nabble.com.
|> > |> 
|> > 
|> > -- 
|> > ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
|> >___Patrick Connolly   
|> >  {~._.~}   Great minds discuss ideas
|> >  _( Y )_Average minds discuss events 
|> > (:_~*~_:)  Small minds discuss people  
|> >  (_)-(_) . Eleanor Roosevelt
|> >  
|> > ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
|> > 
|> > 
|> > 
|> 
|> -- 
|> View this message in context: 
http://www.nabble.com/Slicing-cra**y-csv-files-tp24913849p24933830.html
|> Sent from the R help mailing list archive at Nabble.com.
|> 

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patri

Re: [R] Proper / Improper scoring Rules

2009-08-12 Thread Daniel Malter
results=c(0.31,0.36,0.33)
names(results)=c("y=good","y=better","y=best")
results

names(results)[results==max(results)]
which(names(results)==(names(results)[results==max(results)]))

More generally, however, avoid protected operators in your variable names
(like the equality sign)! Rather choose something like y.good, y.better,
y.best, or whatever you like as variable names.
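
A slightly shorter variant of the lines above, offered only as a sketch: it
uses the three probabilities from the quoted predict() output and the first
three rows of the quoted matrix. which.max() and max.col() are base R; the
object names are illustrative.

```r
# Named vector, as returned by predict(f, d, type = "fitted.ind")
results <- c("y=good" = 0.3093407, "y=better" = 0.3630744, "y=best" = 0.3275849)

# which.max() gives the position of the maximum, names() the label
best_pos  <- which.max(results)
best_name <- names(results)[best_pos]

# A name containing "=" cannot be used with $ on a vector, but [[ ]] works
p_better <- results[["y=better"]]

# For a matrix of predictions (one row per observation) max.col() does the same per row
probs <- rbind(c(0.3124704, 0.3631544, 0.3243752),
               c(0.3676075, 0.3594685, 0.2729240),
               c(0.2198274, 0.3437416, 0.4364309))
colnames(probs) <- names(results)
row_best <- colnames(probs)[max.col(probs, ties.method = "first")]
```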

HTH,
Daniel 


-
cuncta stricte discussurus
-

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Donald Catanzaro, PhD
Sent: Thursday, August 13, 2009 2:04 AM
Cc: r-help@r-project.org
Subject: Re: [R] Proper / Improper scoring Rules

Hi All,

I have done more background research (including Frank's book) so I feel that
my second question is answered.  However, as a novice R user I still have
the following problem, accessing the output of predict.  So simplifying my
question, using the example provided in the Design package
(http://lib.stat.cmu.edu/S/Harrell/help/Design/html/predict.lrm.html) I
might do something like:

> # See help for predict.Design for several binary logistic
> # regression examples
> 
> # Examples of predictions from ordinal models
> set.seed(1)
> y <- factor(sample(1:3, 400, TRUE), 1:3, c('good','better','best'))
> x1 <- runif(400)
> x2 <- runif(400)
> f <- lrm(y ~ rcs(x1,4)*x2)
> predict(f, type="fitted.ind")[1:10,]   #gets Prob(better) and all others
      y=good  y=better    y=best
1  0.3124704 0.3631544 0.3243752
2  0.3676075 0.3594685 0.2729240
3  0.2198274 0.3437416 0.4364309
4  0.3063463 0.3629658 0.3306879
5  0.5171323 0.3136088 0.1692590
6  0.3050115 0.3629071 0.3320813
7  0.3532452 0.3612928 0.2854620
8  0.2933928 0.3621220 0.3444852
9  0.3068595 0.3629867 0.3301538
10 0.6214710 0.2612164 0.1173126
> d <- data.frame(x1=.5,x2=.5)
> predict(f, d, type="fitted")# Prob(Y>=j) for new observation
y>=better   y>=best 
0.6906593 0.3275849 
> predict(f, d, type="fitted.ind")# Prob(Y=j)
   y=good  y=better    y=best 
0.3093407 0.3630744 0.3275849 


So now if I wanted to do

> out <- predict(f, d, type="fitted.ind")
> out
   y=good  y=better    y=best 
0.3093407 0.3630744 0.3275849 
> out$"y=better"
Error in out$"y=better" : $ operator is invalid for atomic vectors


y=better is the max, so how do I create something that says that ? 
(which is not exactly what I want to do but close enough to help me figure
out what R code I need to accomplish the task)

I can push the predictions out to a vector:

out.vector <- as.vector(predict(f, d, type="fitted.ind"))

> out.vector

[1] 0.3093407 0.3630744 0.3275849


which gets me part of the way because I can find out max(out.vector) but I
still need to know what column the max is in. I think the problem is that I
don't know how to manipulate data frames and vectors in R and need some
guidance

-Don 

Don Catanzaro, PhD  
Landscape Ecologist
dgcatanz...@gmail.com
16144 Sigmond Lane
Lowell, AR 72745
479-751-3616



Frank E Harrell Jr wrote:
> Donald Catanzaro, PhD wrote:
>> Hi All,
>>
>> I am working on some ordinal logistic regressions using lrm in the 
>> Design package.  My response variable has three categories (1,2,3). 
>> After creating my model and calling predict to get some values, I 
>> wanted to use a simple .5 cut-off to classify my probabilities into 
>> the categories.
>>
>> I had two questions:
>>
>> a)  first, I am having trouble directly accessing the probabilities 
>> which may have more to do with my lack of experience with R
>>
>> For instance, my calls
>>
>>  >ologit.three.NoPerFor <- lrm(Threshold.Three ~ TECI , data=CLD,
>> na.action=na.pass)
>>  >CLD$Threshold.Predict.Three.NoPerFor<-
>> predict(ologit.three.NoPerFor, newdata=CLD, type="fitted.ind")
>>  
>>  >CLD$Threshold.Predict.Three.NoPerFor.Cats[CLD$Threshold.Predict.Three.NoPerFor.Threshold.Three=1 
>>  > .5] <- 1
>> Error: unexpected '=' in 
>> "CLD$Threshold.Predict.Three.NoPerFor.Cats[CLD$Threshold.Predict.Three.NoPerFor.Threshold.Three=" 
>>
>>  >
>>  >
>>
>> produce an error message and it seems as R does not like the equal 
>> sign at all.  So how does one access the probabilities so I can 
>> classify them into the categories of 1,2,3 so I can look at 
>> performance of my model ?
>
> use == to check equality
>
>>
>> b)  which leads me to my next question.  I thought that simply 
>> calculating the percent correct off of my predictions would be 
>> sufficient to look at performance but since my question is very much 
>> in line with this thread 
>> http://tolstoy.newcastle.edu.au/R/e4/help/08/04/8987.html I am not so 
>> sure anymore.  I am afraid I did not understand Frank Harrell's last 
>> suggestion regarding improper scoring rule - can someone point me to 
>> some internet resources that I might be able to review to see why my 
>> approach would not be valid ?
>
> Percent correct will give you misleading a

Re: [R] what is the difference between the two logistic models?

2009-08-12 Thread Daniel Malter
As I wrote in my previous email, you need to pick up a methods book that
deals with an introduction to regression analysis.

Using factors in R means using dummy variable coding. 

The coefficients estimated in your model using factors indicate the effect
of teaching.method = 2 in comparison to the effect of teaching.method = 1
and the effect of teaching.method = 3 in comparison to the effect of
teaching.method = 1 (the reference level).

Using the linear term, as you do in your second model, is definitely wrong
for teaching.method, unless the teaching methods differ only in the hours
taught. The model says that as teaching.method increases by 1, the linear
predictor in the logistic model increases by 0.28. This is obviously bogus
if the differences between the values assigned to the levels of
teaching.method are non-informative about quantitative differences between
the teaching methods. To give a vivid example: say a table is green, red, or
brown and you assign the values 1, 2, and 3 to the colors. What do the
numbers tell you? Nothing! The differences between green, red, and brown
tables are qualitative. Therefore, the numeric differences in the coding of
the color variable are non-informative. You cannot use such variables as
linear terms in a regression model. 
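
To see what the two codings hand to glm(), one can look at the design
matrix directly. This is a small sketch with made-up values; only the
variable name teaching.method comes from the thread.

```r
# Hypothetical toy values, two observations per teaching method
teaching.method <- c(1, 2, 3, 1, 2, 3)

# As a factor: two dummy columns, each compared against level 1
X_factor <- model.matrix(~ factor(teaching.method))
colnames(X_factor)

# As a number: one column, so a single slope per unit step in the code
X_linear <- model.matrix(~ teaching.method)
colnames(X_linear)
```

The factor version yields one column per non-reference level, which is why
the summary shows separate rows for levels 2 and 3, whereas the numeric
version forces the step from 1 to 2 to equal the step from 2 to 3.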

In your previous post, it seemed that teaching method is perfectly collinear
with teaching hours. If that is the case, you may want to consider coding
your dummy variable as orthogonal polynomial contrasts. But do so only if
a.) there is no qualitative difference between teaching methods and the only
difference is the quantitative difference in the hours taught and b.) you
are actually able to interpret your model. 

However, I grasp that your understanding of regressions is quite limited.
Therefore, your initial goal should be to build models that you can
understand and interpret.

Daniel

-
cuncta stricte discussurus
-

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of SNN
Sent: Wednesday, August 12, 2009 6:05 PM
To: r-help@r-project.org
Subject: [R] what is the difference between the two logistic models?




Hi All,


I have data with 400 individuals and the following information
Grade: pass or fail  coded as 1 for pass and 0 for fail
Sex: male or female ( coded as 1 for male and 2 for female )
Age
Teaching.method : can be  1,2,3 

I want to fit a logistic regression where the outcome is Grade (1 = pass, 0 =
fail) and the rest of the variables are the regressors. 
My question is that I am not sure when to use “factor” for a variable.

In my example, Grade, sex, and teaching method are categorical variables coded
as stated above.
Age is a continuous variable.


I have tried the model both ways, where in the first model I put the
word “factor” in front of the categorical variables, but in this case I do
not know how to interpret the output.

Can someone shed some light on the difference between model1 and model2 and
how to interpret them?
  

Below is my output

Thanks for your help




Call:
glm(formula = factor(Grade) ~ factor(sex) + age + factor(teaching.method), 
family = binomial, data = data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.8649  -1.1926   0.7494   1.0091   1.6659  

Coefficients:
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)              -2.77217    0.82182  -3.373 0.000743 ***
factor(sex)2             -0.34751    0.22960  -1.514 0.130140    
age                       0.04544    0.01074   4.230 2.34e-05 ***
factor(teaching.method)2 -0.07125    0.30123  -0.237 0.813023    
factor(teaching.method)3  0.50058    0.33087   1.513 0.130303    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 465.18  on 344  degrees of freedom
Residual deviance: 438.91  on 340  degrees of freedom
AIC: 448.91

Number of Fisher Scoring iterations: 4


> model2<-glm(Grade~ sex + age +teaching.method, 
> family=binomial,data=ndata)
> summary(model2)

Call:
glm(formula = Grade ~ sex + age +teaching.method, family = binomial, 
data = ndata)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.7959  -1.2122   0.7547   1.0043   1.5791  

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)     -2.83988    0.94749  -2.997  0.00272 ** 
sex             -0.33361    0.22867  -1.459  0.14458    
age              0.04432    0.01065   4.160 3.18e-05 ***
teaching.method  0.28017    0.16181   1.731  0.08336 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 465.18  on 344  degrees of freedom
Residual deviance: 440.85  on 341  degrees of freedom
AIC: 44

Re: [R] Proper / Improper scoring Rules

2009-08-12 Thread Donald Catanzaro, PhD

Hi All,

I have done more background research (including Frank's book) so I feel 
that my second question is answered.  However, as a novice R user I 
still have the following problem, accessing the output of predict.  So 
simplifying my question, using the example provided in the Design 
package 
(http://lib.stat.cmu.edu/S/Harrell/help/Design/html/predict.lrm.html) I 
might do something like:



# See help for predict.Design for several binary logistic
# regression examples

# Examples of predictions from ordinal models
set.seed(1)
y <- factor(sample(1:3, 400, TRUE), 1:3, c('good','better','best'))
x1 <- runif(400)
x2 <- runif(400)
f <- lrm(y ~ rcs(x1,4)*x2)
predict(f, type="fitted.ind")[1:10,]   #gets Prob(better) and all others

      y=good  y=better    y=best
1  0.3124704 0.3631544 0.3243752
2  0.3676075 0.3594685 0.2729240
3  0.2198274 0.3437416 0.4364309
4  0.3063463 0.3629658 0.3306879
5  0.5171323 0.3136088 0.1692590
6  0.3050115 0.3629071 0.3320813
7  0.3532452 0.3612928 0.2854620
8  0.2933928 0.3621220 0.3444852
9  0.3068595 0.3629867 0.3301538
10 0.6214710 0.2612164 0.1173126

d <- data.frame(x1=.5,x2=.5)
predict(f, d, type="fitted")# Prob(Y>=j) for new observation
y>=better   y>=best 
0.6906593 0.3275849 

predict(f, d, type="fitted.ind")# Prob(Y=j)
   y=good  y=better    y=best 
0.3093407 0.3630744 0.3275849 



So now if I wanted to do

out <- predict(f, d, type="fitted.ind")
out
   y=good  y=better    y=best 
0.3093407 0.3630744 0.3275849 

out$"y=better"
Error in out$"y=better" : $ operator is invalid for atomic vectors






y=better is the max, so how do I create something that says that ? 
(which is not exactly what I want to do but close enough to help me 
figure out what R code I need to accomplish the task)


I can push the predictions out to a vector:

out.vector <- as.vector(predict(f, d, type="fitted.ind"))


out.vector


[1] 0.3093407 0.3630744 0.3275849


which gets me part of the way because I can find out max(out.vector) but 
I still need to know what column the max is in. I think the problem is 
that I don't know how to manipulate data frames and vectors in R and 
need some guidance


-Don 

Don Catanzaro, PhD  
Landscape Ecologist

dgcatanz...@gmail.com
16144 Sigmond Lane
Lowell, AR 72745
479-751-3616



Frank E Harrell Jr wrote:

Donald Catanzaro, PhD wrote:

Hi All,

I am working on some ordinal logistic regressions using lrm in the 
Design package.  My response variable has three categories (1,2,3). 
After creating my model and calling predict to get some values, I 
wanted to use a simple .5 cut-off to classify my probabilities into 
the categories.


I had two questions:

a)  first, I am having trouble directly accessing the probabilities 
which may have more to do with my lack of experience with R


For instance, my calls

 >ologit.three.NoPerFor <- lrm(Threshold.Three ~ TECI , data=CLD, 
na.action=na.pass)
 >CLD$Threshold.Predict.Three.NoPerFor<- 
predict(ologit.three.NoPerFor, newdata=CLD, type="fitted.ind") 
 >CLD$Threshold.Predict.Three.NoPerFor.Cats[CLD$Threshold.Predict.Three.NoPerFor.Threshold.Three=1 
 > .5] <- 1
Error: unexpected '=' in 
"CLD$Threshold.Predict.Three.NoPerFor.Cats[CLD$Threshold.Predict.Three.NoPerFor.Threshold.Three=" 


 >
 >

produce an error message and it seems as R does not like the equal 
sign at all.  So how does one access the probabilities so I can 
classify them into the categories of 1,2,3 so I can look at 
performance of my model ?


use == to check equality



b)  which leads me to my next question.  I thought that simply 
calculating the percent correct off of my predictions would be 
sufficient to look at performance but since my question is very much 
in line with this thread 
http://tolstoy.newcastle.edu.au/R/e4/help/08/04/8987.html I am not so 
sure anymore.  I am afraid I did not understand Frank Harrell's last 
suggestion regarding improper scoring rule - can someone point me to 
some internet resources that I might be able to review to see why my 
approach would not be valid ?


Percent correct will give you misleading answers and is game-able.  It 
is also ultra-high-variance.  Though not a truly proper scoring rule, 
Somers' Dxy rank correlation (generalization of ROC area) is helpful. 
Better still: use the log-likelihood and related quantities (deviance, 
adequacy index as described in my book).
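
The contrast between percent correct and a likelihood-based score can be
sketched in a few lines. The probabilities are the first three rows of the
predict() output shown earlier in this message; the observed outcomes are
hypothetical, invented only to have something to score against, and the
object names are illustrative.

```r
# Predicted class probabilities for three cases (rows 1 to 3 of the output above)
p <- rbind(c(0.3124704, 0.3631544, 0.3243752),
           c(0.3676075, 0.3594685, 0.2729240),
           c(0.2198274, 0.3437416, 0.4364309))
colnames(p) <- c("good", "better", "best")

# Hypothetical observed outcomes (NOT from the thread)
obs <- c("better", "best", "best")

# Improper rule: classify by the largest probability, then count hits
pred_class  <- colnames(p)[max.col(p, ties.method = "first")]
pct_correct <- mean(pred_class == obs)

# Proper rule: mean log of the probability given to what actually happened
p_obs     <- p[cbind(seq_along(obs), match(obs, colnames(p)))]
log_score <- mean(log(p_obs))
```

Percent correct only registers which category had the largest probability,
so 0.36 versus 0.31 counts the same as 0.99 versus 0.01; the log score uses
the probabilities themselves.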


Frank











Re: [R] logistic regression

2009-08-12 Thread Daniel Malter
This reads as if you need to pick up a methods book on regression more
generally. My guess is that "teaching method" is perfectly collinear with
"TotalHours." Therefore, your model matrix is rank deficient. That is, if
teaching method = 1, 2, and 3 implies total hours = 0, 1, and 2,
respectively, then, obviously, the effects of teaching method and total
hours cannot be told apart. R automatically drops such collinear variables,
which is most likely the reason you are getting NA results. Include either
one, but not both, and you will get the same fit.

How to investigate if this is the reason: do

table(teaching.method,TotalHours)

If this outputs a diagonal matrix (a matrix with all zeros off the main
diagonal), then the reason for the NAs is the perfect collinearity between
teaching.method and TotalHours
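
The situation can be reproduced with simulated data. This is a sketch under
the assumption that TotalHours is exactly teaching.method minus 1; the
outcome and age values are made up, and only the variable names come from
the original post.

```r
set.seed(42)
n <- 60
teaching.method <- rep(1:3, length.out = n)
TotalHours <- teaching.method - 1      # perfectly collinear by construction
age   <- round(runif(n, 18, 40))       # made-up covariate
Grade <- rbinom(n, 1, 0.5)             # made-up outcome

# All counts fall on the main diagonal of the cross-table
tab <- table(teaching.method, TotalHours)
tab

# glm() cannot separate the two effects and reports NA for the second one
fit <- glm(Grade ~ age + teaching.method + TotalHours, family = binomial)
coef(fit)
```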

Further, you might want to include one of these variables as factors/dummies
(or even factors coded as polynomial orthogonal contrasts), which is another
reason to pick up a book on the topic.

HTH,
Daniel


-
cuncta stricte discussurus
-

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of SNN
Sent: Wednesday, August 12, 2009 5:33 PM
To: r-help@r-project.org
Subject: [R] logistic regression


Hi All,

I have data with 400 individuals and the following information
Grade: pass or fail
Sex: male or female
Age
Teaching.method: can be 1,2,3
TotalHours: can be 0,1,2

I want to fit a logistic regression and for the TotalHours I am getting
nothing! What could be the reason. What does the following message mean ?
 [Coefficients: (1 not defined because of singularities)] 

Below is my output

Thanks for your help



Call:
glm(formula = Grade~ sex + age + teaching.method+ TotalHours, 
family = binomial, data = data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.3844  -0.9686  -0.7688   1.2304   1.8871  

Coefficients: (1 not defined because of singularities)
                Estimate Std. Error z value Pr(>|z|)   
(Intercept)     -3.62178    1.20480  -3.006  0.00265 **
sex             -0.32709    0.28539  -1.146  0.25175   
age              0.04405    0.01371   3.213  0.00132 **
teaching.method  0.23878    0.20553   1.162  0.24533   
TotalHours            NA         NA      NA       NA   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 297.38  on 223  degrees of freedom
Residual deviance: 282.93  on 220  degrees of freedom
AIC: 290.93


--
View this message in context:
http://www.nabble.com/logistic-regression-tp24943431p24943431.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] psi not functioning in nlrob?

2009-08-12 Thread Xiao Xiao
Thank you Keo!
After installing MASS the default "psi=psi.huber" is working now.
However I still can't get "psi=psi.bisquare" to work, and here's
another error message:
> model3=nlrob(y~a1*x^a2,data=transient,psi=psi.bisquare, 
> start=list(a1=0.02,a2=0.7),maxit=1000)
Error in na.fail.default(list(y = c(71.2600034232749, 148.175742933206,  :
  missing values in object

I don't know why there are missing values, I'm sure y is the right
length and there are no NAs in it. Could somebody help me with this
one please?

Thanks in advance,
Xiao
On Wed, Aug 12, 2009 at 11:47 PM, Keo Ormsby wrote:
> You have to install MASS package first.
> Hope this does the trick.
> Best,
> Keo.
>
> Xiao Xiao wrote:
>>
>> Hi all,
>>
>> I'm trying to fit a nonlinear regression by "nlrob":
>>
>>  model3=nlrob(y~a1*x^a2,data=transient,psi=psi.bisquare,
>> start=list(a1=0.02,a2=0.7),maxit=1000)
>>
>> However an error message keeps popping up saying that the function
>> psi.bisquare doesn't exist.
>>
>> I also tried psi.huber, which is supposed to be the default for nlrob:
>>
>> model3=nlrob(y~a1*x^a2,data=transient,psi=psi.huber,
>> start=list(a1=0.02,a2=0.7),maxit=1000)
>>
>> But I still got the same error message - psi.huber doesn't exist.
>>
>> Is the argument "psi" not available in nlrob?
>>
>> Any help will be appreciated.
>>
>> Best,
>> Xiao Xiao
>>
>>
>
>



Re: [R] psi not functioning in nlrob?

2009-08-12 Thread Keo Ormsby

You have to install MASS package first.
Hope this does the trick.
Best,
Keo.
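
As a rough check, assuming MASS is present (it normally ships with R as a
recommended package, so loading it may be all that is needed), the psi
functions become visible once the package is attached. The object names
below are illustrative.

```r
# Attach MASS; install.packages("MASS") would only be needed if this fails
library(MASS)

# Both psi functions live in MASS
has_huber    <- exists("psi.huber")
has_bisquare <- exists("psi.bisquare")

# With deriv = 0 (the default) psi.huber() returns weights: 1 inside the
# tuning constant k = 1.345, k/|u| outside it
w <- psi.huber(c(0.5, 3))
```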

Xiao Xiao wrote:

Hi all,

I'm trying to fit a nonlinear regression by "nlrob":

 model3=nlrob(y~a1*x^a2,data=transient,psi=psi.bisquare,
start=list(a1=0.02,a2=0.7),maxit=1000)

However an error message keeps popping up saying that the function
psi.bisquare doesn't exist.

I also tried psi.huber, which is supposed to be the default for nlrob:

model3=nlrob(y~a1*x^a2,data=transient,psi=psi.huber,
start=list(a1=0.02,a2=0.7),maxit=1000)

But I still got the same error message - psi.huber doesn't exist.

Is the argument "psi" not available in nlrob?

Any help will be appreciated.

Best,
Xiao Xiao



Re: [R] REMOVE ME

2009-08-12 Thread Mehdi Khan
You can do it yourself by unsubscribing.

On Wed, Aug 12, 2009 at 8:48 PM, Tim Paysen  wrote:

> This mailing list is too intrusive.  Remove my name.


[R] what is the difference between the two logistic models?

2009-08-12 Thread SNN



Hi All,


I have data with 400 individuals and the following information 
Grade: pass or fail  coded as 1 for pass and 0 for fail 
Sex: male or female ( coded as 1 for male and 2 for female ) 
Age 
Teaching.method : can be  1,2,3 

I want to fit a logistic regression where the outcome is Grade (1 = pass, 0 =
fail) and the rest of the variables are the regressors. 
My question is that I am not sure when to use “factor” for a variable.

In my example, Grade, sex, and teaching method are categorical variables coded
as stated above.
Age is a continuous variable.


I have tried the model both ways, where in the first model I put the
word “factor” in front of the categorical variables, but in this case I do
not know how to interpret the output.

Can someone shed some light on the difference between model1 and model2 and
how to interpret them?
  

Below is my output

Thanks for your help




Call:
glm(formula = factor(Grade) ~ factor(sex) + age + factor(teaching.method), 
family = binomial, data = data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.8649  -1.1926   0.7494   1.0091   1.6659  

Coefficients:
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)              -2.77217    0.82182  -3.373 0.000743 ***
factor(sex)2             -0.34751    0.22960  -1.514 0.130140    
age                       0.04544    0.01074   4.230 2.34e-05 ***
factor(teaching.method)2 -0.07125    0.30123  -0.237 0.813023    
factor(teaching.method)3  0.50058    0.33087   1.513 0.130303    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 465.18  on 344  degrees of freedom
Residual deviance: 438.91  on 340  degrees of freedom
AIC: 448.91

Number of Fisher Scoring iterations: 4


> model2<-glm(Grade~ sex + age +teaching.method, family=binomial,data=ndata)
> summary(model2)

Call:
glm(formula = Grade ~ sex + age +teaching.method, family = binomial, 
data = ndata)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.7959  -1.2122   0.7547   1.0043   1.5791  

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)     -2.83988    0.94749  -2.997  0.00272 ** 
sex             -0.33361    0.22867  -1.459  0.14458    
age              0.04432    0.01065   4.160 3.18e-05 ***
teaching.method  0.28017    0.16181   1.731  0.08336 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 465.18  on 344  degrees of freedom
Residual deviance: 440.85  on 341  degrees of freedom
AIC: 448.85

Number of Fisher Scoring iterations: 4


-- 
View this message in context: 
http://www.nabble.com/what-is-the-difference-between-the-two-logistic-models--tp24943440p24943440.html
Sent from the R help mailing list archive at Nabble.com.



[R] logistic regression

2009-08-12 Thread SNN

Hi All,

I have data with 400 individuals and the following information 
Grade: pass or fail 
Sex: male or female 
Age 
Teaching.method: can be 1,2,3 
TotalHours: can be 0,1,2

I want to fit a logistic regression and for the TotalHours I am getting
nothing! What could be the reason. What does the following message mean ?
 [Coefficients: (1 not defined because of singularities)] 

Below is my output

Thanks for your help



Call:
glm(formula = Grade~ sex + age + teaching.method+ TotalHours, 
family = binomial, data = data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.3844  -0.9686  -0.7688   1.2304   1.8871  

Coefficients: (1 not defined because of singularities)
                Estimate Std. Error z value Pr(>|z|)   
(Intercept)     -3.62178    1.20480  -3.006  0.00265 **
sex             -0.32709    0.28539  -1.146  0.25175   
age              0.04405    0.01371   3.213  0.00132 **
teaching.method  0.23878    0.20553   1.162  0.24533   
TotalHours            NA         NA      NA       NA   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 297.38  on 223  degrees of freedom
Residual deviance: 282.93  on 220  degrees of freedom
AIC: 290.93


-- 
View this message in context: 
http://www.nabble.com/logistic-regression-tp24943431p24943431.html
Sent from the R help mailing list archive at Nabble.com.



[R] tcltk in BATCH mode

2009-08-12 Thread Angel Spassov
DeaR list,

I am writing a user-friendly script for interactive use of R. The user is
supposed to click on a bat file in a Windows environment and to enter some
values in a pop-up window.
Let us assume we have the script given below. If she sources this script
within R, everything works smoothly. The pop-up window appears and patiently
waits for her to enter the value. For this to work she has to start R, navigate
to File --> Source R code... and find her R script file (or source the file
some other way). I want things to be a bit more automatic: a single
click and everything goes. Let us assume the script lives in "test.R". I
made bat files with the following entries:

"D:\Program Files\R\R-2.9.1\bin\Rscript.exe" test.R

or another bat file like this one

"D:\Program Files\R\R-2.9.1\bin\Rcmd.exe" BATCH test.R

Both of them seem to work, but they simply execute "test.R" without
waiting for the user to enter anything. I see the window pop up and
immediately close again. I tested almost every binary in
"...\R\R-2.9.1\bin\..." (Rcmd.exe, R.exe, Rterm.exe, Rscript.exe, etc.). If
I have made myself clear, could someone give me a hint on how to accomplish
this task?

Cheers,
AS


 BEGIN R Script  
require(tcltk)
tt <- tktoplevel()
tktitle(tt) <- "My Schedular"
Name <- tclVar("")
entry.Name <-tkentry(tt,width="18",textvariable=Name)
tkgrid(tklabel(tt,text="Please enter your name"))
tkgrid(entry.Name)

OnOK <- function() {
  NameVal <- tclvalue(Name)
  tkdestroy(tt)
  msg <- paste("You have a nice name",NameVal)
  tkmessageBox(title="Result",message=msg)
}

OK.but <-tkbutton(tt,text="OK   ",command = OnOK)
tkbind(entry.Name, "<Return>",OnOK)
tkgrid(OK.but)
tkfocus(tt)

 END R Script 
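
A possible explanation, offered as a sketch rather than a tested fix for this exact setup: when the script is run non-interactively, R reaches the end of the file and exits while the window is still open. The tcltk function tkwait.window() blocks until the given window is destroyed, so calling it as the last step should keep the process alive until the user clicks OK. The wrapper name ask_name below is hypothetical.

```r
# Sketch: same dialog as above, but the function does not return
# until the user has closed the window.
ask_name <- function() {
  if (!requireNamespace("tcltk", quietly = TRUE)) {
    stop("the tcltk package is not available in this R installation")
  }
  tt <- tcltk::tktoplevel()
  tcltk::tkwm.title(tt, "My Scheduler")
  Name <- tcltk::tclVar("")
  entry.Name <- tcltk::tkentry(tt, width = "18", textvariable = Name)
  tcltk::tkgrid(tcltk::tklabel(tt, text = "Please enter your name"))
  tcltk::tkgrid(entry.Name)

  OnOK <- function() tcltk::tkdestroy(tt)
  OK.but <- tcltk::tkbutton(tt, text = "OK", command = OnOK)
  tcltk::tkbind(entry.Name, "<Return>", OnOK)
  tcltk::tkgrid(OK.but)
  tcltk::tkfocus(tt)

  tcltk::tkwait.window(tt)   # block here until the window is destroyed
  tcltk::tclvalue(Name)      # the Tcl variable outlives the window
}

# In test.R one would then call, as the last step:
# cat("You have a nice name", ask_name(), "\n")
```

The call at the bottom is left commented out because it needs a display; with the original script, adding tkwait.window(tt) after tkfocus(tt) is the equivalent one-line change.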



Re: [R] Map of UK Counties - to use in R

2009-08-12 Thread Raoul

Thanks a million Roger! This works well. All I now need to do is figure
out how to plot data from a csv file onto the map. I really appreciate your
assistance!
Regards,
Raoul
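
On the remaining step (plotting data from a csv file), a minimal sketch, assuming the csv has longitude and latitude columns; the column names lon and lat, the inline csv text, and the approximate city coordinates are illustrative only. Because the GISCO shapefile is in geographical coordinates, points() can be called with longitude and latitude directly after the map has been drawn.

```r
# Hypothetical csv content; in practice: locations <- read.csv("your_file.csv")
csv_text <- "name,lon,lat
London,-0.1276,51.5072
Manchester,-2.2426,53.4808
Bristol,-2.5879,51.4545"
locations <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Stand-in for plot(RG_UK, axes = TRUE): an empty longitude/latitude frame,
# so that this sketch runs without the shapefile.
plot(NA, xlim = c(-8, 2), ylim = c(49.5, 59),
     xlab = "Longitude", ylab = "Latitude")

# Overplot the locations and label them
points(locations$lon, locations$lat, pch = 19, col = "red")
text(locations$lon, locations$lat, labels = locations$name,
     pos = 4, cex = 0.8)
```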


Roger Bivand wrote:
> 
> The illustration you show is for the so-called traditional or historical
> counties of England, which may be available somewhere. There are
> non-georeferenced PNG files on Wikipedia, which might be used, but as far
> as I can see, only UK-based academics can register for access to the edina
> UK borders datasets.
> 
> One possibility is to use the 2006 NUTS boundaries shapefile from
> GISCO/EUROSTAT at:
> 
> http://epp.eurostat.ec.europa.eu/portal/page/portal/gisco/geodata/reference
> 
> http://epp.eurostat.ec.europa.eu/cache/GISCO/geodatafiles/NUTS_03M_2006_SH.zip
> 
> and in R using something like:
> 
> library(rgdal)
> RG <- readOGR(".", "NUTS_RG_03M_2006")
> names(RG)
> UK <- grep("^UK", RG$NUTS_ID)
> RG_UK <- RG[UK,]
> plot(RG_UK, axes=TRUE)
> summary(RG_UK)
> 
> You'll then need to find the regions you want, possibly from:
> 
> http://www.statistics.gov.uk/geography/nuts.asp
> 
> so that you can retain only England, and choose the NUTS* boundaries that
> suit your "counties" - which are not presently well-defined because of
> boundary and administrative changes. The GISCO shapefile is in
> geographical coordinates, so you'll be able to overplot points by
> longitude and latitude.
> 
> Hope this helps,
> 
> Roger Bivand
> 
> 
> Raoul wrote:
>> 
>> Hi,
>> Can anyone help me with either of these:
>> 1) Map of the UK counties that I could use in R?
>> 2) How could I use an existing map for example, a map from here
>> http://www.itraveluk.co.uk/maps/england.html - in R. I need to use a UK
>> map to plot locations on it by lat & long.
>> 
>> Would appreciate help on any of these.
>> Thanks,
>> Raoul
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Map-of-UK-Counties---to-use-in-R-tp24930435p24948470.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] REMOVE ME

2009-08-12 Thread Jorge Ivan Velez
Dear Tim,
You can do this yourself at
https://stat.ethz.ch/mailman/listinfo/r-help (scroll
down).

HTH,

Jorge


On Wed, Aug 12, 2009 at 11:48 PM, Tim Paysen <> wrote:

> This mailing list is too intrusive.  Remove my name.
>[[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



Re: [R] REMOVE ME

2009-08-12 Thread stephen sefick
remove yourself!

On Wed, Aug 12, 2009 at 10:48 PM, Tim Paysen wrote:
> This mailing list is too intrusive.  Remove my name.
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] Creating a column based on data in another column

2009-08-12 Thread Mehdi Khan
willsclahanstationcut$wills.vs30<-
recode(willsclahanstationcut$area.VSCAT,c("B"=686,"C"= 464,"BC"= 724,
"D"=301,"CD"= 372, "D"= 800, "DE"= 1000, "WATER"=0))

where $wills.vs30 is the column I want to create and $area.VSCAT is the
column that contains the labels.
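
One base-R way to map labels to numbers, sketched here without the car package, is to index a named vector by the label column. The data frame name stations is made up for illustration; the label/value pairs are taken from the call above, except that "D" appears twice there (301 and 800), so this sketch keeps only the first and leaves the second for the poster to resolve.

```r
# Lookup table: label -> numeric value (the second "D" entry is omitted)
vs30_lookup <- c(B = 686, C = 464, BC = 724, D = 301, CD = 372,
                 DE = 1000, WATER = 0)

# Hypothetical stand-in for willsclahanstationcut
stations <- data.frame(area.VSCAT = c("B", "CD", "WATER", "D", "XX"),
                       stringsAsFactors = FALSE)

# as.character() matters if the column is a factor: indexing by a factor
# would use its integer codes instead of its labels.
stations$wills.vs30 <- unname(vs30_lookup[as.character(stations$area.VSCAT)])
stations
```

Labels that are not in the lookup table (such as "XX" here) come out as NA, which makes unmatched categories easy to spot.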


On Wed, Aug 12, 2009 at 8:46 PM, Mehdi Khan  wrote:

> Hey guys, I have the same question, except in reverse: if we had letter
> classifications and wanted to assign numerical values to them (the case I
> stated previously except the other way around) how would we do that?  I am
> trying to use the recode function in the car package but not having any
> luck..
>
> Thank you!!
>
> On Fri, Jul 31, 2009 at 12:04 PM, Mehdi Khan  wrote:
>
>> hello all,
>>
>> I have a data frame and I want to create a column which assigns a letter
>> based upon the value in another column.  The data column has velocities
>> ranging from 0 to 1000.  So for example, for velocities between 0 and 300
>> I'd like to assign the letter "A" in the new column, for 300-600, "B" and so
>> on and so forth.  How would I do this?
>>
>> Thank you very much!
>>
>> Mehdi Khan
>>
>
>



[R] REMOVE ME

2009-08-12 Thread Tim Paysen
This mailing list is too intrusive.  Remove my name.


Re: [R] Creating a column based on data in another column

2009-08-12 Thread Mehdi Khan
Hey guys, I have the same question, except in reverse: if we had letter
classifications and wanted to assign numerical values to them (the case I
stated previously except the other way around) how would we do that?  I am
trying to use the recode function in the car package but not having any
luck..

Thank you!!

On Fri, Jul 31, 2009 at 12:04 PM, Mehdi Khan  wrote:

> hello all,
>
> I have a data frame and I want to create a column which assigns a letter
> based upon the value in another column.  The data column has velocities
> ranging from 0 to 1000.  So for example, for velocities between 0 and 300
> I'd like to assign the letter "A" in the new column, for 300-600, "B" and so
> on and so forth.  How would I do this?
>
> Thank you very much!
>
> Mehdi Khan
>
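
For the original question quoted above (a letter per velocity range), cut() is the usual base-R tool. This is a sketch: the third band and its label "C" are assumptions, since the post only specifies 0-300 as "A" and 300-600 as "B".

```r
velocity <- c(0, 150, 300, 450, 600, 999, 1000)

# right = FALSE gives intervals [0,300), [300,600), [600,1000];
# include.lowest = TRUE closes the last interval so that 1000 is kept.
band <- cut(velocity,
            breaks = c(0, 300, 600, 1000),
            labels = c("A", "B", "C"),
            right = FALSE, include.lowest = TRUE)

data.frame(velocity, band)
```

Whether a boundary value such as 300 belongs to "A" or "B" is decided by the right argument, so that is the setting to flip if the opposite convention is wanted.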



Re: [R] Another Plotting Hint - changing fill color for points

2009-08-12 Thread Scott Sherrill-Mix
I don't really use Word or .wmf but maybe try a high pixel count .png e.g.
png('test.png',height=480*5,width=480*5,res=72*5)
plot(1:10, col = "red", bg = "grey", pch=21, cex =1.7)
dev.off()


Scott

Scott Sherrill-Mix
Department of Microbiology
University of Pennsylvania
402B Johnson Pavilion
3610 Hamilton Walk
Philadelphia, PA  19104-6076



On Wed, Aug 12, 2009 at 6:17 PM, Jason Rupert wrote:
> This worked great.
>
>
> Regarding the second question - can you expand a bit more on the effect of 
> the device?  Right now, as shown by the test code, because I am on a Windows 
> machine and need to import the image to Word I am using WMF.  Is there a 
> better device that I should use that will help with the resolution of the 
> points that are drawn?  For example, I can see white pixels on the blue 
> circles, and notice that they are not perfect circles.  Thanks for any
> information about a better way to go:
>
> win.metafile(file=as.character(figure_file_name_and_path), pointsize = 10)
> plot(-4:4, -4:4, type = "n")# setting up coord. system
> points(vals_201, vals_200, col = "red", bg = "grey", pch=21, cex =1.7)
> #points(vals_201, vals_200, col = "grey", bg = "white", pch=21, cex =1.5)
> points(rnorm(100)/2, rnorm(100)/2, col = "blue", bg = "blue",  pch=21, cex =1.5)
> dev.off()
>
>
> Thanks again.
>
> --- On Wed, 8/12/09, Sarah Goslee  wrote:
>
>> From: Sarah Goslee 
>> Subject: Re: [R] Another Plotting Hint - changing fill color for points
>> To: "Jason Rupert" , "r-help" 
>> Date: Wednesday, August 12, 2009, 4:56 PM
>> Yes, you can do that. You need to specify pch in the range of 21-25,
>> and can then specify both col and bg (background color). Oddly, the help
>> for this option is under ?points rather than ?par or ?pch, but there are
>> many examples.
>>
>> Your second question would depend heavily on the device you use and its
>> associated settings, but using the above solution should solve your problem.
>>
>> Sarah
>>
>> On Wed, Aug 12, 2009 at 5:29 PM, Jason Rupert
>> wrote:
>> >
>> > Is it possible to change the fill color of a point?  For example, the
>> > outer color being "Blue" and inner color being "Grey".
>> > I've tried changing "col" and "bg", but that does not seem to have the
>> > desired effect.
>> >
>> > Below is another attempt, but the pixel resolution of the points
>> > function does not appear to be high enough for this to work:
>> >
>> > figure_file_name_and_path<-paste("Test.wmf", sep="")
>> >
>> > vals_200<-rnorm(200)
>> > vals_201<-rnorm(200)
>> >
>> > win.metafile(file=as.character(figure_file_name_and_path), pointsize = 10)
>> > plot(-4:4, -4:4, type = "n")# setting up coord. system
>> > points(vals_201, vals_200, col = "blue", bg = "white", pch=19, cex =1.7)
>> > points(vals_201, vals_200, col = "grey", bg = "white", pch=19, cex =1.5)
>> > points(rnorm(100)/2, rnorm(100)/2, col = "blue", bg = "white",  pch=19, cex =1.5)
>> > dev.off()
>> >
>> > As a second question, is there any way to increase the pixel resolution
>> > of the points produced on the plot so that they are perfect circles?  I
>> > just noticed that the fill does not perfectly fill the points on the
>> > plot and there are some pixels outside the circle.
>> >
>> > Thanks again.
>> >
>>
>>
>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org
>>
>
>
>
>
>



Re: [R] to extract a section of the matrix

2009-08-12 Thread Daniel Malter
x=c("blah","blub","bleep","foo")
y=rnorm(4)

yourdata=data.frame(x,y)
yourdata

newdata=yourdata[order(y),]
newdata

thisiswhatyouwant=newdata[1:2,]
thisiswhatyouwant
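
Applied to the data in the question, the same idea might look like the sketch below. It assumes that "first and second value" means the two smallest OptimPosition values, which matches the desired output (rows 7 and 8); the object name m is made up, and the table is typed in as a data frame.

```r
m <- data.frame(
  OptimYear       = 2009,
  LRPhase         = 1,
  PriorLRPosition = c(1.15641679414676, 1.05365365779321,
                      0.959670688361124, 0.87403582415134),
  OptimPosition   = c(0.330379845613571, 0.282266634171568,
                      0.199612536004054, 0.103774779139200),
  LRPosition      = 1,
  row.names       = c("5", "6", "7", "8")
)

# Positions of the two smallest OptimPosition values; sort() restores the
# original row order so the result reads 7, 8 rather than 8, 7.
keep <- sort(order(m$OptimPosition)[1:2])
m[keep, ]
```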


hth,
daniel 


-
cuncta stricte discussurus
-

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Inchallah Yarab
Sent: Wednesday, August 12, 2009 11:53 AM
To: r-help@r-project.org
Subject: [R] to extract a section of the matrix

Hi,
I have this matrix and I want to extract the section of the matrix that
contains the first and second value of optim position:

  OptimYear LRPhase   PriorLRPosition     OptimPosition LRPosition
5      2009       1  1.15641679414676 0.330379845613571          1
6      2009       1  1.05365365779321 0.282266634171568          1
7      2009       1 0.959670688361124 0.199612536004054          1
8      2009       1  0.87403582415134 0.103774779139200          1

I want a result like this:

  OptimYear LRPhase   PriorLRPosition     OptimPosition LRPosition
7      2009       1 0.959670688361124 0.199612536004054          1
8      2009       1  0.87403582415134 0.103774779139200          1

any ideas,

thank you!!


  


[R] to extract a section of the matrix

2009-08-12 Thread Inchallah Yarab
Hi,
I have this matrix and I want to extract the section of the matrix that
contains the first and second value of optim position:

  OptimYear LRPhase   PriorLRPosition     OptimPosition LRPosition
5      2009       1  1.15641679414676  0.33037984561357          1
6      2009       1  1.05365365779321 0.282266634171568          1
7      2009       1 0.959670688361124 0.199612536004054          1
8      2009       1  0.87403582415134 0.103774779139200          1

I want a result like this:

  OptimYear LRPhase   PriorLRPosition     OptimPosition LRPosition
7      2009       1 0.959670688361124 0.199612536004054          1
8      2009       1  0.87403582415134 0.103774779139200          1

any ideas,

thank you!!


  


Re: [R] Obtaining the value of x at a given value of y in a smooth.spline object

2009-08-12 Thread Greg Snow
A few functions that may help are splinefun or approxfun along with uniroot.
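
A minimal sketch of the uniroot idea, using predict() on the smooth.spline fit directly instead of splinefun/approxfun. The data are simulated and deliberately monotone, so there is exactly one x for the chosen y; with a non-monotone fit, uniroot would need a separate bracketing interval for each solution.

```r
set.seed(42)
x <- seq(0, 10, length.out = 50)
y <- 2 * x + 1 + rnorm(50, sd = 0.2)   # made-up, increasing data

fit <- smooth.spline(x, y)

y_new <- 11                             # the fitted value we want to invert

# Root of (fitted value at x0) - y_new; the interval must bracket the
# solution, i.e. f must have opposite signs at its two ends.
f <- function(x0) predict(fit, x0)$y - y_new
x_new <- uniroot(f, interval = range(x), tol = 1e-8)$root

x_new                   # close to 5 for this simulated line
predict(fit, x_new)$y   # back-check: should reproduce y_new
```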


From: Kavitha Venkatesan [kavitha.venkate...@gmail.com]
Sent: Wednesday, August 12, 2009 3:43 PM
To: Greg Snow
Cc: r-help@r-project.org
Subject: Re: [R] Obtaining the value of x at a given value of y in a
smooth.spline object

I do want a programmatic solution. I was thinking of the linear
interpolation based on the 2 y-values bracketing the given y as well,
I think I will pursue this. Thanks a lot for your suggestion,

Kavitha

On Wed, Aug 12, 2009 at 4:25 PM, Greg Snow wrote:
> Part of the problem is that there could in theory be multiple x values that 
> result in the same y value.
>
> One approach, if you are happy with something interactive rather than 
> programmatic, is to use the TkSpline function in the TeachingDemos package to 
> fit the spline function and drag the x-value until you find the y value that 
> you want.
>
> You can also look at the return from smooth.spline, find the y that is 
> closest to your desired value and then find the corresponding x, or find the 
> 2 y-values that bracket your choice and linearly interpolate the 
> corresponding x values.
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
>> project.org] On Behalf Of Kavitha Venkatesan
>> Sent: Wednesday, August 12, 2009 11:43 AM
>> To: r-help@r-project.org
>> Subject: [R] Obtaining the value of x at a given value of y in a
>> smooth.spline object
>>
>> I have some data fit to a smooth.spline object as follows: (x=vector of
>> data
>> for the predictor variable, y=vector of data for the response variable)
>>
>> fit <- smooth.spline(x,y)
>>
>> Now, given a spline fit point y_new, I want to be able to find out what
>> value of x_new yielded this fit value. How to do so?
>> (This problem is the inverse of the predict.smooth.spline function,
>> which
>> takes x_new as input and yields the corresponding y_new fit value)
>>
>> Any insight is much appreciated!
>>
>> Thanks,
>> Kavitha
>>
>>   [[alternative HTML version deleted]]
>>
>


Re: [R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Emmanuel Levy
> But if the 1st order differences are the same, then doesn't it follow that 
> the 2nd, 3rd, ... order differences must be the same between the original and 
> the new "random" vector.  What am I missing?

You are missing nothing, sorry; I wrote something wrong. What I would
like to preserve is the distance to the *nearest* neighbor, so
diff is not the way to go. If you only consider the nearest neighbor,
then

c(3,4,8,9) and c(4,5,6,7) are the same in terms of first order (all
closest neighbors are 1 unit away) but not in terms of second order.
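
To make the nearest-neighbor comparison concrete, here is a small sketch (the helper name nn_dist is made up): for a sorted vector, each value's nearest-neighbor distance is the smaller of the gaps on its left and right.

```r
# Nearest-neighbor distance of every element of v (needs length(v) >= 2)
nn_dist <- function(v) {
  v <- sort(v)
  gaps <- diff(v)
  # gap to the left neighbor vs gap to the right neighbor;
  # Inf stands in for the missing neighbor at either end
  pmin(c(Inf, gaps), c(gaps, Inf))
}

nn_dist(c(3, 4, 8, 9))   # 1 1 1 1
nn_dist(c(4, 5, 6, 7))   # 1 1 1 1
nn_dist(c(2, 4, 5, 6))   # 2 1 1 1
```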

Also, I don't know if there would be a simple way to maintain a
*distribution* of distances (even if not of nearest neighbor).
For example, c(2,4,5,6) could be c(1,3,4,5), c(3,5,6,7) as proposed by
your solution, but it could also be: c(4,5,6,8)
Or, c(2,3,6,7,8) could be c(2,3,4,7,8)

Actually, that's really simple! I can simply resample the "diff" vector!
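
A sketch of what resampling the diff vector could look like (the function name resample_spacing is made up, and this is only one reading of the idea): shuffle the gaps, pick a random start so the whole pattern still fits in 1..N, and rebuild the positions with cumsum.

```r
resample_spacing <- function(v, N) {
  gaps <- diff(sort(v))
  # sample.int() on the indices avoids the sample() surprise when
  # there is only a single gap
  shuffled <- gaps[sample.int(length(gaps))]
  span <- sum(gaps)
  stopifnot(N - span >= 1)            # the pattern must fit into 1..N
  start <- sample.int(N - span, 1)    # start + span never exceeds N
  start + c(0, cumsum(shuffled))
}

set.seed(7)
V <- c(2, 3, 6, 7, 8)
new_V <- resample_spacing(V, N = 10)
new_V
sort(diff(new_V))   # same multiset of gaps as sort(diff(V))
```

This keeps the distribution of first-order gaps exactly; it makes no attempt to preserve the higher-order distributions discussed in the thread.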

OK, so the only problem now is the 1st, 2nd, 3rd order thing, but
you made me realize that I can skip it for the moment.

Thank you! :-)

Emmanuel



>
> Dan
>
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA  98504-5204
>
>
>



Re: [R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Nordlund, Dan (DSHS/RDA)
> -Original Message-
> From: Emmanuel Levy [mailto:emmanuel.l...@gmail.com]
> Sent: Wednesday, August 12, 2009 4:48 PM
> To: Nordlund, Dan (DSHS/RDA)
> Cc: r-h...@stat.math.ethz.ch; dev djomson
> Subject: Re: [R] Random sampling while keeping distribution of nearest 
> neighbor
> distances constant.
> 
> Dear Daniel,
> 
> Thank a lot for your suggestion. It is helpful and got me thinking
> more about it so that I can rephrase it:
> 
> Given a vector V containing X values, comprised within 1 and N. I'd
> like to sample values so that the *distribution* of distances between
> the X values is similar.
> 
> There are several distributions: the 1st order would be given by the
> function diff.
> The 2d order distribution would be given by
> diff(V[seq(1,length(V),by=2)]) and diff(V[seq(2,length(V),by=2)])
> The 3rd order distribution diff(V[seq(1,length(V),by=3)]) and
> diff(V[seq(2,length(V),by=3)]) and diff(V[seq(3,length(V),by=3)])
> The 4th order 
> 
> I would like to produce different samples, where the first, or first
> and second, or first and second and third, or up to say five orders
> distance distributions are reproduced.
> 
> Is anybody aware of a formalism that is explained in a book and that
> could help me deal with this problem? Or even better of a package?
> 
> Thanks for your help,
> 
> Emmanuel
> 
> 

But if the 1st order differences are the same, then doesn't it follow that the 
2nd, 3rd, ... order differences must be the same between the original and the 
new "random" vector.  What am I missing?

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA  98504-5204
 



Re: [R] Simulating points from GLM corresponding to new x-values

2009-08-12 Thread Clifford Long
Hi Jacob,

At the risk of embarrassing myself, I gave it a shot.  I'll throw this
out on the list, if for no other reason than perhaps someone with
higher wattage than myself might tear it apart and give you something
both useful and perhaps elegant (from an R coding standpoint).

(see the following ... it just ran for me, so I hope it will for you, too)

If this isn't what you need, I'll shut up and watch and learn from the others.

Regards,

Cliff


I tried to put together something that might work as an example ...
based on a simple linear regression model, but using the GLM routine.

Once the model was created, I used 'predict' based on the model
outcome and the original x values.

I then used 'predict' based on the model outcome and new x values,
along with a function for simulation of the distribution at the new x
values.

At the


#--
# Create sample data set to use with GLM
#   (assume first order linear model for now)
#--
b0 = 10
b1 = 0.3
x = sort(rep(seq(1,11, by=2), 10))

fn.y = function(x1){y1 = b0 + b1*x1 + rnorm(n=1, mean=0, sd=1)}

y = sapply(x, fn.y)


xydata = data.frame(cbind(x, y))


model1 = glm(y ~ x, family = gaussian, data = xydata)


#--
# Generate new x values
#   run new x values through 'predict'
#--

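# the column must be named 'x' to match the model formula; otherwise
# predict() cannot find it in newdata and falls back to the original x
newx = data.frame(x = sort(rep(seq(2,10, by=2), 12)))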

y.pred = predict(model1, newx, se.fit = TRUE)


#--
# Generate simulated values based on new x values
#   and function based on outcome of 'predict' routine
#--

fn.pred = function(fit, sefit){rnorm(n=1, mean=fit, sd=sefit*sqrt(60))}

pred.sim = sapply(y.pred$fit, fn.pred, y.pred$se.fit)


#--
# Generate simulated values based on orig x values
#   using 'simulate' routine
#--

y.sim = simulate(model1, nsim = 1)


#--
# Plot original x, y values
#   then add simulated y values from 'simulate' based on orig x values
#   and then add simulated values from 'predict' and function based on
#   new x values
#--
plot(x, y)
lines(x, y.sim$sim_1, col='red', type='p')
lines(newx[,1], pred.sim, col='darkblue', type='p')

#--
# END
#--
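
An alternative sketch for simulating at new x values, for both families asked about. It draws around the predicted mean using the family's own noise model and ignores uncertainty in the estimated coefficients, so it is an approximation. For the gaussian case it uses the estimated residual standard deviation rather than sqrt(n) * se.fit: in a straight-line fit se.fit^2 = sigma^2 * (1/n + (x0 - mean(x))^2 / Sxx), so sqrt(n) * se.fit is never smaller than sigma and grows away from mean(x), which may be why the points looked too scattered. All data below are made up.

```r
set.seed(123)
x <- rep(seq(1, 11, by = 2), each = 10)
d <- data.frame(
  x = x,
  y = 10 + 0.3 * x + rnorm(length(x)),               # gaussian response
  z = rbinom(length(x), 1, plogis(-2 + 0.4 * x))     # binary response
)

# New x values; the column name must match the formula
new_d <- data.frame(x = rep(seq(2, 10, by = 2), each = 12))

# family = gaussian: draw around the predicted mean with the residual sd
m_gauss   <- glm(y ~ x, family = gaussian, data = d)
mu_new    <- predict(m_gauss, newdata = new_d, type = "response")
sigma_hat <- sqrt(summary(m_gauss)$dispersion)
y_sim     <- rnorm(nrow(new_d), mean = mu_new, sd = sigma_hat)

# family = binomial: draw 0/1 outcomes with the predicted probabilities
m_bin <- glm(z ~ x, family = binomial, data = d)
p_new <- predict(m_bin, newdata = new_d, type = "response")
z_sim <- rbinom(nrow(new_d), size = 1, prob = p_new)
```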



On Wed, Aug 12, 2009 at 2:51 PM, Jacob Nabe-Nielsen wrote:
> Hi Cliff -- thanks for the suggestion.
>
> I tried extracting the predicted mean and standard error using predict().
> Afterwards I simulated the dependent variable using rnorm(), with mean and
> standard deviation taken from the predict() function (sd = sqrt(n)*se). The
> points obtained this way were scattered far too much (compared to points
> obtained with simulate()) -- I am not quite sure why.
>
> Unfortunately the documentation of the simulate() function does not provide
> much information about how it is implemented, which makes it difficult to
> judge which method is best (predict() or simulate(), and it is also unclear
> whether simulate() can be applied to glms (with family=gaussian or
> binomial).
>
> Any suggestions for how to proceed?
>
> Jacob
>
>
> On 12 Aug 2009, at 13:11, Clifford Long wrote:
>
>> Would the "predict" routine (using 'newdata') do what you need?
>>
>> Cliff Long
>> Hollister Incorporated
>>
>>
>>
>> On Wed, Aug 12, 2009 at 4:33 AM, Jacob Nabe-Nielsen
>> wrote:
>>>
>>> Dear List,
>>>
>>> Does anyone know how to simulate data from a GLM object correponding
>>> to values of the independent (x) variable that do not occur in the
>>> original dataset?
>>>
>>> I have tried using simulate(), but it generates a new value of the
>>> dependent variable corresponding to each of the original x-values,
>>> which is not what I need. Ideally I whould like to simulate new values
>>> for GLM objects both with family="gaussian" and with family="binomial".
>>>
>>> Thanks in advance,
>>> Jacob
>>>
>>> Jacob Nabe-Nielsen, PhD, MSc
>>> Scientist
>>>  --
>>> Section for Climate Effects and System Modelling
>>> Department of Arctic Environment
>>> National Enviornmental Research Institute
>>> Aarhus University
>>> Frederiksborgvej 399, Postbox 358
>>> 4000 Roskilde, Denmark
>>>
>>> email: n...@dmu.dk
>>> fax: +45 4630 1914
>>> phone: +45 4630 1944
>>>
>>>
>>>       [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>
>


Re: [R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Emmanuel Levy
Dear Daniel,

Thanks a lot for your suggestion. It is helpful and got me thinking
more about the problem, so that I can rephrase it:

Given a vector V containing X values between 1 and N, I'd
like to sample values so that the *distribution* of distances between
the X values is similar.

There are several distributions: the 1st order would be given by the
function diff.
The 2nd order distribution would be given by
diff(V[seq(1,length(V),by=2)]) and diff(V[seq(2,length(V),by=2)])
The 3rd order distribution diff(V[seq(1,length(V),by=3)]) and
diff(V[seq(2,length(V),by=3)]) and diff(V[seq(3,length(V),by=3)])
The 4th order 

I would like to produce different samples, where the first, or first
and second, or first and second and third, or up to say five orders
distance distributions are reproduced.
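
As a side note, the k-th order distances described above can be obtained in one call with the lag argument of diff(): diff(V, lag = k) returns V[i + k] - V[i] for every i, which is the same set of values as the k interleaved diff(V[seq(...)]) calls combined. A small sketch with an arbitrary example vector:

```r
V <- c(2, 3, 6, 7, 8)

d1 <- diff(V)            # 1st order: 1 3 1 1
d2 <- diff(V, lag = 2)   # 2nd order: 4 4 2
d3 <- diff(V, lag = 3)   # 3rd order: 5 5

# Same values as the two interleaved subsequences from the description
d2_split <- c(diff(V[seq(1, length(V), by = 2)]),
              diff(V[seq(2, length(V), by = 2)]))
all.equal(sort(d2), sort(d2_split))   # TRUE
```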

Is anybody aware of a formalism that is explained in a book and that
could help me deal with this problem? Or even better of a package?

Thanks for your help,

Emmanuel




2009/8/12 Nordlund, Dan (DSHS/RDA) :
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
>> Behalf Of Emmanuel Levy
>> Sent: Wednesday, August 12, 2009 3:05 PM
>> To: r-h...@stat.math.ethz.ch
>> Cc: dev djomson
>> Subject: [R] Random sampling while keeping distribution of nearest neighbor
>> distances constant.
>>
>> Dear All,
>>
>> I cannot find a solution to the following problem although I imagine
>> that it is a classic, hence my email.
>>
>> I have a vector V of X values comprised between 1 and N.
>>
>> I would like to get random samples of X values also comprised between
>> 1 and N, but the important point is:
>> * I would like to keep the same distribution of distances between the X 
>> values *
>>
>> For example let's say N=10 and I have V = c(3,4,5,6)
>> then the random values could be 1,2,3,4 or 2,3,4,5 or 3,4,5,6, or 4,5,6,7 
>> etc..
>> so that the distribution of distances (3 <-> 4, 3 <->5, 3 <-> 6, 4 <->
>> 5, 4 <-> 6 etc ...) is kept constant.
>>
>> I couldn't find a package that help me with this, but it looks like it
>> should be a classic problem so there should be something!
>>
>> Many thanks in advance for any help or hint you could provide,
>>
>> All the best,
>>
>> Emmanuel
>>
>
> Emmanuel,
>
> I don't know if this is a classic problem or not.  But given your 
> description, you could write your own function, something like this
>
> sample.dist <- function(vec, Min=1, Max=10){
>  diffs <- c(0,diff(vec))
>  sum_d <- sum(diffs)
>  sample(Min:(Max-sum_d),1)+cumsum(diffs)
>  }
>
> Where Min and Max are the minimum and maximum values that you are sampling 
> from (Min=1 and Max=10 in your example), and vec is passed the vector that 
> you are sampling distances from.  This assumes that your vector is sorted 
> smallest to largest as in your example.   The function could be changed to 
> accommodate a vector that isn't sorted.
>
>> V <- sort(sample(1:100,4))
>> V
> #[1] 46 78 82 95
>> sample.dist(V, Min=1, Max=100)
> #[1] 36 68 72 85
>> sample.dist(V, Min=1, Max=100)
> #[1] 12 44 48 61
>>
> This should get you started at least.  Hope this is helpful,
>
> Dan
>
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA  98504-5204
>
>
>



Re: [R] Matrix Integral

2009-08-12 Thread Moshe Olshansky
Hi,

Is your matrix K symmetric? If yes, there is an "analytical" solution.

--- On Sat, 1/8/09, nhawrylyshyn  wrote:

> From: nhawrylyshyn 
> Subject: [R]  Matrix Integral
> To: r-help@r-project.org
> Received: Saturday, 1 August, 2009, 12:15 AM
> 
> Hi,
> 
> Any help on this would be appreciated:
> 
> I need to integrate the following from a to b (say, 0 to 5), where K is a
> 4x4 matrix and SIGMA is a 4x4 matrix:
> 
> integral  MatrixExp(-K * s) %*% SIGMA %*% t(SIGMA) %*%
> MatrixExp(t(-K) * s) ds
> 
> t is transpose, %*% : matrix mult, MatrixExp : matrix
> exponential
> 
> I've used integrate before on univariate functions like f(x)
> = x^2, which is fine, but when doing this on a matrix I run into
> problems. All I intuitively need to do is do this element by element.
> 
> Thanks,
> 
> NH.
> 
> 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Matrix-Integral-tp24757170p24757170.html
> Sent from the R help mailing list archive at Nabble.com.
> 
>



Re: [R] Random sampling while keeping distribution of nearest ne

2009-08-12 Thread Emmanuel Levy
Thanks for your suggestion Ted,

This would indeed work for the particular example I gave, but I am
looking for a general solution.

For example, if my values are: V=c(2,4,5,6)
Then there would be two possibilities: 2,4,5,6 or 4,5,6,8
more generally, what I mean is that the matrix of distances between
pairs of values in V should be similar in the vector of random values.

Note that in practice, N is around 7,000,000 and X=length(V) may vary
between 20,000 and 500,000.

It'd be great if you could point me to the name of this class of
problem, to a book, or to a package that could help me solve it.
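
The two possibilities listed for V=c(2,4,5,6) can be checked directly by comparing sorted pairwise distances. This is a small sketch only: pairwise_dists is a hypothetical helper name, and because dist() stores X*(X-1)/2 values it is practical for short vectors, not for X in the tens of thousands.

```r
# Sorted pairwise distances: the "distribution of distances" of a vector
pairwise_dists <- function(v) sort(as.vector(dist(v)))

V  <- c(2, 4, 5, 6)
Y2 <- c(4, 5, 6, 8)

pairwise_dists(V)
pairwise_dists(Y2)
same <- identical(pairwise_dists(V), pairwise_dists(Y2))

# Y2 is V mirrored (spacings 2,1,1 become 1,1,2) and then shifted by 2
mirrored        <- sort(max(V) + min(V) - V)
is_mirror_shift <- identical(mirrored + 2, Y2)
```

So a pure shift misses the mirrored arrangement. Shifts and mirror images are at least two families that keep all pairwise distances; whether other arrangements should also count depends on how "similar" is defined.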

Many thanks!

Emmanuel


PS: I apologize that I sent a second post. This one did not appear in
my "R-help" label so I assumed it wasn't sent for some reason.





2009/8/12 Ted Harding :
> On 12-Aug-09 22:05:24, Emmanuel Levy wrote:
>> Dear All,
>> I cannot find a solution to the following problem although I imagine
>> that it is a classic, hence my email.
>>
>> I have a vector V of X values comprised between 1 and N.
>>
>> I would like to get random samples of X values also comprised between
>> 1 and N, but the important point is:
>> * I would like to keep the same distribution of distances between the X
>> values *
>>
>> For example let's say N=10 and I have V = c(3,4,5,6)
>> then the random values could be 1,2,3,4 or 2,3,4,5 or 3,4,5,6, or
>> 4,5,6,7 etc..
>> so that the distribution of distances (3 <-> 4, 3 <->5, 3 <-> 6, 4 <->
>> 5, 4 <-> 6 etc ...) is kept constant.
>>
>> I couldn't find a package that helps me with this, but it looks like it
>> should be a classic problem so there should be something!
>>
>> Many thanks in advance for any help or hint you could provide,
>> All the best,
>> Emmanuel
>
> If I've understood you right, you are basically putting a sequence
> with given spacings in a random position amongst the available
> positions. In your example, you would randomly choose between
> 1,2,3,4/2,3,4,5/3,4,5,6/4,5,6,7/5,6,7,8/6,7,8,9/7,8,9,10/
>
> Hence a result Y could be:
>
>  A <- min(V)
>  L <- max(V) - A + 1
>  M <- (0:(N-L))
>  Y <- 1 + (V-A) + sample(M,1)
>
> I think this does it!
>
> 
> E-Mail: (Ted Harding) 
> Fax-to-email: +44 (0)870 094 0861
> Date: 12-Aug-09                                       Time: 23:49:22
> -- XFMail --
>



Re: [R] Games in R

2009-08-12 Thread Ronggui Huang
Hi, another package can be found here http://r-forge.r-project.org/projects/fun/

2009/8/13 Bjørn Arild Mæland :
> Hi,
>
> There's a couple of games listed on crantastic: 
> http://crantastic.org/tags/games
>
> -Bjorn
>
> 2009/8/12 David Croll :
>> Hi everybody - this is an oddball question.
>>
>>
>> I wonder if anybody has programmed any games in R, such as Sudoku,
>> Tic-Tac-Toe and the like. Or even a flight simulator...
>>
>>
>> R mateys! Let's make some t-tests!
>>
>>
>> Regards, David
>>



-- 
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html



Re: [R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Nordlund, Dan (DSHS/RDA)
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Emmanuel Levy
> Sent: Wednesday, August 12, 2009 3:05 PM
> To: r-h...@stat.math.ethz.ch
> Cc: dev djomson
> Subject: [R] Random sampling while keeping distribution of nearest neighbor
> distances constant.
> 
> Dear All,
> 
> I cannot find a solution to the following problem although I imagine
> that it is a classic, hence my email.
> 
> I have a vector V of X values comprised between 1 and N.
> 
> I would like to get random samples of X values also comprised between
> 1 and N, but the important point is:
> * I would like to keep the same distribution of distances between the X 
> values *
> 
> For example let's say N=10 and I have V = c(3,4,5,6)
> then the random values could be 1,2,3,4 or 2,3,4,5 or 3,4,5,6, or 4,5,6,7 
> etc..
> so that the distribution of distances (3 <-> 4, 3 <->5, 3 <-> 6, 4 <->
> 5, 4 <-> 6 etc ...) is kept constant.
> 
> I couldn't find a package that helps me with this, but it looks like it
> should be a classic problem so there should be something!
> 
> Many thanks in advance for any help or hint you could provide,
> 
> All the best,
> 
> Emmanuel
> 

Emmanuel,

I don't know if this is a classic problem or not.  But given your description, 
you could write your own function, something like this:

sample.dist <- function(vec, Min=1, Max=10){
  diffs <- c(0,diff(vec))
  sum_d <- sum(diffs)
  sample(Min:(Max-sum_d),1)+cumsum(diffs)
  }

Where Min and Max are the minimum and maximum values that you are sampling from 
(Min=1 and Max=10 in your example), and vec is passed the vector that you are 
sampling distances from.  This assumes that your vector is sorted smallest to 
largest as in your example.   The function could be changed to accommodate a 
vector that isn't sorted.
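
One possible way to make that change is sketched here; sample_dist_any_order is a hypothetical name, not part of any package. It shifts the whole vector so that the input need not be sorted, and it uses sample.int() because sample(x, 1) draws from 1:x when x is a single number of at least 1, which matters when only one start position fits and Min is greater than 1.

```r
sample_dist_any_order <- function(vec, Min = 1, Max = 10) {
  span     <- max(vec) - min(vec)        # room the pattern needs
  n_starts <- Max - Min + 1 - span       # number of admissible positions
  stopifnot(n_starts >= 1)
  # Position of the smallest value, drawn uniformly from Min..(Max - span)
  start <- Min - 1 + sample.int(n_starts, 1)
  vec - min(vec) + start                 # same spacings, same order as the input
}

set.seed(123)
V <- c(82, 46, 95, 78)                   # deliberately unsorted
Y <- sample_dist_any_order(V, Min = 1, Max = 100)
```

The result keeps the order of the input, so diff(Y) equals diff(V) even for an unsorted vec.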

> V <- sort(sample(1:100,4))
> V
#[1] 46 78 82 95
> sample.dist(V, Min=1, Max=100)
#[1] 36 68 72 85
> sample.dist(V, Min=1, Max=100)
#[1] 12 44 48 61
>
This should get you started at least.  Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA  98504-5204
 



Re: [R] metaplot in rmeta: y-axis disappears

2009-08-12 Thread David Scott

Roaman wrote:

Thank you, the x-axis includes zero, but even when I try to plot it not
including zero, the y-axis keeps
disappearing. One can't even add the axis afterwards!

Here is an example:

library(rmeta)

set.seed(123)

# simulated data:

theta <- 0.5
sds <- runif(50,0.8,8)
b <- rnorm(50,0,0.3)
e <- sapply(sds, function(x) rnorm(1,0,x))
theta_i <- theta + b + e

# funnel plot:
#
plot(1/(sds^2),theta_i)

# short break:
#
Sys.sleep(3)

# metaplot:

metaplot(mn = theta_i, se = sds,xlim=c(5,20))

# plot funnel plot again:

plot(1/(sds^2),theta_i)

# ---> y-axis disappeared!

# try to add y-axis:
#
axis(2, at=-4:7, labels=-4:7)

# ---> nothing happens.



OK with some code it is possible to track this down and fairly easily 
too. Clearly metaplot altered something and you can guess that it would 
be some graphical parameter or possibly some option.


If you compare the output of par() before and after calling metaplot you 
will see one thing which is likely to cause a problem. yaxt has changed 
from having value "s" to having value "n".


If you do

par(yaxt="s")
axis(2, at=-4:7, labels=-4:7)

you should see the axis again.

This appears to be developer error: metaplot should restore parameters 
like this to what they were before it was called. Looking at the code 
xaxt is restored to its previous value, but not yaxt. I am not sure why 
either of them needs to be changed however outside of the call to plot.
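
The before/after comparison of par() and a blanket restore can be sketched as follows. leaky_plot() is a hypothetical stand-in that mimics the behaviour described above, so that the sketch runs without rmeta; with the real package the same pattern would wrap the call to metaplot.

```r
# Hypothetical stand-in for a plotting function that changes a graphical
# parameter and does not put it back.
leaky_plot <- function() {
  par(yaxt = "n")
  plot(1:10)
}

pdf(NULL)                                # null device, nothing is written to disk
before <- par(no.readonly = TRUE)        # snapshot of all settable parameters
leaky_plot()
after  <- par(no.readonly = TRUE)

# Names of the parameters whose values differ between the two snapshots
changed <- names(before)[!mapply(identical, before, after)]
print(changed)

par(before)                              # restore everything in one go
restored <- par("yaxt")
dev.off()
```

Plotting itself also changes parameters such as usr, so the list of differences contains more than yaxt; the entry to look for is the one that affects how later plots are drawn.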


I have copied this to Thomas.

David


_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics



[R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Emmanuel Levy
Dear All,(my apologies if it got posted twice, it seems it didn't
get through)

I cannot find a solution to the following problem although I suppose
this is a classic.

I have a vector V of X=length(V) values comprised between 1 and N.

I would like to get random samples of X values also comprised between
1 and N, but the important point is:
* I would like to keep the same distribution of distances between the
original X values *

For example let's say N=10 and I have V = c(3,4,5,6)
then the random values could be 1,2,3,4 or 2,3,4,5 or 3,4,5,6, or 4,5,6,7 etc..
so that the distribution of distances (3 <-> 4, 3 <->5, 3 <-> 6, 4 <->
5, 4 <-> 6 etc ...) is kept constant.

I couldn't find a package that helps me with this, but it looks like it
should be a classic problem so there should be something!

Many thanks in advance for any help or hint you could provide,

All the best,

Emmanuel



Re: [R] Random sampling while keeping distribution of nearest ne

2009-08-12 Thread Ted Harding
On 12-Aug-09 22:05:24, Emmanuel Levy wrote:
> Dear All,
> I cannot find a solution to the following problem although I imagine
> that it is a classic, hence my email.
> 
> I have a vector V of X values comprised between 1 and N.
> 
> I would like to get random samples of X values also comprised between
> 1 and N, but the important point is:
> * I would like to keep the same distribution of distances between the X
> values *
> 
> For example let's say N=10 and I have V = c(3,4,5,6)
> then the random values could be 1,2,3,4 or 2,3,4,5 or 3,4,5,6, or
> 4,5,6,7 etc..
> so that the distribution of distances (3 <-> 4, 3 <->5, 3 <-> 6, 4 <->
> 5, 4 <-> 6 etc ...) is kept constant.
> 
> I couldn't find a package that helps me with this, but it looks like it
> should be a classic problem so there should be something!
> 
> Many thanks in advance for any help or hint you could provide,
> All the best,
> Emmanuel

If I've understood you right, you are basically putting a sequence
with given spacings in a random position amongst the available
positions. In your example, you would randomly choose between
1,2,3,4/2,3,4,5/3,4,5,6/4,5,6,7/5,6,7,8/6,7,8,9/7,8,9,10/

Hence a result Y could be:

  A <- min(V)
  L <- max(V) - A + 1
  M <- (0:(N-L))
  Y <- 1 + (V-A) + sample(M,1)

I think this does it!


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 12-Aug-09   Time: 23:49:22
-- XFMail --



[R] Plotting sigma symbol with unicode and turning into pdf

2009-08-12 Thread Jonathan R. Blaufuss
Paul,
  Your solution worked out really well when I ran my code in R. However, when I 
try to turn the plot into a pdf, the unicode string no longer seems to work, 
and instead of the sigma symbol there are just two periods (see example code 
below).

The following is the code working in the R environment just like I want it to 
look:

set.seed(1) 
Data=rnorm(100,sd=1) 
plot(density(Data)) 
text(25000,0.4,
paste("\u03c3 = ",
  format(round(sd(Data),digits=3),big.mark=",")),
font=2, col="blue")



Now when I try to turn the plot into a pdf, the sigma symbol no longer appears. 
It is replaced by two periods.

pdf(file="C:/Rquestion.pdf")  ### Note: You can choose a place for this file to be saved
set.seed(1) 
Data=rnorm(100,sd=1) 
plot(density(Data)) 
text(25000,0.4,
paste("\u03c3 = ",
  format(round(sd(Data),digits=3),big.mark=",")),
font=2, col="blue")
dev.off()


Is it possible to make the unicode string function when writing to a pdf or do 
I need to form the sigma symbol some other way?
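
One route that does not depend on the Unicode string is to build the label as a plotmath expression, which the pdf() device draws from its symbol font. This is a minimal sketch under assumptions: a temporary file stands in for "C:/Rquestion.pdf", and the label is placed at x = 0 so that it falls inside the range of this simulated data. As noted elsewhere in the thread, the symbol itself will not be drawn in bold this way.

```r
set.seed(1)
Data <- rnorm(100, sd = 1)

# Build the label as a plotmath expression instead of a Unicode string
sd_text <- format(round(sd(Data), digits = 3), big.mark = ",")
label   <- bquote(sigma == .(sd_text))

# Assumption: a temporary file stands in for "C:/Rquestion.pdf"
out <- file.path(tempdir(), "Rquestion.pdf")
pdf(file = out)
plot(density(Data))
# x = 0 (rather than 25000) keeps the label inside this plot's x range
text(0, 0.1, label, col = "blue")
dev.off()
```

If the Unicode string itself has to be kept, for instance to get bold text, a cairo-based device such as cairo_pdf() may handle it, depending on the fonts installed.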

Thanks for the help,

Jonathan




- Original Message -
From: "Paul Murrell" 
To: "Jonathan R. Blaufuss" 
Cc: "Scott Sherrill-Mix" , r-help@r-project.org
Sent: Wednesday, August 12, 2009 4:40:49 PM GMT -06:00 US/Canada Central
Subject: Re: [R] Using bold font with bquote

Hi

Jonathan R. Blaufuss wrote:
> Scott, Your suggestion works great for changing the numbers to bold
> font, but is it possible to change the sigma symbol and the equals
> sign to bold font as well? I've poked around ?plotmath and am I right
> in saying that there is a different method for controlling symbol
> fonts?


R graphics only recognizes a plain symbol font (it has this weird idea 
that a symbol font is a font face like bold or italic).

For your particular example, because it is not a complex math formula, 
you might be able to do a workaround by constructing a simple string and 
specifying the symbol that you want using Unicode.  Depending on what 
system and fonts you have, the following might work ...

text(25000,0.4,
 paste("\u03c3 = ",
format(round(sd(Data),digits=3),big.mark=",")),
 font=2, col="blue")


Paul


> Thank you for your help,
> 
> Jonathan
> 
> - Original Message - From: "Scott Sherrill-Mix"
>  To: "Jonathan R. Blaufuss"
>  Cc: r-help@r-project.org Sent: Wednesday,
> August 12, 2009 12:43:12 PM GMT -06:00 US/Canada Central Subject: Re:
> [R] Using bold font with bquote
> 
>> From ?plotmath, it looks like when using expressions you set the
>> font
> inside the expression (e.g. bold(x)). It looks like you tried this already
>  but I wonder if there was something tiny out of place since the 
> following works for me:
> 
> text(25000,0.3,bquote(bold(sigma==.(mySigma)),list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
>  col='blue')
> 
> Scott
> 
> Scott Sherrill-Mix Department of Microbiology University of
> Pennsylvania 402B Johnson Pavilion 3610 Hamilton Walk Philadelphia,
> PA  19104-6076
> 
> 
> 
> On Wed, Aug 12, 2009 at 12:36 PM, Jonathan R. 
> Blaufuss wrote:
>> I'm trying to annotate a density plot and I'm using bquote to paste
>> the sigma symbol next to the numeric text of the standard deviation
>> calculation that I am performing. I have been able to successfully
>> turn the sigma symbol and numeric output the color blue, but when I
>> try to change the font of the text to bold, R doesn't seem to
>> recognize the "font=" command in the same way here as it does with
>> "col=". (My code is below)
>> 
>> set.seed(1)
>> Data=rnorm(100,sd=1)
>> plot(density(Data))
>> text(25000,0.3, bquote(sigma==.(mySigma),
>> list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
>> col="blue")
>> 
>> After searching the help files I've tried using the expression
>> command with "bold()" as well as inserting "font=2" after the color
>> command. However, I can't seem to get it to work.
>> 
>> Can someone please point me to a resource that will help me figure
>> this out?
>> 
>> Thank You,
>> 
>> Jonathan
>> 

-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/


Re: [R] Another Plotting Hint - changing fill color for points

2009-08-12 Thread Jason Rupert
This worked great. 


Regarding the second question - can you expound a bit more on the effect of the 
device?  Right now, as shown by the test code, because I am on a Windows machine 
and need to import the image to Word I am using WMF.  Is there a better device 
that I should use that will help with the resolution of the points that are 
drawn?  For example, I can see white pixels on the blue circles, and notice 
that they are not perfect circles.  Thanks for any information about a better way to 
go:

win.metafile(file=as.character(figure_file_name_and_path), pointsize = 10)
plot(-4:4, -4:4, type = "n")# setting up coord. system
points(vals_201, vals_200, col = "red", bg = "grey", pch=21, cex =1.7)
#points(vals_201, vals_200, col = "grey", bg = "white", pch=21, cex =1.5)
points(rnorm(100)/2, rnorm(100)/2, col = "blue", bg = "blue",  pch=21, cex =1.5)
dev.off()


Thanks again. 

--- On Wed, 8/12/09, Sarah Goslee  wrote:

> From: Sarah Goslee 
> Subject: Re: [R] Another Plotting Hint - changing fill color for points
> To: "Jason Rupert" , "r-help" 
> Date: Wednesday, August 12, 2009, 4:56 PM
> Yes, you can do that. You need to
> specify pch in the range of 21-25,
> and can then
> specify both col and bg (background color). Oddly, the help
> for this option is
> under ?points rather than ?par or ?pch, but there are many
> examples.
> 
> Your second question would depend heavily on the device you
> use and its
> associated settings, but using the above solution should
> solve your problem.
> 
> Sarah
> 
> On Wed, Aug 12, 2009 at 5:29 PM, Jason Rupert
> wrote:
> >
> > Is it possible to change the fill color of a point?
>  For example, the outer color being "Blue" and inner color
> being "Grey".
> > I've tried changing "col" and "bg", but that does not
> seem to have the desired effect.
> >
> > Below is another attempt, but the pixel resolution of
> the points function does not appear to be high enough for
> this to work:
> >
> > figure_file_name_and_path<-paste("Test.wmf",
> sep="")
> >
> > vals_200<-rnorm(200)
> > vals_201<-rnorm(200)
> >
> >
> win.metafile(file=as.character(figure_file_name_and_path),
> pointsize = 10)
> > plot(-4:4, -4:4, type = "n")# setting up coord.
> system
> > points(vals_201, vals_200, col = "blue", bg = "white",
> pch=19, cex =1.7)
> > points(vals_201, vals_200, col = "grey", bg = "white",
> pch=19, cex =1.5)
> > points(rnorm(100)/2, rnorm(100)/2, col = "blue", bg =
> "white",  pch=19, cex =1.5)
> > dev.off()
> >
> > As a second question, is there any way to increase the
> pixel resolution of the points produced on the plot so that
> they are perfect circles.  I just noticed that the fill
> does not perfectly fill the points on the plot and there are
> some pixels outside the circle.
> >
> > Thanks again.
> >
> 
> 
> -- 
> Sarah Goslee
> http://www.functionaldiversity.org
> 






[R] Random sampling while keeping distribution of nearest neighbor distances constant.

2009-08-12 Thread Emmanuel Levy
Dear All,

I cannot find a solution to the following problem although I imagine
that it is a classic, hence my email.

I have a vector V of X values comprised between 1 and N.

I would like to get random samples of X values also comprised between
1 and N, but the important point is:
* I would like to keep the same distribution of distances between the X values *

For example let's say N=10 and I have V = c(3,4,5,6)
then the random values could be 1,2,3,4 or 2,3,4,5 or 3,4,5,6, or 4,5,6,7 etc..
so that the distribution of distances (3 <-> 4, 3 <->5, 3 <-> 6, 4 <->
5, 4 <-> 6 etc ...) is kept constant.

I couldn't find a package that helps me with this, but it looks like it
should be a classic problem so there should be something!

Many thanks in advance for any help or hint you could provide,

All the best,

Emmanuel



Re: [R] Another Plotting Hint - changing fill color for points

2009-08-12 Thread Sarah Goslee
Yes, you can do that. You need to specify pch in the range of 21-25,
and can then
specify both col and bg (background color). Oddly, the help for this option is
under ?points rather than ?par or ?pch, but there are many examples.

Your second question would depend heavily on the device you use and its
associated settings, but using the above solution should solve your problem.
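
A minimal sketch of the pch/col/bg combination described above. The data are simulated, and a temporary PDF file is an assumption used here only because pdf() is available on every platform; the same points() call applies to other devices.

```r
set.seed(42)
x <- rnorm(200)
y <- rnorm(200)

out <- file.path(tempdir(), "points_fill.pdf")   # assumed output location
pdf(out)
plot(-4:4, -4:4, type = "n")                     # set up the coordinate system
# pch 21 is a filled circle: col is the outline colour, bg the fill colour
points(x, y, pch = 21, col = "blue", bg = "grey", cex = 1.7)
dev.off()
```

Symbols 22 to 25 (square, diamond, and the two triangles) take col and bg in the same way.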

Sarah

On Wed, Aug 12, 2009 at 5:29 PM, Jason Rupert wrote:
>
> Is it possible to change the fill color of a point?  For example, the outer 
> color being "Blue" and inner color being "Grey".
> I've tried changing "col" and "bg", but that does not seem to have the 
> desired effect.
>
> Below is another attempt, but the pixel resolution of the points function 
> does not appear to be high enough for this to work:
>
> figure_file_name_and_path<-paste("Test.wmf", sep="")
>
> vals_200<-rnorm(200)
> vals_201<-rnorm(200)
>
> win.metafile(file=as.character(figure_file_name_and_path), pointsize = 10)
> plot(-4:4, -4:4, type = "n")# setting up coord. system
> points(vals_201, vals_200, col = "blue", bg = "white", pch=19, cex =1.7)
> points(vals_201, vals_200, col = "grey", bg = "white", pch=19, cex =1.5)
> points(rnorm(100)/2, rnorm(100)/2, col = "blue", bg = "white",  pch=19, cex 
> =1.5)
> dev.off()
>
> As a second question, is there any way to increase the pixel resolution of 
> the points produced on the plot so that they are perfect circles.  I just 
> noticed that the fill does not perfectly fill the points on the plot and 
> there are some pixels outside the circle.
>
> Thanks again.
>


-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] psi not functioning in nlrob?

2009-08-12 Thread Xiao Xiao
Hi all,

I'm trying to fit a nonlinear regression by "nlrob":

 model3=nlrob(y~a1*x^a2,data=transient,psi=psi.bisquare,
start=list(a1=0.02,a2=0.7),maxit=1000)

However an error message keeps popping up saying that the function
psi.bisquare doesn't exist.

I also tried psi.huber, which is supposed to be the default for nlrob:

model3=nlrob(y~a1*x^a2,data=transient,psi=psi.huber,
start=list(a1=0.02,a2=0.7),maxit=1000)

But I still got the same error message - psi.huber doesn't exist.

Is the argument "psi" not available in nlrob?

Any help will be appreciated.

Best,
Xiao Xiao



Re: [R] Obtaining the value of x at a given value of y in a smooth.spline object

2009-08-12 Thread Kavitha Venkatesan
I do want a programmatic solution. I was thinking of the linear
interpolation based on the 2 y-values bracketing the given y as well,
I think I will pursue this. Thanks a lot for your suggestion,

Kavitha
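
Two programmatic routes along these lines, as a sketch under assumptions: the data are simulated and monotone, so the target value is crossed exactly once, and y_new = 2 is an arbitrary target. The first solves predict(fit, x) = y_new with uniroot(); the second interpolates linearly between fitted points with the roles of x and y swapped.

```r
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- sqrt(x) + rnorm(50, sd = 0.02)      # made-up, increasing data
fit <- smooth.spline(x, y)

y_new <- 2                               # fitted value whose x is wanted

# Route 1: root finding on the fitted curve.
# Assumes exactly one crossing of y_new inside range(x).
f <- function(x0) predict(fit, x0)$y - y_new
x_root <- uniroot(f, interval = range(x))$root

# Route 2: linear interpolation between fitted points, x and y swapped.
# Only meaningful where the fitted curve is monotone.
x_interp <- approx(x = fit$y, y = fit$x, xout = y_new)$y
```

When the curve can cross the target more than once, as Greg points out, the search interval for uniroot() has to be narrowed to a stretch that contains a single crossing.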

On Wed, Aug 12, 2009 at 4:25 PM, Greg Snow wrote:
> Part of the problem is that there could in theory be multiple x values that 
> result in the same y value.
>
> One approach if you are happy with something interactive rather than 
> programmatic is to use the TkSpline function in the TeachingDemos package to 
> fit the spline function and drag the x-value until you find the y value that 
> you want.
>
> You can also look at the return from smooth.spline, find the y that is 
> closest to your desired value and then find the corresponding x, or find the 
> 2 y-values that bracket your choice and linearly interpolate the 
> corresponding x values.
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
>> project.org] On Behalf Of Kavitha Venkatesan
>> Sent: Wednesday, August 12, 2009 11:43 AM
>> To: r-help@r-project.org
>> Subject: [R] Obtaining the value of x at a given value of y in a
>> smooth.spline object
>>
>> I have some data fit to a smooth.spline object as follows: (x=vector of
>> data
>> for the predictor variable, y=vector of data for the response variable)
>>
>> fit <- smooth.spline(x,y)
>>
>> Now, given a spline fit point y_new, I want to be able to find out what
>> value of x_new yielded this fit value. How to do so?
>> (This problem is the inverse of the predict.smooth.spline function,
>> which
>> takes x_new as input and yields the corresponding y_new fit value)
>>
>> Any insight is much appreciated!
>>
>> Thanks,
>> Kavitha
>>
>



Re: [R] Using bold font with bquote

2009-08-12 Thread Paul Murrell

Hi

Jonathan R. Blaufuss wrote:

Scott, Your suggestion works great for changing the numbers to bold
font, but is it possible to change the sigma symbol and the equals
sign to bold font as well? I've poked around ?plotmath and am I right
in saying that there is a different method for controlling symbol
fonts?



R graphics only recognizes a plain symbol font (it has this weird idea 
that a symbol font is a font face like bold or italic).


For your particular example, because it is not a complex math formula, 
you might be able to do a workaround by constructing a simple string and 
specifying the symbol that you want using Unicode.  Depending on what 
system and fonts you have, the following might work ...


text(25000,0.4,
 paste("\u03c3 = ",
   format(round(sd(Data),digits=3),big.mark=",")),
 font=2, col="blue")


Paul



Thank you for your help,

Jonathan

- Original Message - From: "Scott Sherrill-Mix"
 To: "Jonathan R. Blaufuss"
 Cc: r-help@r-project.org Sent: Wednesday,
August 12, 2009 12:43:12 PM GMT -06:00 US/Canada Central Subject: Re:
[R] Using bold font with bquote


From ?plotmath, it looks like when using expressions you set the font
inside the expression (e.g. bold(x)). It looks like you tried this already,
but I wonder if there was something tiny out of place since the 
following works for me:


text(25000,0.3,bquote(bold(sigma==.(mySigma)),list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
 col='blue')

Scott

Scott Sherrill-Mix Department of Microbiology University of
Pennsylvania 402B Johnson Pavilion 3610 Hamilton Walk Philadelphia,
PA  19104-6076



On Wed, Aug 12, 2009 at 12:36 PM, Jonathan R. 
Blaufuss wrote:

I'm trying to annotate a density plot and I'm using bquote to paste
the sigma symbol next to the numeric text of the standard deviation
calculation that I am performing. I have been able to successfully
turn the sigma symbol and numeric output the color blue, but when I
try to change the font of the text to bold, R doesn't seem to
recognize the "font=" command in the same way here as it does with
"col=". (My code is below)

set.seed(1)
Data=rnorm(100,sd=1)
plot(density(Data))
text(25000,0.3, bquote(sigma==.(mySigma),
list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
col="blue")


After searching the help files I've tried using the expression
command with "bold()" as well as inserting "font=2" after the color
command. However, I can't seem to get it to work.

Can someone please point me to a resource that will help me figure
this out?

Thank You,

Jonathan



--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/



Re: [R] Nominal variables in SVM?

2009-08-12 Thread Erik Iverson
This is where a small, reproducible example will definitely help us discover 
your problem. 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Noah Silverman
Sent: Wednesday, August 12, 2009 4:29 PM
To: Achim Zeileis
Cc: r help
Subject: Re: [R] Nominal variables in SVM?

Thanks for all the suggestions.

My data was loaded in from a csv file with about 80 columns (3 of these 
columns are nominal)  no specific settings for the nominal columns.

Currently, if I call svm (e1071), I get an error about the nominal column.

Do I need to tell R to change the column to a factor?  i.e. foo$color <- 
factor(foo$color)
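
A small sketch of that conversion, using base R only. The data frames and the level order are made up; the point is that training and test data get the same explicit levels, so the dummy coding produced by model.matrix() lines up even when a colour is missing from the test set.

```r
# Made-up data: "blue" never occurs in the test set
train <- data.frame(color = c("red", "blue", "green", "red"),
                    y     = c(1, 0, 1, 0),
                    stringsAsFactors = FALSE)
test  <- data.frame(color = c("red", "green"),
                    stringsAsFactors = FALSE)

# Fix the set of levels once and apply it to both data frames
lev <- c("red", "green", "blue")
train$color <- factor(train$color, levels = lev)
test$color  <- factor(test$color,  levels = lev)

# Dummy coding of the kind a formula interface builds internally
mm_train <- model.matrix(~ color, data = train)
mm_test  <- model.matrix(~ color, data = test)
colnames(mm_train)
colnames(mm_test)
```

Without the explicit levels argument, factor() on the test column would only know about "green" and "red", and the two codings would no longer match.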


On 8/12/09 2:21 PM, Achim Zeileis wrote:
> On Wed, 12 Aug 2009, Noah Silverman wrote:
>
>> Hi,
>>
>> The answers to my previous question about nominal variables have led 
>> me to a more important question.
>>
>> What is the "best practice" way to feed nominal variables to an SVM?
>
> As some of the previous posters have already indicated: The data 
> structure for storing categorical (including nominal) variables in R 
> is a "factor".
>
> Your comment about "truly nominal" is wrong. A character variable is a 
> character variable, not necessarily a categorical variable. 
> Categorical means that the answer falls into one of a finite number of 
> known categories, known as "levels" in R's "factor" class.
>
> If you start out from character information:
>
>   x <- c("red", "red", "blue", "green", "blue")
>
> You can turn it into a factor via:
>
>   x <- factor(x, levels = c("red", "green", "blue"))
>
> R now knows how to do certain things with such a variable, e.g., 
> produces useful summaries or knows how to deal with it in regression 
> problems:
>
>   model.matrix(~ x)
>
> which seems to be what you asked for. Moreover, you don't need to call 
> this yourself; most regression functions in R will do that for you 
> (including svm() in "e1071" or ksvm() in "kernlab", among others).
>
> In short: Keep your categorical variables as "factor" columns in a 
> "data.frame" and use the formula interface of svm()/ksvm() and you are 
> fine.
> Z
>
>
>> For example:
>> color = ("red", "blue", "green")
>>
>> I could translate that into an index so I wind up with
>> color= (1,2,3)
>>
>> But my concern is that the SVM will now think that the values are 
>> numeric in "range" and not discrete conditions.
>>
>> Another thought would be to create 3 binary variables from the single 
>> color variable, so I have:
>>
>> red = (0,1)
>> blue = (0,1)
>> green = (0,1)
>>
>> A example fed to the SVM would have one positive and two negative 
>> values to indicate the color value:
>> i.e. for a blue example:
>> red = 0, blue =1 , green = 0
>>
>> Or, do any of the SVM packages intelligently handle this internally 
>> so that I don't have to mess with it.  If so, do I need to be 
>> concerned about different "translation" of the data if the test data 
>> set isn't exactly the same as the training set.
>> For example:
>> training data  =  color ("red, "blue", "green")
>> test data = color ("red, "green")
>>
>> How would I be sure that the "red" and "green" examples get encoded 
>> the same so that the SVM is accurate?
>>
>> Thanks in advance!!
>>
>> -N
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Another Plotting Hint - changing fill color for points

2009-08-12 Thread Jason Rupert

Is it possible to change the fill color of a point?  For example, the outer 
color being "Blue" and inner color being "Grey".  
I've tried changing "col" and "bg", but that does not seem to have the desired 
effect.

Below is another attempt, but the pixel resolution of the points function does 
not appear to be high enough for this to work:

figure_file_name_and_path <- "Test.wmf"

vals_200 <- rnorm(200)
vals_201 <- rnorm(200)

# win.metafile() is only available on Windows
win.metafile(file = figure_file_name_and_path, pointsize = 10)
plot(-4:4, -4:4, type = "n")  # setting up coord. system
# attempt: draw a larger blue disc, then a smaller grey disc on top of it
points(vals_201, vals_200, col = "blue", bg = "white", pch = 19, cex = 1.7)
points(vals_201, vals_200, col = "grey", bg = "white", pch = 19, cex = 1.5)
points(rnorm(100)/2, rnorm(100)/2, col = "blue", bg = "white", pch = 19, cex = 1.5)
dev.off()

As a second question, is there any way to increase the pixel resolution of the 
points produced on the plot so that they are perfect circles?  I just noticed 
that the fill does not perfectly fill the points on the plot and there are some 
pixels outside the circle.

Thanks again.
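
For what it is worth, here is a sketch of one way to get a different outline and fill colour in base graphics. The bg argument is only used by the fillable plotting symbols pch = 21 to 25, so pch = 19 ignores it. The data below are made up.

```r
set.seed(1)
x <- rnorm(200)
y <- rnorm(200)

plot(-4:4, -4:4, type = "n")   # set up the coordinate system
# pch = 21 is a fillable circle: 'col' is the border, 'bg' is the fill
points(x, y, pch = 21, col = "blue", bg = "grey", cex = 1.5, lwd = 1.5)
```

On the resolution question, writing to a vector device such as pdf() avoids pixel artefacts altogether, and for bitmap output the res argument of png() raises the resolution.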



Re: [R] Combinatorial problem

2009-08-12 Thread David Scott

Bernd Bischl wrote:

Dimitris Rizopoulos wrote:

you could try something like the following:

groups <- list(gp1 = 1:3, gp2 = 4:5, gp3 = 6:7,
   gp4 = 8:10, gp5 = 11)

combn(5, 2, function (x) expand.grid(groups[x]), simplify = FALSE)
combn(5, 3, function (x) expand.grid(groups[x]), simplify = FALSE)
combn(5, 4, function (x) expand.grid(groups[x]), simplify = FALSE)


and this transforms it nicely into a single matrix

y <- combn(5, 2, function (x) as.matrix(expand.grid(groups[x])), 
simplify = FALSE)

Reduce(rbind, y)
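
A small self-contained sketch of the same idea, with a check on the result. It assumes the groups list from above; the names pairs_list, all_pairs and expected_rows are only illustrative, and do.call(rbind, ...) is used here as an alternative to Reduce():

```r
groups <- list(gp1 = 1:3, gp2 = 4:5, gp3 = 6:7, gp4 = 8:10, gp5 = 11)

# One matrix per pair of groups; drop dimnames so the pieces stack cleanly
pairs_list <- combn(length(groups), 2,
                    function(idx) unname(as.matrix(expand.grid(groups[idx]))),
                    simplify = FALSE)

all_pairs <- do.call(rbind, pairs_list)

# Expected number of rows: sum over all pairs of the product of group sizes
sizes <- lengths(groups)
expected_rows <- sum(combn(sizes, 2, prod))

dim(all_pairs)
expected_rows
```

Comparing nrow(all_pairs) with expected_rows is a cheap sanity check that no combination was lost.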


Bernd



This is absolutely, totally awesome guys. Thanks very much.

For the benefit of other readers, here is what happened.

I spent 10 minutes composing a question (tried to make it easy to set 
up). Went off and did some other emailing, decided to check R-help and 
there were two replies answering my question perfectly.


Thanks again

David Scott


_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics



Re: [R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman
OR, as Steve suggested in a previous post,  would it make more sense in 
training an SVM to convert a single nominal column into a series of 
binary columns?

"
color = ("red", "blue", "green")
So, imagine if the features for your examples were color and height, 
your "feature matrix" for N examples would be N x 4

0,1,0,15  # blue object, height 15
1,0,0,10  # red object, height 10
0,0,1,5   # green object, height 5
"

From my LIMITED knowledge, it seems like an SVM would be more accurate 
with unique binary columns for each value of a nominal factor, but I'm 
not sure.

Can anyone provide an opinion on this?

Thanks!

-N






Re: [R] Nominal variables in SVM?

2009-08-12 Thread Achim Zeileis

On Wed, 12 Aug 2009, Noah Silverman wrote:

> Hi,
>
> The answers to my previous question about nominal variables have led me 
> to a more important question.
>
> What is the "best practice" way to feed a nominal variable to an SVM?


As some of the previous posters have already indicated: The data structure 
for storing categorical (including nominal) variables in R is a "factor".


Your comment about "truly nominal" is wrong. A character variable is a 
character variable, not necessarily a categorical variable. Categorical 
means that the answer falls into one of a finite number of known 
categories, known as "levels" in R's "factor" class.


If you start out from character information:

  x <- c("red", "red", "blue", "green", "blue")

You can turn it into a factor via:

  x <- factor(x, levels = c("red", "green", "blue"))

R now knows how to do certain things with such a variable, e.g., it produces 
useful summaries and knows how to deal with it in regression problems:

  model.matrix(~ x)

which seems to be what you asked for. Moreover, you don't need to call this 
yourself; most regression functions in R will do that for you 
(including svm() in "e1071" or ksvm() in "kernlab", among others).


In short: Keep your categorical variables as "factor" columns in a 
"data.frame" and use the formula interface of svm()/ksvm() and you are 
fine.
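
The steps above can be sketched end to end as follows. The data frame d, its columns and the class rule are made up for illustration, and the svm() call only runs if the e1071 package happens to be installed:

```r
x <- c("red", "red", "blue", "green", "blue")
x <- factor(x, levels = c("red", "green", "blue"))

summary(x)               # counts per level
mm <- model.matrix(~ x)  # intercept plus dummy columns for "green" and "blue"
mm

# Hypothetical toy data to show the formula interface
set.seed(1)
d <- data.frame(
  color  = factor(sample(c("red", "green", "blue"), 60, replace = TRUE),
                  levels = c("red", "green", "blue")),
  height = rnorm(60, mean = 10, sd = 3)
)
d$class <- factor(ifelse(d$color == "blue" | d$height > 12, "a", "b"))

# The factor column is passed as it is; no manual recoding
if (requireNamespace("e1071", quietly = TRUE)) {
  fit <- e1071::svm(class ~ color + height, data = d)
  print(table(observed = d$class, predicted = predict(fit, d)))
}
```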

Z





Re: [R] problem loading ncdf library on MAC

2009-08-12 Thread Dan Kelley

Right.  I have udunits2 installed, but that's not enough.  It seems to want
udunits, instead, even though the latter is deprecated.

I downloaded the older package from

  http://www.unidata.ucar.edu/downloads/udunits/index.jsp

but it does not compile, e.g.

  ../port/cfortran/cfortran.h:133:3: error: #error "cfortran.h:  Can't find
your environment among: 

and then it lists a bunch of machines.

I'll look into this a bit more.  Perhaps the best plan would be to make
Rnetcdf look for "udunits2" not the older version "udunits" that it tries
for, presently.


Steve Lianoglou-6 wrote:
> 
> ...
> 

-- 
View this message in context: 
http://www.nabble.com/problem-loading-ncdf-library-on-MAC-tp23195195p24943424.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Nominal variables in SVM?

2009-08-12 Thread Erik Iverson
Noah, depending on what function you use, it might do this automatically for 
you if you give the function a formula containing a factor.  Otherwise, see 
?model.matrix.  



Re: [R] Nominal variables in SVM?

2009-08-12 Thread Bernd Bischl

Noah Silverman wrote:

> That makes sense.
>
> If my data is already nominal, I need to "expand" a single column into 
> several binary ones.
>
> Is there an easy function to do this in R, or do I need to create 
> something from scratch?  (If I have to create my own, any suggestions?)
>
> Thanks!
>
> -N

Hi Noah,

read up on the "contrasts" and "model.matrix" functions.

If you use the kernlab package for SVMs, though, factors get treated in 
this way by default; you just need to use the formula interface.
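
As a rough illustration of what those two functions do, here is a sketch with a made-up factor called color. It assumes the default contrast settings, i.e. treatment contrasts for unordered factors:

```r
color <- factor(c("red", "blue", "green", "blue"),
                levels = c("red", "blue", "green"))

# Default treatment coding: k levels give k - 1 dummy columns
treat <- contrasts(color)
treat

# Full indicator coding: one column per level
full <- contrasts(color, contrasts = FALSE)
full

# Ask model.matrix() to use the full coding for this factor
X <- model.matrix(~ color, contrasts.arg = list(color = full))
X
```

With an intercept plus one column per level the matrix is rank deficient, which matters for linear models but is usually harmless as raw input features.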


Bernd



Re: [R] Plotting Hints - how to add minor tics on axes

2009-08-12 Thread Gavin Simpson
On Wed, 2009-08-12 at 11:51 -0700, Jason Rupert wrote:
> Using the standard plotting routine in R, i.e. no special packages, is
> there a way to add in minor tics to the axes?  
> 
> Also, is there a way to make sure the major axes labels are at the
> origin?  When I'm looking at a plot, the major axes labels are
> present, but it looks like they start a bit away from the origin on
> the plot.  
> 
> Thanks again for any info and feedback.

This combines both aspects:


## read ?par and parameters xaxs and yaxs
plot(1:10, xaxs = "i", yaxs = "i")
## notice the above means that the point at 1,1 is obscured

## now add a second axis to the bottom (side 1) with ticks at the 
## specified locations, and shorter tick marks.
## we also suppress the tick labels.
axis(side = 1, at = seq(0, 10, by = 0.1),
 labels = FALSE, tcl = -0.2)

See ?axis for more details on that one.

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman

That makes sense.

If my data is already nominal, I need to "expand" a single column into 
several binary ones.


Is there an easy function to do this in R, or do I need to create 
something from scratch?  (If I have to create my own, any suggestions?)


Thanks!

-N



Re: [R] nominal to numeric function

2009-08-12 Thread Daniel Malter
foo <- c("blue", "red", "green")
foo <- as.factor(foo)
foo <- as.numeric(foo)   # integer codes of the factor levels
foo

# the numeric ordering is alphabetic (blue = 1, green = 2, red = 3)
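
A short sketch of the two conversions discussed in this thread, to show where they differ. The variable names are illustrative only:

```r
colors <- c("blue", "red", "green")
f <- factor(colors)              # levels are sorted: blue, green, red

codes <- as.integer(f)           # integer codes of the levels
codes
levels(f)[codes]                 # the codes map back to the original labels

# Going through as.character() only helps when the labels are numbers
na_result <- suppressWarnings(as.numeric(as.character(f)))
na_result

g <- factor(c("10", "5", "20"))  # labels that look like numbers
as.integer(g)                    # level codes, not the values
as.numeric(as.character(g))      # the values themselves
```

So as.numeric(as.character(x)) is the safer choice for factors whose labels are numbers, while truly nominal labels need the level codes or dummy columns.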

hth,
daniel

-
cuncta stricte discussurus
-

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Noah Silverman
Sent: Wednesday, August 12, 2009 2:09 PM
Cc: r-help@r-project.org
Subject: Re: [R] nominal to numeric function

Hi,

Thanks for the tip.

Neither method works, as my data is truly nominal:

-
foo <- c("blue", "red", "green")
as.numeric(foo)
[1] NA NA NA
Warning message:
NAs introduced by coercion


Rapid Miner has a function that will automatically create an "index" of the
values and create a new variable (or replace the existing one).  It also has a
second function that will break the nominal labels into n variables that are
binary:

i.e.

red(0,1)
blue(0,1)
green(0,1)


Or,  I guess I could go back to the source that generates my data and
institute a numeric key for the nominal items.  That seems like the longest
way, but probably the safest to get what I want.

-N



On 8/12/09 9:10 AM, Phil Spector wrote:
> It's generally safer to use
>
>as.numeric(as.character(variablename))
>
> since it eliminates problems associated with factors.
>
>- Phil Spector
>  Statistical Computing Facility
>  Department of Statistics
>  UC Berkeley
>  spec...@stat.berkeley.edu
>
>
> On Wed, 12 Aug 2009, Daniel Malter wrote:
>
>>
>> Hi you can use newvariable=as.numeric(variablename). This converts 
>> your factors into numeric variables, but not always with the desired 
>> result. So make sure that you check whether "newvariable" gives you 
>> what you want.
>> Otherwise recoding by hand is indicated.
>>
>> Best,
>> Daniel
>>
>>
>>
>> Noah Silverman-3 wrote:
>>>
>>> Hi,
>>>
>>> I'm training an SVM (C-classification from e1071 library)
>>>
>>> Some of the variables in my data set are nominal.  Is there some 
>>> easy/automatic way to convert them to numerical representations?
>>>
>>> Thanks,
>>>
>>> -N
>>>


Re: [R] Nominal variables in SVM?

2009-08-12 Thread Steve Lianoglou

Hi,

On Aug 12, 2009, at 2:53 PM, Noah Silverman wrote:


> Hi,
>
> The answers to my previous question about nominal variables have led 
> me to a more important question.
>
> What is the "best practice" way to feed a nominal variable to an SVM?
>
> For example:
> color = ("red", "blue", "green")
>
> I could translate that into an index so I wind up with
> color = (1, 2, 3)
>
> But my concern is that the SVM will now think that the values are 
> numeric in "range" and not discrete conditions.
>
> Another thought would be to create 3 binary variables from the 
> single color variable, so I have:
>
> red = (0,1)
> blue = (0,1)
> green = (0,1)
>
> An example fed to the SVM would have one positive and two negative 
> values to indicate the color value:
>
> i.e. for a blue example:
> red = 0, blue = 1, green = 0


Do it this way.

So, imagine if the features for your examples were color and height,  
your "feature matrix" for N examples would be N x 4


0,1,0,15  # blue object, height 15
1,0,0,10  # red object, height 10
0,0,1,5 # green object, height 5
...
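
One possible way to build such a feature matrix in R is model.matrix() without an intercept. This is only a sketch; the data frame d and its column names are made up to match the three rows above:

```r
d <- data.frame(
  color  = factor(c("blue", "red", "green"), levels = c("red", "blue", "green")),
  height = c(15, 10, 5)
)

# "- 1" drops the intercept, so the factor gets one 0/1 column per level
X <- model.matrix(~ color + height - 1, data = d)
X
```

The columns come out in the order of the factor levels (red, blue, green), followed by height.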

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Plotting Hints - how to add minor tics on axes

2009-08-12 Thread Greg Snow
You can use the axis function to add tick marks and labels at specified 
positions (including whatever origin you want to use).  If you add shorter tick 
marks without labels, then they are minor ticks.  Usually you will suppress the 
default axes using axes=FALSE or xaxt/yaxt='n', then use the axis function to 
add in your own custom ticks and labels (call it once for major ticks and a 
second time for minor ones).
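
A minimal sketch of that recipe with made-up data. The tick spacing and the names major_x and minor_x are arbitrary choices for illustration:

```r
x <- 0:10
y <- x^2

# Suppress the default axes; "i" makes the axes start exactly at the limits
plot(x, y, axes = FALSE, xaxs = "i", yaxs = "i",
     xlim = c(0, 10), ylim = c(0, 100))

major_x <- seq(0, 10, by = 2)
minor_x <- setdiff(seq(0, 10, by = 0.5), major_x)

axis(side = 1, at = major_x)                               # major ticks, labelled
axis(side = 1, at = minor_x, labels = FALSE, tcl = -0.25)  # minor ticks, shorter

axis(side = 2, at = seq(0, 100, by = 20))
axis(side = 2, at = seq(0, 100, by = 5), labels = FALSE, tcl = -0.25)

box()
```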

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Jason Rupert
> Sent: Wednesday, August 12, 2009 12:51 PM
> To: R-help@r-project.org
> Subject: [R] Plotting Hints - how to add minor tics on axes
> 
> Using the standard plotting routine in R, i.e. no special packages, is
> there a way to add in minor tics to the axes?
> 
> Also, is there a way to make sure the major axes labels are at the
> origin?  When I'm looking at a plot, the major axes labels are present,
> but it looks like they start a bit away from the origin on the plot.
> 
> Thanks again for any info and feedback.
> 


Re: [R] Simulating points from GLM corresponding to new x-values

2009-08-12 Thread Jacob Nabe-Nielsen

Hi Cliff -- thanks for the suggestion.

I tried extracting the predicted mean and standard error using  
predict(). Afterwards I simulated the dependent variable using  
rnorm(), with mean and standard deviation taken from the predict()  
function (sd = sqrt(n)*se). The points obtained this way were  
scattered far too much (compared to points obtained with simulate())  
-- I am not quite sure why.


Unfortunately the documentation of the simulate() function does not 
provide much information about how it is implemented, which makes it 
difficult to judge which method is best (predict() or simulate()). It 
is also unclear whether simulate() can be applied to glms (with 
family=gaussian or binomial).
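
One possible approach, sketched here with made-up data, is to take the fitted mean at the new x-values from predict() and then draw from the response distribution directly. For the gaussian case the spread is the residual standard deviation, estimated below from the dispersion; sqrt(n) times the standard error of the fitted mean is in general not the same quantity, which may explain part of the extra scatter. The sketch ignores the uncertainty in the estimated coefficients.

```r
set.seed(123)
n <- 100
x <- runif(n, 0, 10)
y_gauss <- 2 + 0.5 * x + rnorm(n, sd = 1.5)
y_binom <- rbinom(n, 1, plogis(-2 + 0.4 * x))

fit_g <- glm(y_gauss ~ x, family = gaussian)
fit_b <- glm(y_binom ~ x, family = binomial)

# x-values that do not occur in the original data
new_x <- data.frame(x = c(0.5, 2.5, 7.25, 12))

# Gaussian: fitted mean plus noise with the residual standard deviation
mu_g      <- predict(fit_g, newdata = new_x, type = "response")
sigma_hat <- sqrt(summary(fit_g)$dispersion)
sim_g     <- rnorm(nrow(new_x), mean = mu_g, sd = sigma_hat)

# Binomial: Bernoulli draws with the fitted probabilities
p_b   <- predict(fit_b, newdata = new_x, type = "response")
sim_b <- rbinom(nrow(new_x), size = 1, prob = p_b)

cbind(new_x, sim_g, sim_b)
```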


Any suggestions for how to proceed?

Jacob


On 12 Aug 2009, at 13:11, Clifford Long wrote:

> Would the "predict" routine (using 'newdata') do what you need?
>
> Cliff Long
> Hollister Incorporated
>
> On Wed, Aug 12, 2009 at 4:33 AM, Jacob Nabe-Nielsen wrote:
>
>> Dear List,
>>
>> Does anyone know how to simulate data from a GLM object corresponding
>> to values of the independent (x) variable that do not occur in the
>> original dataset?
>>
>> I have tried using simulate(), but it generates a new value of the
>> dependent variable corresponding to each of the original x-values,
>> which is not what I need. Ideally I would like to simulate new values
>> for GLM objects both with family="gaussian" and with family="binomial".
>>
>> Thanks in advance,
>> Jacob
>>
>> Jacob Nabe-Nielsen, PhD, MSc
>> Scientist
>> --
>> Section for Climate Effects and System Modelling
>> Department of Arctic Environment
>> National Environmental Research Institute
>> Aarhus University
>> Frederiksborgvej 399, Postbox 358
>> 4000 Roskilde, Denmark
>>
>> email: n...@dmu.dk
>> fax: +45 4630 1914
>> phone: +45 4630 1944



[R] Nominal variables in SVM?

2009-08-12 Thread Noah Silverman

Hi,

The answers to my previous question about nominal variables has lead me 
to a more important question.


What is the "best practice" way to feed nominal variable to an SVM.

For example:
color = ("red, "blue", "green")

I could translate that into an index so I wind up with
color= (1,2,3)

But my concern is that the SVM will now think that the values are 
numeric in "range" and not discrete conditions.


Another thought would be to create 3 binary variables from the single 
color variable, so I have:


red = (0,1)
blue = (0,1)
green = (0,1)

An example fed to the SVM would have one positive and two negative values 
to indicate the color value:

i.e. for a blue example:
red = 0, blue = 1, green = 0

Or, do any of the SVM packages intelligently handle this internally so 
that I don't have to mess with it?  If so, do I need to be concerned 
about a different "translation" of the data if the test data set doesn't 
contain exactly the same values as the training set?

For example:
training data = color ("red", "blue", "green")
test data = color ("red", "green")

How would I be sure that the "red" and "green" examples get encoded the 
same so that the SVM is accurate?
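
One possible way to keep the encoding consistent, sketched in base R under the assumption that the training values define the full set of colors: fix the factor levels once and reuse them for the test data, then let model.matrix() build the 0/1 columns. The vectors and object names below are made up for illustration.

```r
# Invented example data
train_color <- c("red", "blue", "green", "blue")
test_color  <- c("red", "green")

# Fix the set of levels once, from the training data
color_levels <- sort(unique(train_color))

train_f <- factor(train_color, levels = color_levels)
test_f  <- factor(test_color,  levels = color_levels)

# One 0/1 indicator column per level; "- 1" drops the intercept
train_dummies <- model.matrix(~ color - 1, data = data.frame(color = train_f))
test_dummies  <- model.matrix(~ color - 1, data = data.frame(color = test_f))

colnames(train_dummies)
colnames(test_dummies)
```

Because both factors share the same levels, the test matrix keeps a "blue" column (all zeros here), so "red" and "green" land in the same columns as in the training data.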


Thanks in advance!!

-N

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Plotting Hints - how to add minor tics on axes

2009-08-12 Thread Jason Rupert
Using the standard plotting routine in R, i.e. no special packages, is there a 
way to add in minor tics to the axes?  

Also, is there a way to make sure the major axes labels are at the origin?  
When I'm looking at a plot, the major axes labels are present, but it looks 
like they start a bit away from the origin on the plot.  
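
A rough sketch with base graphics only, assuming minor ticks halfway between the major ones are wanted: a second axis() call with labels=FALSE and a shorter tick length (tcl) draws the minor ticks, and xaxs="i" / yaxs="i" removes the default padding so the axes meet at the origin. The data and the tick spacing are invented for illustration.

```r
pdf(NULL)  # null device so the sketch also runs without a screen device

x <- 0:10
y <- x^2

# "i" (internal) axis style: the plot region spans exactly xlim and ylim
plot(x, y, xaxs = "i", yaxs = "i", xlim = c(0, 10), ylim = c(0, 100))

# Minor ticks: candidate positions minus the major ticks already drawn
minor_x <- setdiff(seq(0, 10, by = 0.5), axTicks(1))
minor_y <- setdiff(seq(0, 100, by = 5), axTicks(2))

axis(1, at = minor_x, labels = FALSE, tcl = -0.25)
axis(2, at = minor_y, labels = FALSE, tcl = -0.25)

dev.off()
```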

Thanks again for any info and feedback.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Games in R

2009-08-12 Thread Greg Snow
Well there is the sudoku package on CRAN, but sorry it does not play 
tic-tac-toe or simulate flight.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of David Croll
> Sent: Wednesday, August 12, 2009 12:19 PM
> To: r-help@r-project.org
> Subject: [R] Games in R
> 
> Hi everybody - this is an oddball question.
> 
> 
> I wonder if anybody has programmed any games in R, such as Sudoku,
> Tic-Tac-Toe and the like. Or even a flight simulator...
> 
> 
> R mateys! Let's make some t-tests!
> 
> 
> Regards, David
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Obtaining the value of x at a given value of y in a smooth.spline object

2009-08-12 Thread Greg Snow
Part of the problem is that there could in theory be multiple x values that 
result in the same y value.

One approach, if you are happy with something interactive rather than 
programmatic, is to use the TkSpline function in the TeachingDemos package to 
fit the spline function and drag the x-value until you find the y value that 
you want.

You can also look at the return from smooth.spline, find the y that is closest 
to your desired value and then find the corresponding x, or find the 2 y-values 
that bracket your choice and linearly interpolate the corresponding x values.
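
The bracketing idea can be sketched as follows; this is only an illustration on simulated data, and the names (y_new, x_new, the simulated x and y) are made up. It looks for neighbouring fitted values that lie on either side of the target and interpolates linearly between them, so it returns one x for every place where the fit crosses the target.

```r
set.seed(1)

# Simulated data for illustration only
x <- seq(0, 10, length.out = 100)
y <- sqrt(x) + rnorm(100, sd = 0.05)

fit   <- smooth.spline(x, y)
y_new <- 2  # target fitted value

# Signed distance of each fitted value from the target
d <- fit$y - y_new

# Intervals where the fitted curve crosses the target
i <- which(d[-length(d)] * d[-1] < 0)

# Linear interpolation inside each bracketing interval
x_new <- fit$x[i] - d[i] * (fit$x[i + 1] - fit$x[i]) / (d[i + 1] - d[i])
x_new

# Check: the spline evaluated at x_new should be close to y_new
predict(fit, x_new)$y
```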

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Kavitha Venkatesan
> Sent: Wednesday, August 12, 2009 11:43 AM
> To: r-help@r-project.org
> Subject: [R] Obtaining the value of x at a given value of y in a
> smooth.spline object
> 
> I have some data fit to a smooth.spline object as follows: (x=vector of
> data
> for the predictor variable, y=vector of data for the response variable)
> 
> fit <- smooth.spline(x,y)
> 
> Now, given a spline fit point y_new, I want to be able to find out what
> value of x_new yielded this fit value. How to do so?
> (This problem is the inverse of the predict.smooth.spline function,
> which
> takes x_new as input and yields the corresponding y_new fit value)
> 
> Any insight is much appreciated!
> 
> Thanks,
> Kavitha
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to label and unlabel points on scatterplot with mouse pointer

2009-08-12 Thread Greg Snow
Here is one approach that works on Windows (but not other platforms):

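HWidentify <- function(x, y, label = seq_along(x),
                       xlab = deparse(substitute(x)),
                       ylab = deparse(substitute(y)), ...) {

    plot(x, y, xlab = xlab, ylab = ylab, ...)

    # point positions in normalized device coordinates, the units
    # in which getGraphicsEvent reports the mouse position
    dx <- grconvertX(x, to = 'ndc')
    dy <- grconvertY(y, to = 'ndc')

    # mouse-move handler: label the nearest point, or clear the label
    mm <- function(buttons, xx, yy) {
        d <- (xx - dx)^2 + (yy - dy)^2
        if (all(d > .01)) {
            # no point close enough: redraw without any label
            plot(x, y, xlab = xlab, ylab = ylab, ...)
            return()
        }
        w <- which.min(d)
        plot(x, y, xlab = xlab, ylab = ylab, ...)
        points(x[w], y[w], cex = 2, col = 'red')
        text(grconvertX(xx, from = 'ndc'), grconvertY(yy, from = 'ndc'),
             label[w], col = 'green', adj = c(0, 0))
        return()
    }

    # mouse-down handler: a non-NULL return value ends the event loop
    md <- function(buttons, xx, yy) {
        if (any(buttons == '2')) return(1)
        return()
    }

    getGraphicsEvent('Right Click to exit', onMouseMove = mm,
                     onMouseDown = md)
    invisible()
}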

tmpx <- runif(25)
tmpy <- rnorm(25)
HWidentify(tmpx,tmpy,LETTERS[1:25], pch=letters)

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Hitesh Singla
> Sent: Tuesday, August 11, 2009 10:50 PM
> To: r-help@r-project.org
> Subject: [R] How to label and unlabel points on scatterplot with mouse
> pointer
> 
> Dear all,
> 
> How can I label/unlabel points on scatterplot with mouse pointer. As
> the
> mouse approches near to point, it should label the closest point, then
> unlabel when it moves away.
> 
> How can I do in R? I be very thankful.
> 
> Thanks and Regards,
> 
> Hitesh Singla
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Games in R

2009-08-12 Thread Bjørn Arild Mæland
Hi,

There's a couple of games listed on crantastic: http://crantastic.org/tags/games

-Bjorn

2009/8/12 David Croll :
> Hi everybody - this is an oddball question.
>
>
> I wonder if anybody has programmed any games in R, such as Sudoku,
> Tic-Tac-Toe and the like. Or even a flight simulator...
>
>
> R mateys! Let's make some t-tests!
>
>
> Regards, David
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem loading ncdf library on MAC

2009-08-12 Thread Steve Lianoglou

Hi,

On Aug 12, 2009, at 2:23 PM, Dan Kelley wrote:

> I think ncdf is broken now, and so is Rnetcdf.  I can't build from source,
> either.  If I find a solution, I'll post it here.  FYI, I have an
> intel-based Mac running the latest OS and with R 2.9.1


While I haven't compiled it myself, it seems that Rnetcdf requires a  
library that you probably don't have installed. Check out its build  
report:


http://www.r-project.org/nosvn/R.check/r-release-macosx-ix86/RNetCDF-00install.html

It's probably looking for the udunits library 
(http://www.unidata.ucar.edu/software/udunits/ ?) and you don't have it.


I just tried to compile ncdf on my machine, and it also failed ..  
looks like you need netcdf.h. Just look at the status/reporting that  
building the package gives:



checking netcdf.h usability... no
checking netcdf.h presence... no
checking for netcdf.h... no
checking /usr/local/include/netcdf.h usability... no
checking /usr/local/include/netcdf.h presence... no
checking for /usr/local/include/netcdf.h... no
checking /usr/include/netcdf.h usability... no
checking /usr/include/netcdf.h presence... no
...

I guess you just need to get the required libs and try again.

Or are you getting different errors?

-steve


--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Attached file following download failure

2009-08-12 Thread Fowler, Mark

Hello,

I'm working with a package that uses download.file in functions to
extract information from remote databases. My current environment is
Windows XP Pro SP3, R 2.7. A full extraction can be a great deal of
data, so the download is accomplished in generally manageable packets,
such that a single download will result in many files, which are written
to a directory. It is not uncommon for a progressing download to fail
unexpectedly midstream (firewall issues, server crashes/reboots, etc).
When this occurs the last file remains attached (and empty), at least as
far as Windows is concerned. I need to delete it to properly resume
downloading. Windows won't let me do that unless I exit R, as it regards
the file as in use. And unlink won't do it, although it doesn't report
an error either. And if I try to unlink the folder containing the
problem file, it deletes all files in the folder except the attached
one, does not delete the folder, and again no message to indicate any
problem. Any way to release the file without exiting and restarting R?


> showConnections(all=TRUE)
  description class      mode text   isopen   can read can write
0 "stdin"     "terminal" "r"  "text" "opened" "yes"    "no"
1 "stdout"    "terminal" "w"  "text" "opened" "no"     "yes"
2 "stderr"    "terminal" "w"  "text" "opened" "no"     "yes"

> unlink("D:\\sharks\\KalmanFilter\\F34520\\AG2008072_2008074_sst.xyz")

#does nothing, no message (status 0)

> unlink("D:\\sharks\\KalmanFilter\\F34520",recursive=TRUE)

#does not delete the folder, deletes any files in the folder EXCEPT the
attached one, no message (status 0)

Mark Fowler
Population Ecology Division
Bedford Inst of Oceanography
Dept Fisheries & Oceans
Dartmouth NS Canada
B2Y 4A2
Tel. (902) 426-3529
Fax (902) 426-9710
Email fowl...@mar.dfo-mpo.gc.ca
Home Tel. (902) 461-0708
Home Email mark.fow...@ns.sympatico.ca



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] biOps load problem

2009-08-12 Thread Jonathan Lees


I am trying to install and use the biOps package on my x86_64 GNU/Linux
system.
I have installed the fftw-3.2.2.tar.gz
and
biOps_0.2.1.tar.gz
but when I start R and try to load the library,
I get this message:

##
> library(biOps)
Error in dyn.load(file, DLLpath = DLLpath, ...) :
 unable to load shared library 
'/usr/local/lib64/R/library/biOps/libs/biOps.so':
 /usr/local/lib64/R/library/biOps/libs/biOps.so: undefined symbol: 
fftw_execute

Error: package/namespace load failed for 'biOps'
##

I am not clear what is missing -
the installation did not indicate any problem, or any missing
libraries.

Thanks for any help on this.


--
==
Prof. Jonathan M. Lees
Department of Geological Sciences
104 South Road, CB #3315, Mitchell Hall
University of North Carolina
Chapel Hill, NC  27599-3315
(919) 962-0695
FAX (919) 966-4519

jonathan_l...@unc.edu
http://www.unc.edu/~leesj

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Map of UK Counties - to use in R

2009-08-12 Thread Hisaji ONO
Hello.


 Shapefiles in geographic coordinates for Epi Info are available at
 http://www.cdc.gov/epiinfo/europe.htm (second-level boundaries as of
 1998, freely available but not public domain).


 Regards.


--- Roger Bivand  wrote:

> 
> The illustration you show is for the so-called traditional or historical
> counties of England, which may be available somewhere. There are
> non-georeferenced PNG files on Wikipedia, which might be used, but as far as
> I can see, only UK-based academics can register for access to the edina UK
> borders datasets.
> 
> One possibility is to use the 2006 NUTS boundaries shapefile from
> GISCO/EUROSTAT at:
> 
> http://epp.eurostat.ec.europa.eu/portal/page/portal/gisco/geodata/reference
> 
> http://epp.eurostat.ec.europa.eu/cache/GISCO/geodatafiles/NUTS_03M_2006_SH.zip
> 
> and in R using something like:
> 
> library(rgdal)
> RG <- readOGR(".", "NUTS_RG_03M_2006")
> names(RG)
> UK <- grep("^UK", RG$NUTS_ID)
> RG_UK <- RG[UK,]
> plot(RG_UK, axes=TRUE)
> summary(RG_UK)
> 
> You'll then need to find the regions you want, possibly from:
> 
> http://www.statistics.gov.uk/geography/nuts.asp
> 
> so that you can retain only England, and choose the NUTS* boundaries that
> suit your "counties" - which are not presently well-defined because of
> boundary and administrative changes. The GISCO shapefile is in geographical
> coordinates, so you'll be able to overplot points by longitude and latitude.
> 
> Hope this helps,
> 
> Roger Bivand
> 
> 
> Raoul wrote:
> > 
> > Hi,
> > Can anyone help me with either of these:
> > 1) Map of the UK counties that I could use in R?
> > 2) How could I use an existing map for example, a map from here
> > http://www.itraveluk.co.uk/maps/england.html - in R. I need to use a UK
> > map to plot locations on it by lat & long.
> > 
> > Would appreciate help on any of these.
> > Thanks,
> > Raoul
> > 
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using bold font with bquote

2009-08-12 Thread Jonathan R. Blaufuss
Scott,
  Your suggestion works great for changing the numbers to bold font, but is it 
possible to change the sigma symbol and the equals sign to bold font as well? 
I've poked around ?plotmath and am I right in saying that there is a different 
method for controlling symbol fonts? 

Thank you for your help,

Jonathan

----- Original Message -----
From: "Scott Sherrill-Mix" 
To: "Jonathan R. Blaufuss" 
Cc: r-help@r-project.org
Sent: Wednesday, August 12, 2009 12:43:12 PM GMT -06:00 US/Canada Central
Subject: Re: [R] Using bold font with bquote

From ?plotmath, it looks like when using expressions you set the font
inside the expression (e.g. bold(x)). It looks like you tried this already
but I wonder if there was something tiny out of place since the
following works for me:

text(25000,0.3,bquote(bold(sigma==.(mySigma)),list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
col='blue')

Scott

Scott Sherrill-Mix
Department of Microbiology
University of Pennsylvania
402B Johnson Pavilion
3610 Hamilton Walk
Philadelphia, PA  19104-6076



On Wed, Aug 12, 2009 at 12:36 PM, Jonathan R.
Blaufuss wrote:
> I'm trying to annotate a density plot and I'm using bquote to paste the sigma 
> symbol next
> to the numeric text of the standard deviation calculation that I am 
> performing.
> I have been able to successfully turn the sigma symbol and numeric output the 
> color blue,
> but when I try to change the font of the text to bold, R doesn't seem to 
> recognize the "font="
> command in the same way here as it does with "col=". (My code is below)
>
>        set.seed(1)
>        Data=rnorm(100,sd=1)
>        plot(density(Data))
>        text(25000,0.3,
>                bquote(sigma==.(mySigma),
>                list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
>                col="blue")
>
> After searching the help files I've tried using the expression command with 
> "bold()" as well
> as inserting "font=2" after the color command. However, I can't seem to get 
> it to work.
>
> Can someone please point me to a resource that will help me figure this out?
>
> Thank You,
>
> Jonathan
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem loading ncdf library on MAC

2009-08-12 Thread Dan Kelley

I think ncdf is broken now, and so is Rnetcdf.  I can't build from source,
either.  If I find a solution, I'll post it here.  FYI, I have an
intel-based Mac running the latest OS and with R 2.9.1
-- 
View this message in context: 
http://www.nabble.com/problem-loading-ncdf-library-on-MAC-tp23195195p24942133.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Games in R

2009-08-12 Thread David Croll

Hi everybody - this is an oddball question.


I wonder if anybody has programmed any games in R, such as Sudoku, 
Tic-Tac-Toe and the like. Or even a flight simulator...



R mateys! Let's make some t-tests!


Regards, David

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nominal to numeric function

2009-08-12 Thread Dimitris Rizopoulos

another approach may be:

foo <- c("blue", "red", "green")
s.foo <- sample(foo, 20, TRUE)
s.foo
match(s.foo, foo)
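
For comparison, a small sketch of how match() relates to factor codes; the short s.foo vector here is fixed (rather than sampled) so the results can be read off directly. With explicit levels, factor() gives the same index as match(); without them the levels are sorted, so the codes differ.

```r
foo   <- c("blue", "red", "green")
s.foo <- c("red", "green", "blue", "red")

# Index in the order given by foo: blue = 1, red = 2, green = 3
idx_match  <- match(s.foo, foo)                         # 2 3 1 2

# Same result via a factor with explicitly ordered levels
idx_factor <- as.integer(factor(s.foo, levels = foo))   # 2 3 1 2

# Default factor levels are sorted: blue = 1, green = 2, red = 3
idx_alpha  <- as.integer(factor(s.foo))                 # 3 2 1 3
```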


I hope it helps.

Best,
Dimitris


Noah Silverman wrote:

Hi,

Thanks for the tip,

Neither method works as my data is truly nominal

-
foo <- c("blue", "red", "green")
as.numeric(foo)
[1] NA NA NA
Warning message:
NAs introduced by coercion


Rapid Miner has a function that will automatically create an "index" of 
the values and create a new variable (or replace the existing one).  It also 
has a second function that will break the nominal labels into n binary 
variables:


i.e.

red(0,1)
blue(0,1)
green(0,1)


Or,  I guess I could go back to the source that generates my data and 
institute a numeric key for the nominal items.  That seems like the 
longest way, but probably the safest to get what I want.


-N



On 8/12/09 9:10 AM, Phil Spector wrote:

It's generally safer to use

   as.numeric(as.character(variablename))

since it eliminates problems associated with factors.

   - Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Wed, 12 Aug 2009, Daniel Malter wrote:



Hi, you can use newvariable=as.numeric(variablename). This converts your
factors into numeric variables, but not always with the desired result. So
make sure that you check whether "newvariable" gives you what you want.
Otherwise recoding by hand is indicated.

Best,
Daniel



Noah Silverman-3 wrote:


Hi,

I'm training an SVM (C-classification from e1071 library)

Some of the variables in my data set are nominal.  Is there some
easy/automatic way to convert them to numerical representations?

Thanks,

-N











--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nominal to numeric function

2009-08-12 Thread Noah Silverman

Hi,

Thanks for the tip,

Neither method works as my data is truly nominal

-
foo <- c("blue", "red", "green")
as.numeric(foo)
[1] NA NA NA
Warning message:
NAs introduced by coercion


Rapid Miner has a function that will automatically create an "index" of 
the values and create a new variable (or replace the existing one).  It also 
has a second function that will break the nominal labels into n binary 
variables:


i.e.

red(0,1)
blue(0,1)
green(0,1)


Or,  I guess I could go back to the source that generates my data and 
institute a numeric key for the nominal items.  That seems like the 
longest way, but probably the safest to get what I want.


-N



On 8/12/09 9:10 AM, Phil Spector wrote:

It's generally safer to use

   as.numeric(as.character(variablename))

since it eliminates problems associated with factors.

   - Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Wed, 12 Aug 2009, Daniel Malter wrote:



Hi, you can use newvariable=as.numeric(variablename). This converts your
factors into numeric variables, but not always with the desired result. So
make sure that you check whether "newvariable" gives you what you want.
Otherwise recoding by hand is indicated.

Best,
Daniel



Noah Silverman-3 wrote:


Hi,

I'm training an SVM (C-classification from e1071 library)

Some of the variables in my data set are nominal.  Is there some
easy/automatic way to convert them to numerical representations?

Thanks,

-N










__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using bold font with bquote

2009-08-12 Thread Scott Sherrill-Mix
From ?plotmath, it looks like when using expressions you set the font
inside the expression (e.g. bold(x)). It looks like you tried this already
but I wonder if there was something tiny out of place since the
following works for me:

text(25000,0.3,bquote(bold(sigma==.(mySigma)),list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
col='blue')
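
A self-contained variant of the same idea, as a sketch only: the label is built first with bquote() and then passed to text(). The position (0, 0.1) is an assumption chosen so that the label falls inside the density plot of these simulated data. Note that ?plotmath states that bold() does not apply to symbols such as Greek letters, so the formatted number is drawn bold while sigma stays in the symbol font.

```r
pdf(NULL)  # null device so the sketch also runs without a screen device

set.seed(1)
Data <- rnorm(100, sd = 1)
plot(density(Data))

# Format the standard deviation as text, then splice it into the expression
mySigma <- format(round(sd(Data), digits = 3), big.mark = ",")
lab     <- bquote(bold(sigma == .(mySigma)))

# (0, 0.1) is an illustrative position inside the plotted range
text(0, 0.1, lab, col = "blue")

dev.off()
```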

Scott

Scott Sherrill-Mix
Department of Microbiology
University of Pennsylvania
402B Johnson Pavilion
3610 Hamilton Walk
Philadelphia, PA  19104-6076



On Wed, Aug 12, 2009 at 12:36 PM, Jonathan R.
Blaufuss wrote:
> I'm trying to annotate a density plot and I'm using bquote to paste the sigma 
> symbol next
> to the numeric text of the standard deviation calculation that I am 
> performing.
> I have been able to successfully turn the sigma symbol and numeric output the 
> color blue,
> but when I try to change the font of the text to bold, R doesn't seem to 
> recognize the "font="
> command in the same way here as it does with "col=". (My code is below)
>
>        set.seed(1)
>        Data=rnorm(100,sd=1)
>        plot(density(Data))
>        text(25000,0.3,
>                bquote(sigma==.(mySigma),
>                list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
>                col="blue")
>
> After searching the help files I've tried using the expression command with 
> "bold()" as well
> as inserting "font=2" after the color command. However, I can't seem to get 
> it to work.
>
> Can someone please point me to a resource that will help me figure this out?
>
> Thank You,
>
> Jonathan
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R numeric string problem

2009-08-12 Thread Bert Gunter
You need to read up about finite precision arithmetic and floating point
representation. In brief, note that 2^64 requires 20 decimal digits, and
some bits in double precision must be given up to sign, exponent, etc.,
leaving 53 bits of precision, or about 16 decimal digits. This is exactly the
number of digits in the numeric representation that "match" your string. All
digits after that are artifacts of the binary representation, not data.

If you just need to keep the string as a string and not manipulate it as a
numeric, then read it in as a character variable, not a numeric. If you need
to manipulate it exactly as a numeric, check out Ryacas or some other
computer algebra package that is capable of arbitrary precision arithmetic.
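
A short illustration of the point, using the 38-digit string from the question; the object names are made up. Converting to numeric keeps only the leading significant digits, while the character version is preserved exactly.

```r
id <- "03200801200801172008011720092904008901"

nchar(id)                # 38 characters, including the leading zero
.Machine$double.digits   # 53 bits in the significand of a double

# Converting to numeric drops the leading zero and all digits beyond
# the precision of a double
as_num <- as.numeric(id)
back   <- sprintf("%.0f", as_num)

# Only the leading digits agree with the original string
substr(back, 1, 15)
substr(id, 2, 16)

# Kept as character, the value is unchanged
identical(as.character(id), id)
```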

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Andrew C
Sent: Wednesday, August 12, 2009 9:44 AM
To: r-help@r-project.org
Subject: [R] R numeric string problem


Hi,

I have a text (.dat) file, in which each row contains several long numeric
strings.  One of the strings is 38 digits long, for example:

03200801200801172008011720092904008901

When I read in the data file, this string shows up as 3.200801e+36.  To get
rid of the scientific notation, I used "options(scipen=999)."  When I did
this, the scientific notation went away, but the numeric string was
incorrect.  It showed as:

3200801200801172223262666846080882062

Why would the number be incorrect?  All of the other strings within this row
are correct.

Thanks,

Andrew



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Obtaining the value of x at a given value of y in a smooth.spline object

2009-08-12 Thread Kavitha Venkatesan
I have some data fit to a smooth.spline object as follows: (x=vector of data
for the predictor variable, y=vector of data for the response variable)

fit <- smooth.spline(x,y)

Now, given a spline fit point y_new, I want to be able to find out what
value of x_new yielded this fit value. How to do so?
(This problem is the inverse of the predict.smooth.spline function, which
takes x_new as input and yields the corresponding y_new fit value)

Any insight is much appreciated!

Thanks,
Kavitha

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Map of UK Counties - to use in R

2009-08-12 Thread Roger Bivand

The illustration you show is for the so-called traditional or historical
counties of England, which may be available somewhere. There are
non-georeferenced PNG files on Wikipedia, which might be used, but as far as
I can see, only UK-based academics can register for access to the edina UK
borders datasets.

One possibility is to use the 2006 NUTS boundaries shapefile from
GISCO/EUROSTAT at:

http://epp.eurostat.ec.europa.eu/portal/page/portal/gisco/geodata/reference

http://epp.eurostat.ec.europa.eu/cache/GISCO/geodatafiles/NUTS_03M_2006_SH.zip

and in R using something like:

library(rgdal)
RG <- readOGR(".", "NUTS_RG_03M_2006")
names(RG)
UK <- grep("^UK", RG$NUTS_ID)
RG_UK <- RG[UK,]
plot(RG_UK, axes=TRUE)
summary(RG_UK)

You'll then need to find the regions you want, possibly from:

http://www.statistics.gov.uk/geography/nuts.asp

so that you can retain only England, and choose the NUTS* boundaries that
suit your "counties" - which are not presently well-defined because of
boundary and administrative changes. The GISCO shapefile is in geographical
coordinates, so you'll be able to overplot points by longitude and latitude.

Hope this helps,

Roger Bivand


Raoul wrote:
> 
> Hi,
> Can anyone help me with either of these:
> 1) Map of the UK counties that I could use in R?
> 2) How could I use an existing map for example, a map from here
> http://www.itraveluk.co.uk/maps/england.html - in R. I need to use a UK
> map to plot locations on it by lat & long.
> 
> Would appreciate help on any of these.
> Thanks,
> Raoul
> 

-- 
View this message in context: 
http://www.nabble.com/Map-of-UK-Counties---to-use-in-R-tp24930435p24941090.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] R numeric string problem

2009-08-12 Thread Henrique Dallazuanna
Use the colClasses argument of read.table to set the class of each column.

For a file with two columns, where the first is a string and the second is
numeric:

read.table('your_file.dat', colClasses = c('character', 'numeric'))
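
A small sketch of why this helps, using a one-line in-memory file (the second
column, 12.5, is invented): a double holds only about 15 to 17 significant
digits, so a 38-digit value cannot survive conversion to numeric, whereas a
character column keeps every digit and the leading zero.

```r
# One row: the 38-digit identifier from the question plus a made-up numeric field
txt <- "03200801200801172008011720092904008901 12.5"

# Default: the first column is converted to a double and loses precision
as_num <- read.table(text = txt)
class(as_num$V1)

# With colClasses the identifier stays a string, digit for digit
as_chr <- read.table(text = txt, colClasses = c("character", "numeric"))
as_chr$V1
nchar(as_chr$V1)
```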

On Wed, Aug 12, 2009 at 1:43 PM, Andrew C  wrote:

>
> Hi,
>
> I have a text (.dat) file, in which each row contains several long numeric
> strings.  One of the strings is 38 digits long, for example:
>
> 03200801200801172008011720092904008901
>
> When I read in the data file, this string shows up as 3.200801e+36.  To get
> rid of the scientific notation, I used "options(scipen=999)."  When I did
> this, the scientific notation went away, but the numeric string was
> incorrect.  It showed as:
>
> 3200801200801172223262666846080882062
>
> Why would the number be incorrect?  All of the other strings within this
> row are correct.
>
> Thanks,
>
> Andrew
>
>
> --
> View this message in context:
> http://www.nabble.com/R-numeric-string-problem-tp24940459p24940459.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



[R] R numeric string problem

2009-08-12 Thread Andrew C

Hi,

I have a text (.dat) file, in which each row contains several long numeric
strings.  One of the strings is 38 digits long, for example:

03200801200801172008011720092904008901

When I read in the data file, this string shows up as 3.200801e+36.  To get
rid of the scientific notation, I used "options(scipen=999)."  When I did
this, the scientific notation went away, but the numeric string was
incorrect.  It showed as:

3200801200801172223262666846080882062

Why would the number be incorrect?  All of the other strings within this row
are correct.

Thanks,

Andrew


-- 
View this message in context: 
http://www.nabble.com/R-numeric-string-problem-tp24940459p24940459.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] paste first row string onto every string in column

2009-08-12 Thread Jill Hollenbach

Thanks so much everybody, this has been incredibly helpful--not only is my
immediate issue solved but I've learned a lot in the process. The lapply
solution is best for me, as I need flexibility to edit df's with varying
numbers of columns. 

Now, one more question: after appending the string from the first line, I am
manipulating the df further (recoding the original contents; this I have
working fine), and afterwards I will need to strip that string back off. It
seems relatively straightforward, except that, as shown in the example above
(df2), there is an asterisk involved (I need to remove all characters up to
and including the asterisk), which seems problematic.
Any suggestions? 
Many thanks,
Jill
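
One way to do this, sketched on made-up values: the asterisk is a
regular-expression metacharacter, so it has to be escaped when it is meant
literally. The data frame name df2 follows the message above; its contents
and the helper name strip_prefix are invented here.

```r
# Invented stand-in for the recoded data frame
df2 <- data.frame(V1 = c("DPA1*0103", "DPA1*0104"),
                  V3 = c("DPB1*0401", "DPB1*0601"),
                  stringsAsFactors = FALSE)

# Remove everything up to and including the first asterisk.
# [^*]* matches any run of non-asterisk characters; \\* is a literal asterisk.
strip_prefix <- function(x) sub("^[^*]*\\*", "", x)

# Apply to every column, keeping the data frame structure
df2[] <- lapply(df2, strip_prefix)
df2
```

Strings without an asterisk are left unchanged by this pattern.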



Don MacQueen wrote:
> 
> Let's start with something simple and relatively easy to understand, 
> since you're new to this.
> 
> First, here's an example of the core of the idea:
>>  paste('a',1:4)
> [1] "a 1" "a 2" "a 3" "a 4"
> 
> Make it a little closer to your situation:
>>  paste('a*',1:4, sep='')
> [1] "a*1" "a*2" "a*3" "a*4"
> 
> Sometimes it helps to save the number of rows in your dataframe in a 
> new variable
> 
> nr <- nrow(df)
> 
> Then, for your first column, the "a*" in the above example is df$V1[1]
> For the 1:4 in the example, you use  df$V1[ 2:nr]
> Put it together and you have:
> 
> dfnew <- df
> dfnew$V1[ 2:nr] <- paste( dfnew$V1[1], dfnew$V1[ 2:nr], sep='' )
> 
> But you can use "-1" instead of "2:nr", and you get
> 
>dfnew$V1[ -1 ] <- paste( dfnew$V1[1], dfnew$V1[ -1], sep='' )
> 
> That's how you can do it one column at a time.
> Since you have only four columns, just do the same thing to V2, V3, and
> V4.
> 
> But if you want a more general method, one that works no matter how 
> many columns you have, and no matter what they are named, then you 
> can use lapply() to loop over the columns. This is what Patrick 
> Connolly suggested, which is
> 
> as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))
> 
> Note, though, that this will do it to all columns, so if you ever 
> happen to have a dataframe where you don't want to do all columns, 
> you'll have to be a little trickier with the lapply() solution.
> 
> -Don
> 
> At 6:48 PM -0700 8/11/09, Jill Hollenbach wrote:
>>Hi,
>>I am trying to edit a data frame such that the string in the first line is
>>appended onto the beginning of each element in the subsequent rows. The
>>data looks like this:
>>
>>>  df
>>     V1    V2    V3    V4
>>1 DPA1* DPA1* DPB1* DPB1*
>>2  0103  0104  0401  0601
>>3  0103  0103  0301  0402
>>.
>>.
>>  and what I want is this:
>>
>>>dfnew
>>         V1        V2        V3        V4
>>1     DPA1*     DPA1*     DPB1*     DPB1*
>>2 DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
>>3 DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402
>>
>>any help is much appreciated, I am new to this and struggling.
>>Jill
>>
>>___
>>  Jill Hollenbach, PhD, MPH
>> Assistant Staff Scientist
>> Center for Genetics
>> Children's Hospital Oakland Research Institute
>> jhollenb...@chori.org
>>
>>--
>>View this message in context: 
>>http://*www.*nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24928720.html
>>Sent from the R help mailing list archive at Nabble.com.
>>
> 
> 
> -- 
> --
> Don MacQueen
> Environmental Protection Department
> Lawrence Livermore National Laboratory
> Livermore, CA, USA
> 925-423-1062
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24939755.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] getting dates from a ts object

2009-08-12 Thread Gabor Grothendieck
time(obj) gives the times of a ts object, obj.
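
A minimal sketch of how the result of time() could be carried into a data
frame for ggplot2. The series below is random data standing in for data.gas,
and the conversion to Date via cycle() is one option among several, not the
only way to do it.

```r
# Made-up monthly series standing in for data.gas (3 years, 3 variables)
set.seed(1)
data.gas <- ts(matrix(rnorm(36 * 3), ncol = 3,
                      dimnames = list(NULL, c("gas", "price", "income"))),
               start = c(1970, 1), frequency = 12)

# time() gives decimal years; cycle() gives the month number within each year
yr <- floor(as.numeric(time(data.gas)) + 1e-6)  # small offset guards against rounding
mo <- as.integer(cycle(data.gas))

# First day of each month as a Date, followed by the three series
gas.df <- data.frame(date = as.Date(sprintf("%d-%02d-01", as.integer(yr), mo)),
                     as.data.frame(data.gas))
head(gas.df, 3)
```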

On Wed, Aug 12, 2009 at 12:51 PM, Data Analytics
Corp. wrote:
> Hi,
>
> I have a ts object called data.gas that has three economic variables:
> gasoline consumption (gas), price per gallon (price), and household income
> (income) by month from January 1970 to December 2008.  I want to plot this
> gas and price variables using ggplot2, but ggplot2 will not allow an object
> of class mts or ts; it has to be a data frame.  When I use
> as.data.frame(data.gas), the date information is lost.  How can I capture
> the date information in a vector so I can add it to a data frame created
> using as.data.frame?
>
> Thanks,
>
> Walt
>
>
>
> --
> 
>
> Walter R. Paczkowski, Ph.D.
> Data Analytics Corp.
> 44 Hamilton Lane
> Plainsboro, NJ 08536
> 
> (V) 609-936-8999
> (F) 609-936-3733
> dataanalyt...@earthlink.net
> www.dataanalyticscorp.com
>
>



[R] getting dates from a ts object

2009-08-12 Thread Data Analytics Corp.

Hi,

I have a ts object called data.gas that has three economic variables: 
gasoline consumption (gas), price per gallon (price), and household 
income (income) by month from January 1970 to December 2008.  I want to 
plot this gas and price variables using ggplot2, but ggplot2 will not 
allow an object of class mts or ts; it has to be a data frame.  When I 
use as.data.frame(data.gas), the date information is lost.  How can I 
capture the date information in a vector so I can add it to a data frame 
created using as.data.frame?


Thanks,

Walt



--


Walter R. Paczkowski, Ph.D.
Data Analytics Corp.
44 Hamilton Lane
Plainsboro, NJ 08536

(V) 609-936-8999
(F) 609-936-3733
dataanalyt...@earthlink.net
www.dataanalyticscorp.com



[R] Using bold font with bquote

2009-08-12 Thread Jonathan R. Blaufuss
I'm trying to annotate a density plot and I'm using bquote to paste the sigma
symbol next to the numeric text of the standard deviation calculation that I
am performing. I have been able to successfully turn the sigma symbol and
numeric output the color blue, but when I try to change the font of the text
to bold, R doesn't seem to recognize the "font=" command in the same way here
as it does with "col=". (My code is below)

set.seed(1)
Data=rnorm(100,sd=1)
plot(density(Data))
text(25000,0.3,
bquote(sigma==.(mySigma),
list('mySigma'=format(round(sd(Data),digits=3),big.mark=","))),
col="blue")

After searching the help files I've tried using the expression command with
"bold()" as well as inserting "font=2" after the color command. However, I
can't seem to get it to work.

Can someone please point me to a resource that will help me figure this out?
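
A sketch of one option, assuming the goal is a bold blue label: wrap the
whole expression in plotmath's bold(). As ?plotmath notes, bold() does not
apply to symbols, so the number is drawn bold while the Greek sigma stays in
the regular symbol font. The x position 2 is an assumption, chosen to fall
inside the plotted range.

```r
set.seed(1)
Data <- rnorm(100, sd = 1)

# Formatted standard deviation, substituted into the expression by bquote()
mySigma <- format(round(sd(Data), digits = 3), big.mark = ",")
lab <- bquote(bold(sigma == .(mySigma)))

plot(density(Data))
text(2, 0.3, lab, col = "blue")
```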

Thank You,

Jonathan



Re: [R] Superscripts in axis label

2009-08-12 Thread Eik Vettorazzi

oops, there was a little problem with an automated text replacement in
my last post. m^3 should be read as m ^ 3 (without white space).

or, using curly brackets for safety, try

plot(1,ylab=expression("Soil moisture content"~~bgroup("(",m^{3}*m^{-3},")")))

hth.
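
The full call from the follow-up could be written as below; this is only a
sketch, with invented values for sm and wf and a hypothetical variable name
ylab_expr. Keeping the expression in its own variable makes it easier to
balance the parentheses, and the string stays on a single line so that no
line break ends up inside it.

```r
# Made-up data standing in for the droughting-gradient measurements
wf <- 1:6
sm <- c(0.31, 0.27, 0.22, 0.18, 0.15, 0.11)

# Build the label once, then pass it to plot()
ylab_expr <- expression("Soil moisture content" ~~ bgroup("(", m^{3} * m^{-3}, ")"))

# cex.lab is given as a number rather than a string
plot(sm ~ wf, xlab = "Levels of droughting gradient", ylab = ylab_expr,
     bty = "l", pch = 16, las = 1, cex.lab = 1.13)
```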


Buckmaster, Sarah wrote:

Thanks Eik, I just tried the following:

plot(sm~wf, type="n", xlab="Levels of droughting gradient", ylab=expression("Soil moisture 
content"~~bgroup("(",m^3*m^{-3},")"))), bty="l", font.main="2", pch=16, las=1, cex.lab="1.13")

and got error messages, saying 'unexpected string constant' - have I missed 
something?

Sarah


From: Eik Vettorazzi [e.vettora...@uke.uni-hamburg.de]
Sent: 12 August 2009 16:41
To: Buckmaster, Sarah
Cc: r-help@r-project.org
Subject: Re: [R] Superscripts in axis label

Hi Sarah,
expression works well, but you have to use it as a function.

plot(1,ylab=expression("Soil moisture content"~~bgroup("(",m^3*m^{-3},")")))

see
?plotmath
for further examples

hth.

Buckmaster, Sarah wrote:
  

Hi All,

I am trying to label the y-axis on my scatterplot with the following:
"Soil moisture content (m3m-3)"

I am using the following coding for plotting the graph:
plot(soilmoisture~gradientlevel, xlab="Levels of droughting gradient", ylab="Soil moisture content (m3m-3)", 
bty="l", font.main="2", pch=16, las=1, cex.lab="1.13")

I have tried to incorporate 'expression' into this coding for the ylab as 
follows (with also a few variations):
plot(soilmoisture~gradientlevel, xlab="Levels of droughting gradient", ylab=expression"Soil moisture content 
(m^3m^-3)", bty="l", font.main="2", pch=16, las=1, cex.lab="1.13")

...but this isn't working and the second m never seems to come up with
superscript 3 after it. I'm guessing I have tried to incorporate 'expression'
too simply!

Would someone please advise me as to the coding for this?

Many thanks

Sarah





--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790



--
Mandatory disclosures under the German Act on Electronic Commercial Registers,
Cooperative Registers and the Company Register (EHUG):

Universitätsklinikum Hamburg-Eppendorf
Public-law corporation
Place of jurisdiction: Hamburg

Members of the Board:
Prof. Dr. Jörg F. Debatin (Chairman)
Dr. Alexander Kirstein
Ricarda Klein
Prof. Dr. Dr. Uwe Koch-Gromus
  




Re: [R] nominal to numeric function

2009-08-12 Thread Phil Spector

It's generally safer to use

   as.numeric(as.character(variablename))

since it eliminates problems associated with factors.
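
A short illustration of the difference, on made-up values. The last two lines
are an added aside for variables that are truly nominal (labels that are not
numbers), where dummy coding with model.matrix() is one possible numerical
representation.

```r
# A factor whose labels happen to be numbers (illustrative values)
f <- factor(c("10", "20", "5", "20"))

as.numeric(f)                 # internal level codes, not the labels
as.numeric(as.character(f))   # the numbers the labels stand for

# A nominal variable without numeric labels: one 0/1 column per level
colour <- factor(c("red", "green", "blue", "green"))
model.matrix(~ colour - 1)
```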

   - Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Wed, 12 Aug 2009, Daniel Malter wrote:



Hi you can use newvariable=as.numeric(variablename). This converts your
factors into numeric variables, but not always with the desired result. So
make sure that you check whether "newvariable" gives you what you want.
Otherwise recoding by hand is indicated.

Best,
Daniel



Noah Silverman-3 wrote:


Hi,

I'm training an SVM (C-classification from e1071 library)

Some of the variables in my data set are nominal.  Is there some
easy/automatic way to convert them to numerical representations?

Thanks,

-N





--
View this message in context: 
http://www.nabble.com/nominal-to-numeric-function-tp24930466p24939723.html
Sent from the R help mailing list archive at Nabble.com.






Re: [R] nominal to numeric function

2009-08-12 Thread Daniel Malter

Hi you can use newvariable=as.numeric(variablename). This converts your
factors into numeric variables, but not always with the desired result. So
make sure that you check whether "newvariable" gives you what you want.
Otherwise recoding by hand is indicated.

Best,
Daniel



Noah Silverman-3 wrote:
> 
> Hi,
> 
> I'm training an SVM (C-classification from e1071 library)
> 
> Some of the variables in my data set are nominal.  Is there some 
> easy/automatic way to convert them to numerical representations?
> 
> Thanks,
> 
> -N
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/nominal-to-numeric-function-tp24930466p24939723.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] inserting into data frame gives "invalid factor level, NAs generated"

2009-08-12 Thread Scott Sherrill-Mix
You're running into the pretty common factor vs character problem in R.
By default data.frame turns character vectors into factor (sort of
like ENUM in mysql) vectors. Since your starting dataframe contains only
one factor level (the empty string ''), R sees an unknown level when you
insert new data and complains. You'd probably be pretty safe using
character columns instead of factors for now (by adding
stringsAsFactors=FALSE to data.frame()) e.g.:

goframe <- data.frame(goA = character(10), goB = character(10),
                      value = numeric(10), stringsAsFactors = FALSE)
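
As a follow-on sketch (not part of the original question): once the columns
are character, assigning the row as a list rather than with c() keeps the
value column numeric.

```r
goframe <- data.frame(goA = character(10), goB = character(10),
                      value = numeric(10), stringsAsFactors = FALSE)

# c("AA", "BB", 0.4) would coerce 0.4 to the string "0.4";
# a list keeps each element's own type
goframe[1, ] <- list("AA", "BB", 0.4)

str(goframe[1, ])
```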

Scott

Scott Sherrill-Mix
Department of Microbiology
University of Pennsylvania
402B Johnson Pavilion
3610 Hamilton Walk
Philadelphia, PA  19104-6076



On Wed, Aug 12, 2009 at 11:49 AM,  wrote:
> I am calculating some values that I am inserting into a data frame. From
> what I have read, creating the dataframe ahead of time is more efficient,
> since rbind (so far the only solution I have found to appending to a data
> frame) is not very fast.
>
> What I am doing is the following:
>
> # create data frame
>
> goframe = data.frame(goA = character(10), goB = character(10), value =
> numeric(10))
> goframe[1,] = c("AA", "BB", 0.4)
>
> Result is:
>
>> goframe[1,] = c("AA", "BB", 0.4)
> Warning messages:
> 1: In `[<-.factor`(`*tmp*`, iseq, value = "AA") :
>  invalid factor level, NAs generated
> 2: In `[<-.factor`(`*tmp*`, iseq, value = "BB") :
>  invalid factor level, NAs generated
>>
>
> Is there another/better/more recomended way of doing this? If not, how do
> I do this without getting all the warnings?
>
> Thanks!
>
> Best,
>
> Karin Lagesen
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] paste first row string onto every string in column

2009-08-12 Thread Don MacQueen
Let's start with something simple and relatively easy to understand, 
since you're new to this.


First, here's an example of the core of the idea:

 paste('a',1:4)

[1] "a 1" "a 2" "a 3" "a 4"

Make it a little closer to your situation:

 paste('a*',1:4, sep='')

[1] "a*1" "a*2" "a*3" "a*4"

Sometimes it helps to save the number of rows in your dataframe in a 
new variable


nr <- nrow(df)

Then, for your first column, the "a*" in the above example is df$V1[1]
For the 1:4 in the example, you use  df$V1[ 2:nr]
Put it together and you have:

   dfnew <- df
   dfnew$V1[ 2:nr] <- paste( dfnew$V1[1], dfnew$V1[ 2:nr], sep='' )

But you can use "-1" instead of "2:nr", and you get

  dfnew$V1[ -1 ] <- paste( dfnew$V1[1], dfnew$V1[ -1], sep='' )

That's how you can do it one column at a time.
Since you have only four columns, just do the same thing to V2, V3, and V4.

But if you want a more general method, one that works no matter how 
many columns you have, and no matter what they are named, then you 
can use lapply() to loop over the columns. This is what Patrick 
Connolly suggested, which is


   as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))

Note, though, that this will do it to all columns, so if you ever 
happen to have a dataframe where you don't want to do all columns, 
you'll have to be a little trickier with the lapply() solution.
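
A sketch of how the lapply() idea could be adapted, assuming character (not
factor) columns. The helper name add_prefix and the choice of V1 and V2 as
the subset are hypothetical. Unlike the one-liner above, this version keeps
the first row in place.

```r
df <- data.frame(V1 = c("DPA1*", "0103", "0103"),
                 V2 = c("DPA1*", "0104", "0103"),
                 V3 = c("DPB1*", "0401", "0301"),
                 V4 = c("DPB1*", "0601", "0402"),
                 stringsAsFactors = FALSE)

# Prefix rows 2..n with the first-row string, leaving row 1 as it is
add_prefix <- function(x) c(x[1], paste(x[1], x[-1], sep = ""))

# All columns
dfnew <- df
dfnew[] <- lapply(df, add_prefix)

# Only some columns (here V1 and V2, chosen arbitrarily)
cols <- c("V1", "V2")
dfsome <- df
dfsome[cols] <- lapply(df[cols], add_prefix)
```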


-Don

At 6:48 PM -0700 8/11/09, Jill Hollenbach wrote:

Hi,
I am trying to edit a data frame such that the string in the first line is
appended onto the beginning of each element in the subsequent rows. The data
looks like this:


 df
     V1    V2    V3    V4
1 DPA1* DPA1* DPB1* DPB1*
2  0103  0104  0401  0601
3  0103  0103  0301  0402
.
.
 and what I want is this:


dfnew
         V1        V2        V3        V4
1     DPA1*     DPA1*     DPB1*     DPB1*
2 DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
3 DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402

any help is much appreciated, I am new to this and struggling.
Jill

___
 Jill Hollenbach, PhD, MPH
Assistant Staff Scientist
Center for Genetics
Children's Hospital Oakland Research Institute
jhollenb...@chori.org

--
View this message in context: 
http://*www.*nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24928720.html

Sent from the R help mailing list archive at Nabble.com.




--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062



[R] inserting into data frame gives "invalid factor level, NAs generated"

2009-08-12 Thread karinlag
I am calculating some values that I am inserting into a data frame. From
what I have read, creating the dataframe ahead of time is more efficient,
since rbind (so far the only solution I have found to appending to a data
frame) is not very fast.

What I am doing is the following:

# create data frame

goframe = data.frame(goA = character(10), goB = character(10), value =
numeric(10))
goframe[1,] = c("AA", "BB", 0.4)

Result is:

> goframe[1,] = c("AA", "BB", 0.4)
Warning messages:
1: In `[<-.factor`(`*tmp*`, iseq, value = "AA") :
  invalid factor level, NAs generated
2: In `[<-.factor`(`*tmp*`, iseq, value = "BB") :
  invalid factor level, NAs generated
>

Is there another/better/more recomended way of doing this? If not, how do
I do this without getting all the warnings?

Thanks!

Best,

Karin Lagesen



Re: [R] C-statistic comparison with partially paired datasets

2009-08-12 Thread Hanneke Wijnhoven

Frank,

Thank you for your quick response!

I want to compare the discriminative capacity of different 
anthropometric measures in predicting mortality, focussing on the "thin" 
side of these measures.
Since these associations are not linear (U-shaped for BMI and inversely 
J-shaped for mid-upper arm circumference) and I do not want to include 
the prediction by "obesity", I am using all values below the median of 
each separate measure to calculate a C-statistic (below the median, the 
association is approximately linear).

As a result, some different and some overlapping cases are included.
I understand your point though.

Any suggestion is welcome.

Hanneke

Frank E Harrell Jr wrote:

Hanneke Wijnhoven wrote:
Does anyone know of an R function or method to compare two 
C-statistics (Harrell's C - rcorr.cens) obtained from 2 different 
models in partially paired datasets (i.e. some similar and some 
different cases), with one continuous independent variable in each 
separate model (in a survival analysis context)?
I have noticed that the rcorrp.cens function can be used for paired 
data.

  Thanks for any help,

Hanneke Wijnhoven



Hanneke,

I'm having trouble seeing how the unpaired observations can contribute 
information in general.  If for example all of the observations were 
unpaired, one C-statistic might be larger because it came from a 
dataset with more extreme observations that were easier to discriminate.


Frank




--
Hanneke A.H. Wijnhoven (PhD) 
Institute of Health Sciences 
Vrije Universiteit Amsterdam 
De Boelelaan 1085 
1081 HV Amsterdam 
The Netherlands 
Tel. +31 (0) 20 5989951 
Fax. +31 (0) 20 5986940 
hanneke.wijnho...@falw.vu.nl




[R] Generating logistic regression data for specific ORs

2009-08-12 Thread Denis Aydin

Dear R-users

I want to generate data for a logistic regression for an epidemiological 
simulation.


First, I created a "disease-vector" containing a "1" if a subject is a 
case (i.e. has the disease) and a "0" if a subject is a control. E.g.:



> disease <- as.factor( c(rep(1, n.cases), rep(0, n.controls)) )


Then, I want to generate two lognormally distributed exposure vectors, 
one for cases and one for controls.
The parameters of the distributions should be chosen in a way that a 
logistic regression model has a specific OR (or beta1) for the exposure. 
Something like that:



> exp.cases <- rlnorm(n.cases, meanlog = mean.cases, sdlog = sd.cases)
> exp.contr <- rlnorm(n.controls, meanlog = mean.controls, sdlog = sd.controls)
> exposure <- c(exp.cases, exp.contr)
> model <- glm(disease ~ exposure, family = binomial)


Unfortunately, I don't know how to generate the exposure vectors in a 
way that the logistic regression has a specific beta1 or OR.
In particular, I want the control over the parameters of the exposure 
distributions of cases and controls.
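
One possible route, sketched under assumptions that go beyond the question:
if log(exposure) is normal in both groups with a common standard deviation
sigma and means mu0 (controls) and mu1 (cases), then the log odds of disease
are linear in log(exposure) with slope (mu1 - mu0) / sigma^2. Fixing sigma
and mu0 and setting mu1 = mu0 + beta1 * sigma^2 therefore gives a chosen
beta1, but for log(exposure) as the covariate, not for exposure itself. All
numeric values below are illustrative.

```r
set.seed(1)
n.cases    <- 5000
n.controls <- 5000

beta1  <- log(2)   # target log odds ratio per unit of log(exposure) (assumed)
sd.log <- 1        # common sdlog in cases and controls (assumed)
mean.controls <- 0
mean.cases    <- mean.controls + beta1 * sd.log^2

disease  <- c(rep(1, n.cases), rep(0, n.controls))
exposure <- c(rlnorm(n.cases,    meanlog = mean.cases,    sdlog = sd.log),
              rlnorm(n.controls, meanlog = mean.controls, sdlog = sd.log))

# Fit on the log scale, where the relationship is linear in the log odds
model <- glm(disease ~ log(exposure), family = binomial)
coef(model)["log(exposure)"]        # estimate of beta1
exp(coef(model)["log(exposure)"])   # estimated odds ratio
```

With unequal standard deviations in the two groups the log odds would also
contain a quadratic term in log(exposure), so this shortcut no longer applies.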


Could anyone help me on that?

Any help is appreciated.

Denis
--
Denis Aydin
Institute of Social and Preventive Medicine at Swiss Tropical Institute 
Basel

Associated Institute of the University of Basel
Steinengraben 49 – 4051 Basel – Switzerland
Phone: +41 (0)61 270 22 04
Fax:   +41 (0)61 270 22 25
denis.ay...@unibas.ch
www.ispm-unibasel.ch



Re: [R] Superscripts in axis label

2009-08-12 Thread Eik Vettorazzi

Hi Sarah,
expression works well, but you have to use it as a function.

plot(1,ylab=expression("Soil moisture content"~~bgroup("(",m^3*m^{-3},")")))


see
?plotmath
for further examples

hth.

Buckmaster, Sarah wrote:

Hi All,

I am trying to label the y-axis on my scatterplot with the following:
"Soil moisture content (m3m-3)"

I am using the following coding for plotting the graph:
plot(soilmoisture~gradientlevel, xlab="Levels of droughting gradient", ylab="Soil moisture content (m3m-3)", 
bty="l", font.main="2", pch=16, las=1, cex.lab="1.13")

I have tried to incorporate 'expression' into this coding for the ylab as 
follows (with also a few variations):
plot(soilmoisture~gradientlevel, xlab="Levels of droughting gradient", ylab=expression"Soil moisture content 
(m^3m^-3)", bty="l", font.main="2", pch=16, las=1, cex.lab="1.13")

...but this isn't working and the second m never seems to come up with
superscript 3 after it. I'm guessing I have tried to incorporate 'expression'
too simply!

Would someone please advise me as to the coding for this?

Many thanks

Sarah

  


--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790





Re: [R] axis scale

2009-08-12 Thread Don MacQueen
Without a short reproducible example your question is difficult to 
understand, and difficult to give a useful answer.

See comments below.

-Don

At 7:13 AM -0700 8/12/09, maram salem wrote:

Dear All,
I'm trying to plot a histogram (with the relative frequencies as the 
Y axis), But the scale of the y axis is given by

0e+00, 1e-04, 2e-04, 3e-04,.

Now, I have 2 questions
1- Does (1e-04=0.01831563)?


No, 1e-4 does not equal 0.01831563.
1e-4 = 0.0001
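
A quick check at the console may help here. The value 0.01831563 in the
question looks like exp(-4), that is, Euler's number raised to the power -4,
whereas 1e-04 in R means 1 times 10 to the power -4.

```r
1e-04 == 0.0001    # "e" in a numeric literal means "times ten to the power"
exp(-4)            # a different quantity, roughly 0.0183

# Tick values can be printed without scientific notation if preferred
format(c(0, 1e-04, 2e-04, 3e-04), scientific = FALSE)
```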

2- If this true,how can i change the given scale to 
(0.01,0.03,0.05,0.07,0.09)?


Since it's not true I suppose you don't need to change the scale?

If the relative frequency of some bin is 0.0001, why would you want 
to label it 0.01? That would be incorrect.


I find it somewhat difficult to produce a histogram, using hist(), 
with a relative frequency scale, and with relative frequencies on the 
order of 10 to the minus 4. This is another reason why a reproducible 
example is necessary.




--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062



[R] Help for R (Advanced matrix addition (large or undefined number of matrices)

2009-08-12 Thread Lina Rusyte
Dear Sirs,
 
I would like to ask what function I can use to add matrices. I 
couldn’t find any information about it in the manual or on the internet.
(A+B works when the number of matrices is small; the function sum() is not 
suitable for matrix addition, because it sums all the elements of the 
matrices and returns a single number, not a matrix).
I would be very thankful for your help.
 
Best regards,
Lina 
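
A sketch of one common answer to this kind of question, assuming the matrices all have the same dimensions and are collected in a list (the list name mats and its contents are made up): Reduce() applies `+` pairwise across the list, so the number of matrices does not need to be known in advance.

```r
# any number of equally sized matrices, held in a list
mats <- list(matrix(1:4, 2), matrix(5:8, 2), matrix(9:12, 2))

# element-wise sum of all matrices in the list
total <- Reduce(`+`, mats)

# alternative: stack into a 3-d array and sum over the third dimension
arr <- array(unlist(mats), dim = c(dim(mats[[1]]), length(mats)))
total2 <- rowSums(arr, dims = 2)

total
```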


  


Re: [R] SystemFit

2009-08-12 Thread Arne Henningsen
Hi Ferdogan,

Sorry for the late response.

On Thu, Jul 23, 2009 at 8:29 AM, Ferdogan wrote:
> I have two products which are substitutes. I try to fit a system as below to
> mydata.
>
> Demand1 = A1 -B1*Price1 + C1*Price2
> Demand2 = A2 +B2*Price1 - C2*Price2
>
> I would expect C1 & B2 to be symmetric, if they are truly substitutes. How
> can I enforce this symmetry when creating a system of equations via
> SystemFit ?

Please take a look at
   http://www.jstatsoft.org/v23/i04/
I suggest that you estimate the system without the restriction first
and then try to impose the restriction (and test it!). If you still do
not manage to estimate the (restricted) system, please don't hesitate
to send another email with a more precise/specific question to this
mailing list.
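
The restriction in the question (C1 = B2) is a linear cross-equation restriction. systemfit has its own arguments for imposing such restrictions, documented in the paper linked above. The base-R sketch below does not use systemfit; it only illustrates the idea on simulated data (all variable names and coefficient values are made up) by stacking both equations into one least-squares regression in which the two cross-price terms share a single coefficient. Unlike SUR, it ignores correlation between the errors of the two equations.

```r
set.seed(42)
n <- 200
price1 <- runif(n, 1, 10)
price2 <- runif(n, 1, 10)
# simulated demands with a common cross-price effect of 0.5 (assumption)
demand1 <- 20 - 1.0 * price1 + 0.5 * price2 + rnorm(n, sd = 0.5)
demand2 <- 15 + 0.5 * price1 - 0.8 * price2 + rnorm(n, sd = 0.5)

# Stack both equations. 'cross' holds Price2 for equation 1 and Price1 for
# equation 2, so one coefficient is shared by both equations (C1 = B2).
stacked <- data.frame(
  y     = c(demand1, demand2),
  eq1   = rep(c(1, 0), each = n),      # intercept of equation 1
  eq2   = rep(c(0, 1), each = n),      # intercept of equation 2
  own1  = c(price1, rep(0, n)),        # own-price term, equation 1
  own2  = c(rep(0, n), price2),        # own-price term, equation 2
  cross = c(price2, price1)
)
fit_r <- lm(y ~ 0 + eq1 + eq2 + own1 + own2 + cross, data = stacked)

# Unrestricted version: separate cross-price coefficients per equation
stacked$cross1 <- c(price2, rep(0, n))
stacked$cross2 <- c(rep(0, n), price1)
fit_u <- lm(y ~ 0 + eq1 + eq2 + own1 + own2 + cross1 + cross2, data = stacked)

anova(fit_r, fit_u)   # F test of the symmetry restriction
coef(fit_r)
```

Estimating the unrestricted model first and then testing the restriction follows the order suggested in the reply.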

Best wishes,
Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name



Re: [R] Symbolic references - passing variable names into functions

2009-08-12 Thread Gabor Grothendieck
Returning the changed value as in Erik's answer is probably the
most common and R-like solution but here are two others.  The
assign/get approach is perhaps the closest to what you are asking
for.

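# replacement function approach

# replacement function - value is a formula; the column named on its
# left-hand side is overwritten with the column named on its right-hand side
"dataf<-" <- function(data, value) {
  v <- all.vars(value)          # e.g. Time ~ demand gives c("Time", "demand")
  data[[v[1]]] <- data[[v[2]]]
  data                          # a replacement function must return the object
}
BOD <- datasets::BOD
BOD
dataf(BOD) <- Time ~ demand
BOD

# assign/get approach

# looks up the data frame by name in the caller's environment, modifies it,
# and writes it back under the same name
dataf <- function(data, col1, col2, env = parent.frame()) {
  data.name <- deparse(substitute(data))
  data <- get(data.name, env)
  data[col1] <- data[col2]
  assign(data.name, data, env)
}
BOD <- datasets::BOD
BOD
dataf(BOD, "Time", "demand")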
BOD
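
For the literal example in the question quoted below (use the value of a as a variable name), base R's assign() and get() behave roughly like the hypothetical "(operator)". A minimal sketch:

```r
a <- "myvar"
assign(a, 1)   # creates a variable called myvar in the current environment
myvar          # 1
get(a)         # looks the variable up by name, also 1
```

Inside functions, returning the modified object is usually easier to follow than assigning into the caller's environment.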

On Wed, Aug 12, 2009 at 10:27 AM, Alexander Shenkin wrote:
> Hello All,
>
> I am trying to write a function which would operate on columns of a
> dataframe specified in parameters passed to that function.
>
>    f = function(dataf, col1 = "column1", col2 = "column2") {
>        dataf$col1 = dataf$col2 # just as an example
>    }
>
> The above, of course, does not work as intended.  In some languages one
> can force evaluation of a variable, and then use that evaluation as the
> variable name.  Thus,
>
>    > a = "myvar"
>    > (operator)a = 1
>    > myvar
>    [1] 1
>
> Is there some operator which allows this symbolic referencing in R?
>
> Thanks,
> Allie
>


Re: [R] CCF for hourly time series?

2009-08-12 Thread Katharina Appel
Hi Andreas, thank you for replying so soon. You are right, your code works 
perfectly for my example. Unfortunately, the real data contains inner NAs, 
which I forgot to include. Sorry about that. Any other suggestions?

Thanks,
Katharina
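
One possible workaround when the series contain inner NAs, offered as a sketch rather than a drop-in replacement for ccf(): compute the correlation at each lag from the pairs that are complete at that lag. The helper name lagged_cor and the sample data are hypothetical, and the estimates differ slightly from ccf(), which uses the overall means and a common denominator for all lags.

```r
# Correlation between x[t + k] and y[t] for k = -lag.max, ..., lag.max,
# using only the pairs without NA at each lag (same lag convention as ccf()).
lagged_cor <- function(x, y, lag.max = 3) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  lags <- -lag.max:lag.max
  r <- sapply(lags, function(k) {
    if (k >= 0) {
      xs <- x[(1 + k):n]
      ys <- y[1:(n - k)]
    } else {
      xs <- x[1:(n + k)]
      ys <- y[(1 - k):n]
    }
    cor(xs, ys, use = "complete.obs")
  })
  setNames(r, lags)
}

# made-up hourly values with a leading and an inner NA
y <- c(NA, 1.5, 3, NA, 7, 1, 0.1, 2.4)
z <- c(0.3, 0.35, 0.7, 0.72, 0.72, 0.8, 0.4, 0.5)
res <- lagged_cor(y, z, lag.max = 2)
res
```

Because the rows keep their positions, the hourly spacing is preserved; dropping the NA rows before calling ccf() would silently shift the lags.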
-------- Original Message --------
> Date: Wed, 12 Aug 2009 16:29:26 +0200
> From: Andreas Hary 
> To: Katharina Appel 
> Subject: Re: [R] CCF for hourly time series?

> I have replaced na.pass by na.omit (along with some other, relatively minor
> changes) and it works for me:
> 
> x<-as.POSIXct(c("2008-12-25 16:00:00", "2008-12-25 17:00:00",
>   "2008-12-25 18:00:00", "2008-12-25 19:00:00", "2008-12-25 20:00:00",
>   "2008-12-25 21:00:00"))
> y<-c(NA, 1.5,3,7, 1, 0.1)
> z<-c(0.3,0.35,0.7,0.72,0.72,0.8)
> x<-as.data.frame(x)
> prec<-cbind(x,y)
> theta<-cbind(x,z)
> ccf(ts(y),ts(z),lag.max=24,type='correlation',na.action=na.omit)
> 
> Hope it'll work for you too. Cheers,
> 
> Andreas
> 
> 
> 
> 
> 
> 
> 
> On Wed, Aug 12, 2009 at 4:15 PM, Katharina Appel  wrote:
> 
> > Hello,
> >
> > I have a dataframe containing various time series (not time series objects
> > though!) with hourly time steps. I'd like to perform ccf, for I need to know
> > the correlation factors for different lags.
> > Here is an example:
> >
> > x<-as.POSIXct(c("2008-12-25 16:00:00", "2008-12-25 17:00:00",
> >   "2008-12-25 18:00:00", "2008-12-25 19:00:00", "2008-12-25 20:00:00",
> >   "2008-12-25 21:00:00"))
> > y<-c(NA, 1.5,3,7, 1, 0.1)
> > z<-c(0.3,0.35,0.7,0.72,0.72,0.8)
> > x<-as.data.frame(x)
> > prec<-cbind(x,y)
> > theta<-cbind(x,z)
> >
> > ccf(ts(mat1[,4]),ts(mat1[,j]),lag.max=24,type='correlation',na.action=na.pass)
> >
> >
> > that gives me "error in na.fail.default(as.ts(x)): missing values in object"
> > In order to create the data frame I already intersected different time
> > series so I know that the timeindex is the same for all rows and missing
> > values were replaced by NA.
> > Does that already impede creating a ts-object?
> > Is there another possibility to perform the cross-correlation?
> >
> > Thanks in advance,
> > Katharina
> > --
> > ___
> > Katharina Appel
> > Große Weinmeisterstraße 30
> > 14469 Potsdam
> >
> > Tel.: 0331-2370828
> > e-mail: kaj...@gmx.de
> >

-- 
___
Katharina Appel
Große Weinmeisterstraße 30
14469 Potsdam

Tel.: 0331-2370828
e-mail: kaj...@gmx.de



Re: [R] ggplot2: override facet names in facet_wrap?

2009-08-12 Thread hadley wickham
That's on the to do list :(
Hadley
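
Until that exists, the workaround mentioned in the quoted question (changing the factor levels before plotting) can be sketched in base R. The data frame, column names and labels below are made up; the relabelled column would then be the one used for faceting:

```r
# made-up data with a faceting factor g
df <- data.frame(g = factor(c("a", "b", "a", "c")), y = 1:4)

# keep the original column and add a copy with display labels
df$g_label <- factor(df$g,
                     levels = c("a", "b", "c"),
                     labels = c("Control", "Low dose", "High dose"))

levels(df$g_label)
```

Keeping the original column untouched means the relabelling affects only the plot, not later analysis code that refers to the old level names.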

On Tue, Aug 11, 2009 at 10:00 PM, Ben Bolker wrote:
>
>
>  Thanks.  I can get it to work for facet_grid (which will do for my current
> purposes) but am curious about whether there's a way to do the same
> for facet_wrap (which doesn't have a "labeller" argument)?
>
>  cheers
>    Ben
>
>
> hadley wrote:
>>
>> Have a look at the code and examples of label_value and label_both.
>> They should suggest how to write your own labeller to do what you
>> want.
>>
>> Hadley
>>
>> On Tue, Aug 11, 2009 at 1:44 PM, Ben Bolker wrote:
>>>
>>>  just a quick question (to which I suspect the answer is "no"):
>>> does anyone know if, in the ggplot2 package, there's a way to
>>> override the default names of the facets in facet_wrap (which
>>> correspond to the levels of the factor used to facet)?  I know
>>> that I go back and change the levels of the factor, but it would
>>> be convenient to be able to supply a vector of level names at
>>> the time of plotting ...
>>>
>>>  cheers
>>>    Ben Bolker
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> http://had.co.nz/
>>
>>
>>
>
>



-- 
http://had.co.nz/



[R] Superscripts in axis label

2009-08-12 Thread Buckmaster, Sarah
Hi All,

I am trying to label the y-axis on my scatterplot with the following:
"Soil moisture content (m3m-3)"

I am using the following coding for plotting the graph:
plot(soilmoisture~gradientlevel, xlab="Levels of droughting gradient", 
ylab="Soil moisture content (m3m-3)", bty="l", font.main="2", pch=16, las=1, 
cex.lab="1.13")

I have tried to incorporate 'expression' into this coding for the ylab as 
follows (with also a few variations):
plot(soilmoisture~gradientlevel, xlab="Levels of droughting gradient", 
ylab=expression"Soil moisture content (m^3m^-3)", bty="l", font.main="2", 
pch=16, las=1, cex.lab="1.13")

...but this isn't working and the second m never seems to come up with 
superscript 3 after it. I'm guessing I have tried to incorporate 'expression' 
too simply!

Would someone please advise me as to the coding for this?

Many thanks

Sarah
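
A sketch of how such a label is usually written with plotmath: expression() takes unquoted mathematical notation, and paste() inside it joins quoted text with the superscripted pieces. The data below are simulated stand-ins for soilmoisture and gradientlevel, and the numeric arguments are passed as numbers rather than strings:

```r
set.seed(1)
gradientlevel <- rep(1:5, each = 4)                        # made-up data
soilmoisture  <- 0.4 - 0.05 * gradientlevel + rnorm(20, sd = 0.02)

# quoted text is drawn as is; m^3 and m^-3 are drawn with superscripts
ylab_expr <- expression(paste("Soil moisture content (", m^3, " ", m^-3, ")"))

plot(soilmoisture ~ gradientlevel,
     xlab = "Levels of droughting gradient",
     ylab = ylab_expr,
     bty = "l", pch = 16, las = 1, cex.lab = 1.13)
```

See ?plotmath for the full list of supported notation.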



[R] Re : Re : Zoo and numeric data

2009-08-12 Thread Inchallah Yarab
Thank you Gavin very much for this explanation!!!

inchallah yarab





From : Gavin Simpson 

Cc : r-help@r-project.org
Sent : Wednesday, 12 August 2009, 14:50:37
Subject : Re: [R] Re : Zoo and numeric data

On Wed, 2009-08-12 at 10:03 +, Inchallah Yarab wrote:
> why you don't use read.csv2 (you save your file.csv) and you write
> read.csv2("path file",sep=",")

No you don't!!! Please understand what read.csv2 is for. It is for
locales where the "," is used as the decimal point, e.g. 5,2323 ==
5.2323. As such, you can't use the comma as a separator otherwise you'd
be splitting on all the decimal points.

From ?read.csv2

    read.csv(file, header = TRUE, sep = ",", quote="\"", dec=".",
              fill = TRUE, comment.char="", ...)

    read.csv2(file, header = TRUE, sep = ";", quote="\"", dec=",",
              fill = TRUE, comment.char="", ...)

So by setting sep = "," you are creating all sorts of trouble for
yourself. If you are in a locale that uses "," as the decimal point, then
using read.csv2 with sep = "," will lose the decimal places, because every
decimal comma is treated as a field separator.

Use the correct function for the job:

      * Use read.csv() if in a locale where CSV files are separated by
        "," and decimal point represented by ".".
      * Use read.csv2() if in a locale where decimal point is "," and
        CSV files are separated by ";".
      * If you have special requirements, use read.table() and set 'sep'
        and 'dec' etc as suits your data.
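
The difference can be checked on small inline examples. This is only a sketch: the two strings are made-up file contents, read through the text argument of read.csv()/read.csv2() instead of from disk.

```r
csv_dot   <- "x,y\n1.5,2.25\n3.0,4.75"    # sep = ",", dec = "."
csv_comma <- "x;y\n1,5;2,25\n3,0;4,75"    # sep = ";", dec = ","

d1 <- read.csv(text = csv_dot)      # right tool for the first layout
d2 <- read.csv2(text = csv_comma)   # right tool for the second layout
isTRUE(all.equal(d1, d2))           # both give the same numeric data frame

# Mixing them up: read.csv2 keeps dec = ",", so with sep = "," the values
# containing "." are no longer recognised as numbers.
bad <- read.csv2(text = csv_dot, sep = ",")
sapply(bad, is.numeric)
```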

And anyway, read.zoo is just another wrapper around read.table to help
with loading time series data into zoo objects. There is nothing wrong
in using it and it has several benefits over reading data in and
converting to zoo separately.

G

> hope this helps
> 
> 
> 
> From : Mark Breman 
> To : r-h...@stat.math.ethz.ch
> Sent : Wednesday, 12 August 2009, 10:46:43
> Subject : [R] Zoo and numeric data
> 
> Hi,
> I have a csv file with different datatypes:
> 
> 2009-01-01, character1, 10, 20.1
> 2009-01-02, character2, 11, 21.1
> 
> (I have attached the file to this post)
> 
> I read this file with read.zoo as I want a zoo/xts timeseries:
> > t = read.zoo("./data.txt", sep=",", dec = ".", header=FALSE)
> 
> If I look at the zoo data all integer/numeric columns are read as
> character:
> > str(t)
> ‘zoo’ series from 2009-01-01 to 2009-01-02
>  Data: chr [1:2, 1:3] " character1" " character2" "10" "11" "20.1" "21.1"
> - attr(*, "dimnames")=List of 2
>  ..$ : NULL
>  ..$ : chr [1:3] "V2" "V3" "V4"
>  Index: Class 'Date'  num [1:2] 14245 14246
> 
> So I try the colClasses parameter with read.zoo but it looks like this does
> not make any difference:
> > t1 = read.zoo("./data.txt", sep=",", dec = ".", header=FALSE,
> colClasses=c("Date", "character", "integer", "numeric"))
> > str(t1)
> ‘zoo’ series from 2009-01-01 to 2009-01-02
>  Data: chr [1:2, 1:3] " character1" " character2" "10" "11" "20.1" "21.1"
> - attr(*, "dimnames")=List of 2
>  ..$ : NULL
>  ..$ : chr [1:3] "V2" "V3" "V4"
>  Index: Class 'Date'  num [1:2] 14245 14246
> 
> Why does read.zoo ignore the colClasses parameter and how do I get
> integer/numeric data into my zoo series?
> 
> Regards,
> 
> -Mark-
> 
> 
> 
>      
>     [[alternative HTML version deleted]]
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson            [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,            [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


  


Re: [R] Symbolic references - passing variable names into functions

2009-08-12 Thread Erik Iverson
I think ONE answer to what you actually want to do might be

f <- function(dataf, col1 = "column1", col2 = "column2") {
  dataf[[col1]] <- dataf[[col2]] # just as an example
  dataf
}
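
A small check of why the `[[` version works while the `$` version in the question does not: `$` takes its argument literally, so dataf$col2 looks for a column actually named "col2". The data frame d and the function g below are made up for illustration.

```r
f <- function(dataf, col1 = "column1", col2 = "column2") {
  dataf[[col1]] <- dataf[[col2]]   # col1/col2 are evaluated to column names
  dataf
}

g <- function(dataf, col1 = "column1", col2 = "column2") {
  dataf$col1 <- dataf$col2         # refers to columns literally named col1/col2
  dataf
}

d <- data.frame(column1 = 1:3, column2 = c(10, 20, 30))

f(d)$column1   # copied from column2
g(d)$column1   # unchanged, since d has no column called "col2"
```

In both cases d itself is left untouched; the function works on a copy, which is why the result has to be returned and assigned by the caller.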

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Alexander Shenkin
Sent: Wednesday, August 12, 2009 9:27 AM
To: r-help@r-project.org
Subject: [R] Symbolic references - passing variable names into functions

Hello All,

I am trying to write a function which would operate on columns of a
dataframe specified in parameters passed to that function.

f = function(dataf, col1 = "column1", col2 = "column2") {
dataf$col1 = dataf$col2 # just as an example
}

The above, of course, does not work as intended.  In some languages one
can force evaluation of a variable, and then use that evaluation as the
variable name.  Thus,

> a = "myvar"
> (operator)a = 1
> myvar
[1] 1

Is there some operator which allows this symbolic referencing in R?

Thanks,
Allie



Re: [R] paste first row string onto every string in column

2009-08-12 Thread Benilton Carvalho
you could stick everything in a 1-liner, but that would make it less  
readable:


myf <- function(x){
  tmp <- as.character(x)
  c(tmp[1], paste(tmp[1], tmp[-1], sep=""))
}
df2 <- as.data.frame(sapply(df, myf))
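
A self-contained run of the function above on the data from the original post. The data frame is retyped by hand here, and stringsAsFactors = FALSE is added so the result holds plain character columns:

```r
# data from the original post, entered as character columns
df <- data.frame(V1 = c("DPA1*", "0103", "0103"),
                 V2 = c("DPA1*", "0104", "0103"),
                 V3 = c("DPB1*", "0401", "0301"),
                 V4 = c("DPB1*", "0601", "0402"),
                 stringsAsFactors = FALSE)

# keep the first element, prefix every later element with it
myf <- function(x) {
  tmp <- as.character(x)
  c(tmp[1], paste(tmp[1], tmp[-1], sep = ""))
}

df2 <- as.data.frame(sapply(df, myf), stringsAsFactors = FALSE)
df2
```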


b


On Aug 12, 2009, at 3:39 AM, milton ruser wrote:


Hi Jill,

Not at all elegant, but it may be useful.
Of course other colleagues will solve this with 1 line command :-)

cheers

milton


df<-read.table(stdin(), head=T, sep=",")
V1,V2,V3,V4
DPA1*,DPA1*,DPB1*,DPB1*
0103,0104,0401,0601
0103,0103,0301,0402

df.new<-as.matrix(df)
for (i in 2:dim(df)[1])
{
  for (j in 1:dim(df)[2])
  {
    df.new[i,j]<-paste(as.character(df[1,j]), as.character(df[i,j]), sep="")
  }
}
df.new<-data.frame(df.new)
df
df.new



On Tue, Aug 11, 2009 at 9:48 PM, Jill Hollenbach >wrote:




Hi,
I am trying to edit a data frame such that the string in the first line is
appended onto the beginning of each element in the subsequent rows. The data
looks like this:


df

V1   V2   V3   V4
1   DPA1* DPA1* DPB1* DPB1*
2   0103 0104 0401 0601
3   0103 0103 0301 0402
.
.
and what I want is this:


dfnew

V1   V2   V3   V4
1   DPA1* DPA1* DPB1* DPB1*
2   DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
3   DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402

any help is much appreciated, I am new to this and struggling.
Jill

___
Jill Hollenbach, PhD, MPH
  Assistant Staff Scientist
  Center for Genetics
  Children's Hospital Oakland Research Institute
  jhollenb...@chori.org



[R] Symbolic references - passing variable names into functions

2009-08-12 Thread Alexander Shenkin
Hello All,

I am trying to write a function which would operate on columns of a
dataframe specified in parameters passed to that function.

f = function(dataf, col1 = "column1", col2 = "column2") {
    dataf$col1 = dataf$col2 # just as an example
}

The above, of course, does not work as intended.  In some languages one
can force evaluation of a variable, and then use that evaluation as the
variable name.  Thus,

> a = "myvar"
> (operator)a = 1
> myvar
[1] 1

Is there some operator which allows this symbolic referencing in R?

Thanks,
Allie


