Re: [R] sum(hist$density) == 2 ?!

2012-03-13 Thread Jeff Newmiller
Your clue is... density!

Probability density is not the same as probability... you have to multiply it 
by something before you can sum it.

Try typing

h

by itself and review your options.
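
To make the hint concrete: the "something" is the bin width. Each bar's probability is its density times its width, so if every default bin is 0.5 wide, the densities sum to 1/0.5 = 2. A minimal sketch follows, using simulated data and an arbitrary seed (both are assumptions here, not the original poster's data):

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
x <- rnorm(1000)

h <- hist(x, plot = FALSE)       # default breaks
widths <- diff(h$breaks)         # bin widths
area <- sum(h$density * widths)  # probability = density * width, summed

# With unit-width bins, density and probability coincide,
# so the plain sum of densities is already 1.
h1 <- hist(x, plot = FALSE, breaks = seq(floor(min(x)), ceiling(max(x))))
area1 <- sum(h1$density)

print(c(default = area, unit_width = area1))
```

This is also why breaks=(-4:4) appeared to "fix" the sum in the original post: those bins are exactly 1 wide.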
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
--- 
Sent from my phone. Please excuse my brevity.

Sam Steingold  wrote:

>> x <- rnorm(1000)
>> h <- hist(x,plot=FALSE)
>> sum(h$density)
>[1] 2 - shouldn't it be 1?!
>
>> h <- hist(x,plot=FALSE, breaks=(-4:4))
>> sum(h$density)
>[1] 1  - now it's 1. why?!
>
>
>
>-- 
>Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X
>11.0.11004000
>http://www.childpsy.net/ http://www.memritv.org
>http://openvotingconsortium.org
>http://thereligionofpeace.com http://mideasttruth.com
>http://palestinefacts.org
>((lambda (x) `(,x ',x)) '(lambda (x) `(,x ',x)))
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.



[R] sum(hist$density) == 2 ?!

2012-03-13 Thread Sam Steingold
> x <- rnorm(1000)
> h <- hist(x,plot=FALSE)
> sum(h$density)
[1] 2 - shouldn't it be 1?!

> h <- hist(x,plot=FALSE, breaks=(-4:4))
> sum(h$density)
[1] 1  - now it's 1. why?!



-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://www.memritv.org http://openvotingconsortium.org
http://thereligionofpeace.com http://mideasttruth.com http://palestinefacts.org
((lambda (x) `(,x ',x)) '(lambda (x) `(,x ',x)))



Re: [R] Visualising multiple response contingency tables

2012-03-13 Thread ilai
Not sure I understand your question (or if there is one), and I am not
familiar with vcd::mosaic. But if you are asking whether there is a simpler
way, then yes:
1. work with ?array and ?aperm
2. create the array directly in R from the original data - not Excel
3. ?mosaicplot (no extra package required - it's in graphics)

Here is what I mean based on your f.tbl:

>> f.tbl = structure(c(10, 15, 25, 45, 30, 50), .Dim = 2:3,
>>   .Dimnames = structure(list(Sex = c("F", "M"),
>>     Responses = c("A", "B", "total subjects")),
>>     .Names = c("Sex", "Responses")), class = "table")

# Calculate the No-A No-B columns:
(ff.tbl <- rbind(f.tbl[,1:2],f.tbl[,3]-f.tbl[,1:2]))
# rearrange to a CxRxB (in this case 2x2x2) array:
dim(ff.tbl) <- c(2,2,2)
# give some names
 dimnames(ff.tbl) <- list(Sex=c('F','M'),c('yes','no'),Response=c('A','B'))
ff.tbl
# plot
 mosaicplot(ff.tbl)
# or plot
mosaicplot(aperm(ff.tbl,3:1))
# or test
apply(ff.tbl, 3 , chisq.test) # and sum the result


Hope this helps get you started
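
For the "sum the result" step, here is a self-contained sketch of the whole calculation. It assumes the uncorrected Pearson statistic (correct = FALSE) is what the modified test wants, and takes df = c(r - 1) from the original question; the dimension name "Answer" is introduced here only for illustration.

```r
# Data from the thread: subjects giving response A or B, plus subjects per row.
f.tbl <- matrix(c(10, 15, 25, 45, 30, 50), nrow = 2,
                dimnames = list(Sex = c("F", "M"),
                                Responses = c("A", "B", "total subjects")))

# yes/no counts per response, reshaped to Sex x Answer x Response
ff.tbl <- rbind(f.tbl[, 1:2], f.tbl[, 3] - f.tbl[, 1:2])
dim(ff.tbl) <- c(2, 2, 2)
dimnames(ff.tbl) <- list(Sex = c("F", "M"), Answer = c("yes", "no"),
                         Response = c("A", "B"))

# One chi-square test per marginal table; small expected counts in the
# B table trigger a warning, which is silenced here.
tests <- lapply(dimnames(ff.tbl)$Response, function(r)
  suppressWarnings(chisq.test(ff.tbl[, , r], correct = FALSE)))

stat <- sum(sapply(tests, function(tt) unname(tt$statistic)))
df <- dim(ff.tbl)[3] * (dim(ff.tbl)[1] - 1)   # c * (r - 1)
p.value <- pchisq(stat, df = df, lower.tail = FALSE)

print(c(statistic = stat, df = df, p.value = p.value))
```

Whether the chi-square reference distribution is adequate for your tables is a separate question; this only automates the arithmetic described in the post.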


> f.tbl
>    Responses
> Sex  A  B total subjects
>   F 10 25             30
>   M 15 45             50
>
>
> The answer I have is to adjust my data and then use the mosaic() function
> in package:vcd; however, I'm not sure that's the best way forward and I
> don't have a very efficient way of getting there. I will present my
> solution so you guys can take a look.
>
> The fundamental problem is that because of the multiple response data, you
> can't simply apply a normal Chi-square test to the contingency table.
> There's a raft of approaches, but I've decided to use a simple technique
> introduced by (A. Agresti, I. Liu, Modeling a categorical variable allowing
> arbitrarily many category choices, Biometrics 55 (1999) 936-43.) and
> refined by Thomas and Decady and Bilder and Loughin. In summary, the test
> statistic (a modified Chi square statistic) is calculated by summing up the
> individual chi-square statistics for each of the c marginal r × 2 tables
> relating the single response variable to the multiple response variable
> with df = c(r - 1). Note that instead of using the row totals (total
> number of responses) the test statistic is calculated with the total number
> of subjects per row.
>
> (phew, I hope that made sense :) ) Unfortunately, my google-research has
> not revealed an easy way to transform my one data table into c x r x 2
> tables for analysis. So I end up having to create the two different tables
> myself, shown below (note that the Not-A/B columns are calculated as the
> difference between the main data column (A/B) and the total number of
> subjects listed above).
>
>> g.mtrx=matrix(c(10,15,20,35),nrow=2)
>> g.tbl=as.table(g.mtrx)
>> dimnames(g.tbl)=list(Sex=c("F","M"),Responses=c("A","Not-A"))
>> g.tbl
>    Responses
> Sex  A  Not-A
>  F  10     20
>  M  15     35
>
>> h.mtrx=matrix(c(25,45,5,5),nrow=2)
>> h.tbl=as.table(h.mtrx)
>> dimnames(h.tbl)=list(Sex=c("F","M"),Responses=c("B","Not-B"))
>> h.tbl
>    Responses
> Sex  B Not-B
>  F 25     5
>  M 45     5
>
>
> If I then perform the normal Chi-square test on each of the two tables
> (chisq.test()) and then sum up the results, I get the answer I want.
> Clearly this is cumbersome, which is why I do it in Excel at the moment (I
> know shame on me). However, I really want to take advantage of the mosaic
> function in vcd. So what I have to do at the moment is create the tables
> above and use abind() (package:abind) to bring my two matrices together to
> form a multidimensional matrix. Example:
>
>> gh.abind = abind(g.mtrx,h.mtrx,along=3)
>> dimnames(gh.abind)=list(Sex=c("F","M"),Responses=c("Yes","No"),Factors=c("A","B"))
>> gh.abind
> , , Factors = A
>
>   Responses
> Sex Yes No
>  F  10 20
>  M  15 35
>
> , , Factors = B
>
>   Responses
> Sex Yes No
>  F  25  5
>  M  45  5
>
> Now I can use the simple mosaic function to plot the combined matrix
>
>> mosaic(gh.abind)
>
> So that's it. I don't use any pearson-r shading in mosaic since I
> don't think it would be appropriate to try and model my weird multiple
> response tables (at the moment), but what I will do is look at the
> odds-ratio table and then manually colour the mosaic cells with high
> odds-ratios (greater than 2).
>
> I am literally having to type all this by hand into R, and as you can
> imagine, it gets cumbersome with large multi column tables (which I
> have). Does anybody have any thoughts on my approach of using mosaic
> for this sort of data? And if so, any insight on how I can be a bit
> slicker with my R code?
>
> All help is appreciated and I hope that this question wasn't too long
> to read through.

Re: [R] multi-histogram plotting

2012-03-13 Thread Sam Steingold
> * David Winsemius  [2012-03-13 17:53:14 -0400]:
> On Mar 13, 2012, at 5:33 PM, Sam Steingold wrote:
>> I can, of course, plot log(h$density), but then the number labels will
>> be wrong.
>
> You could try apply a log transform to the appropriate component of
> the "h" object and using barplot to display the results.

that's what I said above: "plot log(h$density)".
However, the ordinate will be labeled with log values, not the original
values. how do I get the log ticks on the ordinate?
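
One possible way to get log ticks labelled in the original units, offered as a sketch rather than an answer from the thread: plot the density itself and ask base graphics for a log axis with log = "y". The data below are a hypothetical heavy-tailed stand-in, the last break is derived from the data, and the helper name plot_log_density is made up for this example.

```r
set.seed(42)
x <- round(rlnorm(5000, meanlog = 1, sdlog = 1.2)) + 2   # hypothetical data

# Unequal bins similar in spirit to the post; last break covers max(x).
breaks <- unique(c(1:20, 25, 30, 40, 60, 100, max(100, x) + 1))
h <- hist(x, breaks = breaks, plot = FALSE)

keep <- h$density > 0            # empty bins cannot be drawn on a log axis

plot_log_density <- function(h, keep) {
  # log = "y" scales the axis but labels the ticks with untransformed values
  plot(h$mids[keep], h$density[keep], log = "y", type = "h",
       xlab = "x", ylab = "density (log scale)")
}

if (interactive()) plot_log_density(h, keep)
```

Further subsets could then be overlaid with lines() on the same device, since the axis is already logarithmic.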


-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://jihadwatch.org http://www.memritv.org
http://dhimmi.com http://memri.org http://pmw.org.il http://truepeace.org
Profanity is the one language all programmers know best.



[R] twitteR package -- geocode

2012-03-13 Thread z2.0
Question: 

twitteR's searchTwitter() function contains a 'geocode' argument that
returns tweets from users whose location falls within a given radius. 

I'm not completely familiar with the API from which twitteR pulls, but no
mechanism exists to extract location coordinates from the tweets themselves,
correct? 

That is, the best we can do is identify the user-provided location of the
tweeting user, not his/her location at the time of tweeting. Yes?

Thanks,

Zack

--
View this message in context: 
http://r.789695.n4.nabble.com/twitteR-package-geocode-tp4470702p4470702.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writing a .pdf file within a function - what do I need to return()?

2012-03-13 Thread R. Michael Weylandt
See R FAQ 7.22 -- in short, you need to print() your plot to the
graphics device -- just wrap xyplot() in print() and it should work.
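
A sketch of what that change might look like, assuming the lattice package and four-column CSV files as in the original post. The function name plot_isi, its arguments, and the on.exit() cleanup are choices made for this example, not part of the original code.

```r
library(lattice)

# Write one page per CSV file into a single PDF.
plot_isi <- function(csv_path, out_file) {
  csv_files <- list.files(csv_path, pattern = "\\.csv$", full.names = TRUE)
  pdf(file = out_file, width = 10, height = 8)
  on.exit(dev.off())                       # close the device even on error
  for (f in csv_files) {
    raw_df <- read.csv(f)
    names(raw_df) <- c("t", "isi", "logic", "cond")
    # lattice functions return an object; inside a function or loop it
    # is only drawn when print() is called explicitly (R FAQ 7.22).
    print(xyplot(isi ~ t, data = raw_df, ylim = c(0, 1500),
                 ylab = "isi", xlab = "time", main = basename(f)))
  }
  invisible(out_file)
}

# Usage (paths are placeholders):
# plot_isi("~/project/csv by cell", "plots/isi plots.pdf")
```

No return() is needed; at the console the plot appears only because top-level results are auto-printed.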

Michael

On Tue, Mar 13, 2012 at 3:55 PM, Dgnn  wrote:
> I am trying to write a function that generates one PDf containing plots from
> several .csv files within a directory.  When I manually execute the code it
> seems to work, but not when it is a function. I think I need to return()
> something, but haven't had much luck figuring out what/how.
>
> plot.isi<-function(csv.path="~/project/csv by cell") {
>        csv.files<-grep('.csv', list.files(path = csv.path, full.names=T), value=T)
>        pdf(file='plots/isi plots.pdf', width=10, height=8)
>        #par(mfrow=c(2,1)) #ideally 2 plots per page, but will work on details after fx. works
>        for (i in 1:length(csv.files)){
>                raw.df<-read.csv(csv.files[i])
>                names(raw.df)<-c('t','isi','logic','cond')
>                xyplot(isi ~ t, raw.df, ylim=c(0,1500), ylab='isi', xlab='time',
>                                main=basename(csv.files[i]))
>        }
>        dev.off()
> }
>
> Thank you all for the help,
>
> Jason Deignan
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Writing-a-pdf-file-within-a-function-what-do-I-need-to-return-tp4470165p4470165.html
> Sent from the R help mailing list archive at Nabble.com.
>



Re: [R] Rolling regressions with sample extended one period at a time

2012-03-13 Thread R. Michael Weylandt
Perhaps zoo::rollapply from the zoo package can get you started.
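
To illustrate the expanding-window idea itself, here is a base-R sketch that does not use zoo. The data are simulated, and the initial window length (20) is an arbitrary assumption.

```r
set.seed(1)
n <- 60
dat <- data.frame(x = rnorm(n))
dat$y <- 1 + 2 * dat$x + rnorm(n)          # hypothetical series

start <- 20                                # initial sample size (assumed)
ends <- start:n                            # window always begins at row 1

# Refit on rows 1..end for each end point; one row of coefficients per fit.
coefs <- t(sapply(ends, function(end) {
  coef(lm(y ~ x, data = dat[seq_len(end), ]))
}))
rownames(coefs) <- ends

print(head(coefs, 3))
```

The last row equals the full-sample fit, which is a quick sanity check on the window bookkeeping.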

Michael

On Tue, Mar 13, 2012 at 4:49 PM, pie'  wrote:
> Hi,
>
> I would like suggestions as to how to perform rolling regressions with the
> window extended one period at a time. That is, an initial sample period is
> passed to estimation and that very sample is then extended one period at a
> time through the remaining sample. Is there a specific package?
>
> Thnks.
>
> P.
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Rolling-regressions-with-sample-extended-one-period-at-a-time-tp4470316p4470316.html
> Sent from the R help mailing list archive at Nabble.com.



Re: [R] MANOVA and Extra Sums-of-Squares Tests

2012-03-13 Thread John Fox
Dear chris33,

Well, actually as I said, the anova() function *will* do what you want. You can 
fit multivariate linear models with lm(),

mod.1 <- lm(cbind(Y1, Y2, Y3, Y4, Y5) ~ X1*X2 +X1*X3 + X1*X4)
mod.2 <- lm(cbind(Y1, Y2, Y3, Y4, Y5) ~ X1 + X2 + X3 + X4)

and then use anova() to get multivariate tests,

anova(mod.1, mod.2)

See ?anova.mlm for more information.
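
A runnable sketch on simulated data (the data, seed and object names are assumptions for illustration). It also computes the residual sums-of-squares-and-cross-products matrices asked about in the question, using crossprod() on the residuals.

```r
set.seed(1)
n <- 60
d <- data.frame(X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n), X4 = rnorm(n))
Y <- matrix(rnorm(n * 5), n, 5) + 0.5 * d$X1     # hypothetical responses
colnames(Y) <- paste0("Y", 1:5)
d <- cbind(d, Y)

full    <- lm(cbind(Y1, Y2, Y3, Y4, Y5) ~ X1*X2 + X1*X3 + X1*X4, data = d)
reduced <- lm(cbind(Y1, Y2, Y3, Y4, Y5) ~ X1 + X2 + X3 + X4, data = d)

# Residual SSCP matrices (5 x 5) and the "extra" SSCP due to the interactions
E.full    <- crossprod(residuals(full))
E.reduced <- crossprod(residuals(reduced))
H <- E.reduced - E.full

# Multivariate test of the reduced model against the full model
tab <- anova(full, reduced)
print(tab)
```

The default test statistic is Pillai's trace; ?anova.mlm lists the alternatives available through the test argument.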

Best,
 John

On Tue, 13 Mar 2012 14:49:01 -0700 (PDT)
 chris33  wrote:
> Hi John,
> 
> Thanks for your response.  The anova function will not work in my case,
> because I have multiple response variables.  In other words, I would like to
> conduct an extra sums-of-squares and cross-products test between the
> following models:
> 
> FULL.MODEL:   (Y1, Y2, Y3, Y4, Y5) as a function of  X1 + X2 + X3 + X4 +
> X1*X2 +X1*X3 + X1*X4
> REDUCED.MODEL:   (Y1, Y2, Y3, Y4, Y5) as a function of X1 + X2 + X3 + X4 
> 
> So, I suppose that I would need to calculate the residual sum-of-squares and
> cross-product matrices for each of these models as a start.  Any ideas how I
> would go about this in R?  Thanks again,
> 
> Chris   
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/MANOVA-and-Extra-Sums-of-Squares-Tests-tp4470077p4470459.html
> Sent from the R help mailing list archive at Nabble.com.
> 



Re: [R] Amount of memory occupied by data type

2012-03-13 Thread David Winsemius


On Mar 13, 2012, at 7:02 PM, Folkes, Michael wrote:

> Hello all,
> I was under the (false?) assumption that an object that is class
> logical, would take up less memory than an object with class integer.

Nope.

> Below am I correctly showing this is not the case?
>
> This was an attempt to reduce memory usage.

I think there is a package that will do bitwise operations. Yep... all
we needed to do is look:

http://finzi.psych.upenn.edu/R/library/bitops/html/00Index.html

> I'm dealing with two large
> arrays (could be integers).  Their contents are the exact same, but one
> has NA's in random locations.  I thought instead of having the second
> array as an integer, it could be logical and the TRUE vs FALSE could be
> used to update data in the first array.  (but even this idea may be weak
> if I just end up with a third temporary array...)

You probably would, since any assignment is going to create a copy. And
even having a bitwise logical option wouldn't necessarily help, since
the indexing would of necessity be either integer or logical (both
stored as 4-byte values in R).

> I'm running win xp sp3, "R version 2.14.1 (2011-12-22)".

31-bit addressing constraints as well?  (That's so last decade.) You
aren't making life easy for yourself, are you?
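
A quick way to confirm the point about storage, as a sketch: R keeps each logical element in 4 bytes (it has to represent TRUE, FALSE and NA), the same as an integer. Only raw vectors use 1 byte per element, and they cannot hold NA. The vector length below is arbitrary.

```r
n <- 1e6   # arbitrary length, large enough to dwarf the fixed header

sizes <- c(logical = as.numeric(object.size(logical(n))),
           integer = as.numeric(object.size(integer(n))),
           double  = as.numeric(object.size(numeric(n))),
           raw     = as.numeric(object.size(raw(n))))

print(sizes)   # bytes
```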



--

David Winsemius, MD
West Hartford, CT



Re: [R] MANOVA and Extra Sums-of-Squares Tests

2012-03-13 Thread chris33
Hi John,

Thanks for your response.  The anova function will not work in my case,
because I have multiple response variables.  In other words, I would like to
conduct an extra sums-of-squares and cross-products test between the
following models:

FULL.MODEL:   (Y1, Y2, Y3, Y4, Y5) as a function of  X1 + X2 + X3 + X4 +
X1*X2 +X1*X3 + X1*X4
REDUCED.MODEL:   (Y1, Y2, Y3, Y4, Y5) as a function of X1 + X2 + X3 + X4 

So, I suppose that I would need to calculate the residual sum-of-squares and
cross-product matrices for each of these models as a start.  Any ideas how I
would go about this in R?  Thanks again,

Chris   

--
View this message in context: 
http://r.789695.n4.nabble.com/MANOVA-and-Extra-Sums-of-Squares-Tests-tp4470077p4470459.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to write crossed and nested random effects in a model

2012-03-13 Thread Ben Bolker
Niroshan  ucalgary.ca> writes:

> I have a question based on my research. I am analyzing reader-based
> diagnostic data set.  My study involves diabetic patients who were evaluated
> for treatable diabetic retinopathy based on the presence or absence of two
> pathologies in their eyes.  Pathologies were identified using the clinical
> examination (Gold standard method). In addition it can be identified by
> taking digital images of patients’ eyes and this method is cost effective.
> Finally two readers go over the images independently and patients are
> diagnosed as either positive or negative for the pathologies.
> My objective is, estimation the sensitivity and specificity of reader-based
> diagnostic method.
> 
> I am going to fit multivariate probit model. But the problem has complex
> correlation structure. We have three different correlation: readers results
> are correlated, patients left and right eyes are correlated and pathologies
> are correlated since all based on the retina in the eye.
> 
> Could anyone help me out how to address these correlations in a model using
> random effects? 
> 
> Also I think patients and readers are crossed each other since each reader
> go over each patients’ images. And [snip] eyes are nested with patients and
> pathologies are nested with in the eye.  Is this crossed and nested
> interpretation true?  If then how can I include these effects as random
> terms to the model?
> 
> My response is readers ‘ diagnosed values. Per patient I have 8 values (2
> pathologies, left and right eye and 2 readers) 
> Explanatory variables are actual disease status of each pathology for left
> and right eyes.
> 


   I think that *in principle* (if you are using lme4, which is
probably the most convenient option for dealing with crossed REs) you
probably want

 ~ pathology + (pathology|reader)+(pathology|patient/eye)

  The fixed effect term says that pathologies may vary in their
overall frequency.  The first RE term says that different readers can
vary, in a pathology-specific way (if they just differed overall in
their sensitivity you would want (1|reader) instead); the second says
that there is variance among eyes (within patients) in all pathologies
(and that they may be correlated).

  A few cautions about this:

* I'm not sure I got it right

* You might want to forward this (along with my answer, so we're not
starting from scratch) to r-sig-mixed-mod...@r-project.org , where
there is more expertise in mixed models.

* if you have the _same_ two readers for all of your patients (as
opposed to two different readers chosen at random out of a large,
possibly overlapping pool), then it won't be practical to treat them
as a random effect, no matter how much sense it makes philosophically
-- use pathology*reader instead.

* You may need a moderately large amount of data to fit this model ...
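
To show the data layout this formula expects, here is a sketch in long format with one row per patient x eye x pathology x reader (8 rows per patient, as in the question). Column names, level labels and the simulated response are all hypothetical. The commented glmer() call with a probit link is an assumption about how one might fit it with lme4, and it is not run here.

```r
n_patients <- 30                       # hypothetical sample size

d <- expand.grid(patient   = factor(seq_len(n_patients)),
                 eye       = factor(c("left", "right")),
                 pathology = factor(c("P1", "P2")),
                 reader    = factor(c("R1", "R2")))

set.seed(1)
d$diagnosis <- rbinom(nrow(d), 1, 0.3) # placeholder 0/1 reader calls

# Model formula from the reply, with a response on the left-hand side
f <- diagnosis ~ pathology + (pathology | reader) + (pathology | patient/eye)

# Possible fit (needs lme4; shown for orientation only):
# fit <- lme4::glmer(f, data = d, family = binomial(link = "probit"))

print(table(table(d$patient)))         # every patient contributes 8 rows
```

With only two readers, the fixed-effect alternative pathology*reader mentioned in the cautions would replace the (pathology | reader) term.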



[R] Amount of memory occupied by data type

2012-03-13 Thread Folkes, Michael
Hello all,
I was under the (false?) assumption that an object that is class
logical, would take up less memory than an object with class integer.
Below am I correctly showing this is not the case?

This was an attempt to reduce memory usage.  I'm dealing with two large
arrays (could be integers).  Their contents are the exact same, but one
has NA's in random locations.  I thought instead of having the second
array as an integer, it could be logical and the TRUE vs FALSE could be
used to update data in the first array.  (but even this idea may be weak
if I just end up with a third temporary array...)

I'm running win xp sp3, "R version 2.14.1 (2011-12-22)".
Thanks very much.
Michael

__
arr<-array(1:60,c(3,4,5))
arr2<-array(TRUE,c(3,4,5))

mode(arr)
class(arr)
storage.mode(arr)

mode(arr2)
class(arr2)
storage.mode(arr2)

object.size(arr)
object.size(arr2)

__
Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans 
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada
V9T-6N7
Ph (250) 756-7264 Fax (250) 756-7053  michael.fol...@pac.dfo-mpo.gc.ca





Re: [R] sunflower plot, making vectors?

2012-03-13 Thread R. Michael Weylandt
I don't know much about sunflowerplots, but hexagonal binning
might be worth a look in your case if you are generally looking
for a scatterplot but the point density is too high.

http://cran.r-project.org/web/packages/hexbin/index.html

Take a look at the vignettes. You can also hexbin in ggplot2:
http://had.co.nz/ggplot2/stat_binhex.html
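
If the sunflower plot is still wanted, the binning the original question asks about can be sketched in base R: cut both axes into equal-width bins, count points per cell, and pass the bin midpoints plus counts to sunflowerplot() via its number argument. The data below are simulated stand-ins for the day-of-year and kW values, and 100 bins per axis follows the question.

```r
set.seed(1)
n   <- 33000
day <- runif(n, 0, 364)        # hypothetical stand-in for time in days
kw  <- runif(n, 55, 461)       # hypothetical stand-in for power in kW

nb <- 100                      # bins per axis
xb <- seq(min(day), max(day), length.out = nb + 1)
yb <- seq(min(kw),  max(kw),  length.out = nb + 1)

# Bin index of every point (the top edge is included in the last bin)
xi <- findInterval(day, xb, rightmost.closed = TRUE)
yi <- findInterval(kw,  yb, rightmost.closed = TRUE)
counts <- table(factor(xi, levels = 1:nb), factor(yi, levels = 1:nb))

# Bin centers: average of the lower and upper edge of each bin
xmid <- (xb[-1] + xb[-(nb + 1)]) / 2
ymid <- (yb[-1] + yb[-(nb + 1)]) / 2

cells <- expand.grid(x = xmid, y = ymid)   # x varies fastest, like counts
cells$number <- as.vector(counts)
cells <- cells[cells$number > 0, ]

if (interactive()) sunflowerplot(cells$x, cells$y, number = cells$number)
```

With counts this high per cell the petals will likely be unreadable, which is part of why hexagonal binning is suggested above.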

Hope this helps,
Michael

On Tue, Mar 13, 2012 at 5:59 PM, Henry  wrote:
> I'm having a bit of trouble finding and understanding the correct function to
> make numeric vectors to feed the sunflowerplot function.  I have 33k points
> to show and I want to do better than the standard scatter plot.
> I gather that I need two vectors (x and y) of the same count containing the
> "center" value of each bin.
> FYI - I have two pieces of data:
> 1 - x axis - time in days of the year from 0 to 363.98
> 2 - y axis -  power in kW 54.95 to 461.1
>
> e.g. let's say I want 100 bins in each axis, what function(s) do I use.
> Average the head and tail of each bin?
> I'm sure there are several ways to do this.
>
> After this I'll try to fill in the other sunflowerplot function inputs.
> Thanks,
> -Henry
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/sunflower-plot-making-vectors-tp4470486p4470486.html
> Sent from the R help mailing list archive at Nabble.com.
>



[R] sunflower plot, making vectors?

2012-03-13 Thread Henry
I'm having a bit of trouble finding and understanding the correct function to
make numeric vectors to feed the sunflowerplot function.  I have 33k points
to show and I want to do better than the standard scatter plot.
I gather that I need two vectors (x and y) of the same count containing the
"center" value of each bin.
FYI - I have two pieces of data:
1 - x axis - time in days of the year from 0 to 363.98
2 - y axis -  power in kW 54.95 to 461.1

e.g. let's say I want 100 bins in each axis, what function(s) do I use.
Average the head and tail of each bin?
I'm sure there are several ways to do this.

After this I'll try to fill in the other sunflowerplot function inputs.
Thanks,
-Henry

--
View this message in context: 
http://r.789695.n4.nabble.com/sunflower-plot-making-vectors-tp4470486p4470486.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] 3D Black-Scholes Graph Help!

2012-03-13 Thread ilai
On Tue, Mar 13, 2012 at 3:34 PM, David Winsemius  wrote:
>
> When I got around to running it I was hampered by a lack of knowledge about
> what sort of data-object "price" might have been. I tried putting in a
> single number on the theory that it would satisfy the seq() call, and also
>  got the error you report. More input is needed from the OP about the
> problem specification, and hopefully she will provide a test dataset.

It is just a number. The OP function works fine, the error was
generated by wireframe because of the "partial" data passed to formula
- OptionPrice is a matrix, not a column in the grid data.frame which
only holds the scales.

Anna, Replace call to wireframe in your function with

wireframe(OptionPrice, main="3D Option", drape=T, col.regions=heat.colors(100),
  scales = list(arrows=F,
                x=list(at=1:length(t),labels=t),
                y=list(at=1:length(s),labels=s)))

Then:
plotbs(16)

Note, the first value is always NaN but that's from your calculations
- look at the OptionPrice matrix.

Cheers
Elai



> --
> David.
>
>
>>
>> Berend
>>
>
> David Winsemius, MD
> West Hartford, CT
>



Re: [R] multi-histogram plotting

2012-03-13 Thread David Winsemius


On Mar 13, 2012, at 5:33 PM, Sam Steingold wrote:


> I have a vector x:
> table(x)
>
>     2     3     4     5     6     7     8     9    10    11    12    13    14
> 45547 11835  4692  2241  1386   820   593   425   298   239   176   158   115
>    15    16    17    18    19    20    21    22    23    24    25    26    27
>    94    88    76    67    47    46    40    20    30    22    20    33    14
>    28    29    30    31    32    33    34    35    36    37    38    39    40
>    20    10    12    10    11     8     9     8     9     9     8     7     4
>    41    42    43    44    45    46    47    48    49    50    51    52    53
>     5     4     6     5     2     5     4     4     3     1     6     4     3
>    54    55    56    57    58    59    60    61    63    64    65    66    67
>     2     2     1     4     5     2     5     1     3     2     1     1     4
>    71    72    75    78    79    82    83    84    86    88    90    92    93
>     2     3     1     2     2     2     2     1     2     2     1     1     1
>    94    95    96    97    98    99   100   106   109   110   111   112   119
>     1     2     1     1     3     1     1     3     1     2     2     1     1
>   122   125   126   128   132   133   135   140   143   147   148   157   162
>     1     1     1     1     1     1     1     1     1     1     1     1     1
>   165   166   167   169   174   176   193   197   201   208   224   236   339
>     1     1     1     1     1     1     1     1     1     1     1     1     1
>   350   390   391   410   421   447   450   453   479   512   608   679   754
>     1     1     1     1     1     1     1     1     1     1     1     1     1
>   774   788   956   961  9597 14821
>     1     1     1     1     1     1
>
> I want to plot its histogram; moreover,
> I want to plot histograms of its several subsets on the same device
> (plot+lines+lines+... - right?)
> This means that I need a freq=FALSE plots.
> Now, a simple hist(x) produces a useless plot (vertical bar + horizontal
> bar, which could be inferred from the table above).
> So, I want
> 1. a logarithmic vertical scale
> 2. an "interesting" horizontal scale
>
> so I do
>
> h <- hist(x, freq=FALSE, breaks=c(1:20,25,30,40,60,100,10), plot=FALSE)
>
> now I plot h$density vs 1:length(h$mids) or against
> h$mids[1:(length(h$mids)-1)] if I want to be honest,
> but I still don't see how to effect a log vertical scale.
> I can, of course, plot log(h$density), but then the number labels
> will be wrong.


You could try apply a log transform to the appropriate component of  
the "h" object and using barplot to display the results.


--
David Winsemius, MD
West Hartford, CT



Re: [R] help please. 2 tables, which test?

2012-03-13 Thread Greg Snow
For this case I would use a permutation test.  Start by choosing some
statistic that represents your 4 students across the different grades,
some possibilities would be the sum of scores across grades and
students, or mean, or median, or ...

Compute the selected statistic for your 4 students and save that
value.  Now select 4 students at random and compute the same
statistic, repeat this a bunch of times (thousands) and compute the
statistic each time.  All those stats on the random selections
represent the distribution of the statistic under the null hypothesis
that your 4 students were randomly chosen (vs. chosen based on
something that is related to the grade).  Now you just compare the
stat on the original 4 students to the distribution (if you need a
specific p-value it is just the proportion of the random stats that
are as or more extreme as your original 4).
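
The procedure above can be sketched as follows. The grade matrix is simulated, the mean over all grades is just one of the statistics mentioned, and "as or more extreme" is taken here as two-sided distance from the center of the random-selection distribution; all three are assumptions for illustration.

```r
set.seed(123)

# Hypothetical class: 30 students x 50 grades
grades <- matrix(round(runif(30 * 50, 40, 100)), nrow = 30,
                 dimnames = list(paste("student", 1:30), paste0("grade", 1:50)))

chosen <- c(2, 4, 5, 30)                       # the starred students

stat <- function(rows) mean(grades[rows, ])    # statistic for a set of students
observed <- stat(chosen)

# Same statistic on many random sets of 4 students (null distribution)
n_rep <- 5000
perm <- replicate(n_rep, stat(sample(nrow(grades), length(chosen))))

# Proportion of random statistics at least as far from the center as observed
p_value <- mean(abs(perm - mean(perm)) >= abs(observed - mean(perm)))

print(c(observed = observed, p_value = p_value))
```

Swapping in sum or median only requires changing the stat function.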

On Sat, Mar 10, 2012 at 4:04 AM, aoife doherty  wrote:
> Thank you for the replies.
> So what my test wants to do is this:
>
> I have a big matrix, 30 rows (students in a class) X 50 columns (students
> grades for the year).
> An example of the matrix is as such:
>
>
>                     grade1       grade2        grade3     .  grade 50
> student 1
> student 2***
> student 3
> student 4***
> student 5***
> student 6
> .
> .
> .
> .
> .
> student 30***
>
> As you can see, four students (students 2,4,5 and 30) have stars beside
> their name. I have chosen these students based on a particular
> characteristic that they all share.I then pulled these students out to make
> a new table:
>
>                     grade1          grade2         grade3 ... grade 50
>
> student 2
> student 4
> student 5
> student 30
>
>
> and what i want to see is basically is there any difference between the
> grades this particular set of students(i.e. student 2,4,5 and 30) got, and
> the class as a whole?
>
> So my null hypothesis is that there is no difference between this set of
> students grades, and what you would expect from the class as a whole.
>
> Aaral
>
>
>
>
>
>
> On Sat, Mar 10, 2012 at 12:18 AM, Greg Snow <538...@gmail.com> wrote:
>>
>> Just what null hypothesis are you trying to test or what question are
>> you trying to answer by comparing 2 matrices of different size?
>>
>> I think you need to figure out what your real question is before
>> worrying about which test might work on it.
>>
>> Trying to get your data to fit a given test rather than finding the
>> appropriate test or other procedure to answer your question is like
>> buying a new suit then having plastic surgery to make you fit the suit
>> rather than having the tailor modify the suit to fit you.
>>
>> If you can give us more information about what your question is we
>> have a better chance of actually helping you.
>>
>> On Fri, Mar 9, 2012 at 9:46 AM, aoife doherty 
>> wrote:
>> >
>> > Thank you. Can the chi-squared test compare two matrices that are not
>> > the
>> > same size, eg if matrix 1 is a 2 X 4 table, and matrix 2 is a 3 X 5
>> > matrix?
>> >
>> >
>> >
>> > On Fri, Mar 9, 2012 at 4:37 PM, Greg Snow <538...@gmail.com> wrote:
>> >>
>> >> The chi-squared test is one option (and seems reasonable to me if it
>> >> the the proportions/patterns that you want to test).  One way to do
>> >> the test is to combine your 2 matrices into a 3 dimensional array (the
>> >> abind package may help here) and test using the loglin function.
>> >>
>> >> On Thu, Mar 8, 2012 at 5:46 AM, aaral singh 
>> >> wrote:
>> >> > Hi. Please help if someone can.
>> >> >
>> >> > Problem:
>> >> > I have 2 matrices
>> >> >
>> >> > Eg
>> >> >
>> >> > matrix 1:
>> >> >                Freq  None  Some
>> >> >  Heavy    3        2          5
>> >> >  Never    8       13         8
>> >> >  Occas    1        4          4
>> >> >  Regul     9        5         7
>> >> >
>> >> > matrix 2:
>> >> >                  Freq     None     Some
>> >> >  Heavy        7          1             3
>> >> >  Never      87         18          84
>> >> >  Occas      12           3            4
>> >> >  Regul        9            1            7
>> >> >
>> >> >
>> >> > I want to see if matrix 1 is significantly different from matrix 2. I
>> >> > am considering using a chi-squared test. Is this appropriate?
>> >> > Could anyone advise?
>> >> > Many thanks.
>> >> > Aaral Singh
>> >> >
>> >> > --
>> >> > View this message in context:
>> >> >
>> >> > http://r.789695.n4.nabble.com/help-please-2-tables-which-test-tp4456312p4456312.html
>> >> > Sent from the R help mailing list archive at Nabble.com.
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Gregory (Greg) L. Snow Ph.D.
>> >> 538...@gmail.com
>> >>
>> >> 
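
A minimal sketch of the suggestion quoted above (stack the two tables into a 3-dimensional array and test with loglin), using the counts from the original question. Base R's array() stands in for the abind package here, and the dimension names Row, Col and Table are made up for illustration. The model fitted says that the row-by-column pattern does not depend on which table a count came from. Several cells are small, so the chi-squared approximation may be rough.

```r
rows <- c("Heavy", "Never", "Occas", "Regul")
cols <- c("Freq", "None", "Some")

# Counts copied from the two tables in the question (filled row by row).
m1 <- matrix(c(3, 2, 5,   8, 13, 8,   1, 4, 4,   9, 5, 7),
             nrow = 4, byrow = TRUE, dimnames = list(Row = rows, Col = cols))
m2 <- matrix(c(7, 1, 3,   87, 18, 84,   12, 3, 4,   9, 1, 7),
             nrow = 4, byrow = TRUE, dimnames = list(Row = rows, Col = cols))

# Stack into a 4 x 3 x 2 array; the third dimension says which table.
arr <- array(c(m1, m2), dim = c(4, 3, 2),
             dimnames = c(dimnames(m1), list(Table = c("matrix1", "matrix2"))))

# Fit the model [Row:Col][Table]: the Row-by-Col pattern is the same in both tables.
fit <- loglin(arr, margin = list(c(1, 2), 3), print = FALSE)

# Likelihood-ratio test of that model against the saturated model.
p_value <- pchisq(fit$lrt, df = fit$df, lower.tail = FALSE)
p_value
```

A small p-value would suggest that the two tables do not share the same pattern of proportions.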

Re: [R] 3D Black-Scholes Graph Help!

2012-03-13 Thread David Winsemius


On Mar 13, 2012, at 4:49 PM, Berend Hasselman wrote:



On 13-03-2012, at 21:40, David Winsemius wrote:



On Mar 13, 2012, at 4:24 PM, Anna Dunietz wrote:


Hello all!

I would like to create a 3d plot, with the option price explained  
by the underlying price and time.  Unfortunately, I can't quite  
get it to work.  I would very much appreciate your help!




The usual problem with lattice calls that you "can't quite get ...  
to work" is failure to read the R-FAQ ( or the help(lattice) page)  
where it is explained that you need to print() the function result.  
At the moment my R-machine is tied up with a long process but try  
wrapping print() around that wireframe call.



There is an error message in the actual plot.

Error using packet 1.
NAs not allowed in subscripted assignments.


When I got around to running it I was hampered by a lack of knowledge  
about what sort of data-object "price" might have been. I tried  
putting in a single number on the theory that it would satisfy the  
seq() call, and also got the error you report. More input is needed  
from the OP about the problem specification, and hopefully she will  
provide a test dataset.


--
David.




Berend



David Winsemius, MD
West Hartford, CT



[R] multi-histogram plotting

2012-03-13 Thread Sam Steingold
I have a vector x:
table(x)

     2     3     4     5     6     7     8     9    10    11    12    13    14
 45547 11835  4692  2241  1386   820   593   425   298   239   176   158   115
    15    16    17    18    19    20    21    22    23    24    25    26    27
    94    88    76    67    47    46    40    20    30    22    20    33    14
    28    29    30    31    32    33    34    35    36    37    38    39    40
    20    10    12    10    11     8     9     8     9     9     8     7     4
    41    42    43    44    45    46    47    48    49    50    51    52    53
     5     4     6     5     2     5     4     4     3     1     6     4     3
    54    55    56    57    58    59    60    61    63    64    65    66    67
     2     2     1     4     5     2     5     1     3     2     1     1     4
    71    72    75    78    79    82    83    84    86    88    90    92    93
     2     3     1     2     2     2     2     1     2     2     1     1     1
    94    95    96    97    98    99   100   106   109   110   111   112   119
     1     2     1     1     3     1     1     3     1     2     2     1     1
   122   125   126   128   132   133   135   140   143   147   148   157   162
     1     1     1     1     1     1     1     1     1     1     1     1     1
   165   166   167   169   174   176   193   197   201   208   224   236   339
     1     1     1     1     1     1     1     1     1     1     1     1     1
   350   390   391   410   421   447   450   453   479   512   608   679   754
     1     1     1     1     1     1     1     1     1     1     1     1     1
   774   788   956   961  9597 14821
     1     1     1     1     1     1

I want to plot its histogram; moreover,
I want to plot histograms of its several subsets on the same device
(plot+lines+lines+... - right?)
This means that I need a freq=FALSE plots.
Now, a simple hist(x) produces a useless plot (vertical bar + horizontal
bar, which could be inferred from the table above).
So, I want
1. a logarithmic vertical scale
2. an "interesting" horizontal scale

so I do

h <- hist(x, freq=FALSE, breaks=c(1:20,25,30,40,60,100,10), plot=FALSE)

now I plot h$density vs 1:length(h$mids) or against
h$mids[1:(length(h$mids)-1)] if I want to be honest,
but I still don't see how to effect a log vertical scale.
I can, of course, plot log(h$density), but then the number labels will be wrong.

thanks!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000
http://www.childpsy.net/ http://iris.org.il http://www.memritv.org
http://dhimmi.com http://www.PetitionOnline.com/tap12009/
Procrastinate later.



Re: [R] Interface or Select menu

2012-03-13 Thread Marcio Pupin Mello

For Windows you can use the winMenuAdd function. Type ?winMenuAdd to see how...
Good luck,

Marcio
www.dsr.inpe.br/~mello


On 3/6/08 10:35 AM, Alberto Monteiro wrote:


er MIMI&  piki PIKINHA wrote:


Hello, I'm a Spanish student, and I'm doing the final project of my
computer science degree. I'm working in R and I need to create an interface
which allows me to select different execution options (similar to a menu).
Is it possible to create this menu, or interface, with R? Or do I have to
create this interface with another language?

Thank you very much, and I hope that you understand my english.


You mean like a GUI? There are many GUI packages in R; probably
the simplest is the tcltk package.

Alberto Monteiro






Re: [R] MANOVA and Extra Sums-of-Squares Tests

2012-03-13 Thread John Fox
Dear chris33,

You can use the anova() function to compare the two multivariate linear models. 
Alternatively, the Anova() function in the car package will compute "type II" 
or "type III" MANOVA tests, which aren't quite what you're asking about.
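
A minimal sketch of the anova() comparison, using the built-in iris data as a stand-in (the responses and predictors are chosen only for illustration, not taken from the original question). It also shows one way to get the residual sums-of-squares and cross-products matrix asked about, via crossprod() on the residuals.

```r
# Full model: two responses, main effects plus their first-order interaction.
full <- lm(cbind(Sepal.Length, Sepal.Width) ~ Species * Petal.Length,
           data = iris)

# Reduced model: same responses, interaction dropped.
reduced <- lm(cbind(Sepal.Length, Sepal.Width) ~ Species + Petal.Length,
              data = iris)

# Multivariate test of the dropped terms (Pillai's trace by default;
# see ?anova.mlm for the 'test' argument).
cmp <- anova(full, reduced)
print(cmp)

# Residual sums-of-squares and cross-products matrix of the full model.
sscp_resid <- crossprod(residuals(full))
sscp_resid
```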

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Tue, 13 Mar 2012 12:31:25 -0700 (PDT)
 chris33  wrote:
> I would like to conduct an extra sum-of-squares test that compares a full
> MANOVA model (with all 1st order interactions) to a reduced model (no
> interactions) to determine if I can drop all interactions at the same time. 
> This is analogous to an extra sum-of-squares F-test in ANOVA, but instead
> using MANOVA.  Is there a command in R that does this?  If not, is there a
> command that calculates residual sum-of-squares and cross-product matrices? 
> Thanks.
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/MANOVA-and-Extra-Sums-of-Squares-Tests-tp4470077p4470077.html
> Sent from the R help mailing list archive at Nabble.com.
> 



[R] Rolling regressions with sample extended one period at a time

2012-03-13 Thread pie'
Hi,

I would like suggestions as to how to perform rolling regressions with the
window extended one period at a time. That is, an initial sample period is
passed to estimation and that very sample is then extended one period at a
time through the remaining sample. Is there a specific package?

Thanks.

P.

--
View this message in context: 
http://r.789695.n4.nabble.com/Rolling-regressions-with-sample-extended-one-period-at-a-time-tp4470316p4470316.html
Sent from the R help mailing list archive at Nabble.com.


[R] Writing a .pdf file within a function - what do I need to return()?

2012-03-13 Thread Dgnn
I am trying to write a function that generates one PDF containing plots from
several .csv files within a directory.  When I manually execute the code it
seems to work, but not when it is a function. I think I need to return()
something, but haven't had much luck figuring out what/how.

plot.isi<-function(csv.path="~/project/csv by cell") { 
csv.files<-grep('.csv', list.files(path = csv.path, full.names=T), 
value=T)
pdf(file='plots/isi plots.pdf', width=10, height=8)
#par(mfrow=c(2,1)) #ideally 2 plots per page, but will work on details after fx. works
for (i in 1:length(csv.files)){
raw.df<-read.csv(csv.files[i])
names(raw.df)<-c('t','isi','logic','cond')
xyplot(isi ~ t, raw.df, ylim=c(0,1500), ylab='isi', 
xlab='time', 
main=basename(csv.files[i]))
}
dev.off()
}

Thank you all for the help,

Jason Deignan



--
View this message in context: 
http://r.789695.n4.nabble.com/Writing-a-pdf-file-within-a-function-what-do-I-need-to-return-tp4470165p4470165.html
Sent from the R help mailing list archive at Nabble.com.



[R] MANOVA and Extra Sums-of-Squares Tests

2012-03-13 Thread chris33
I would like to conduct an extra sum-of-squares test that compares a full
MANOVA model (with all 1st order interactions) to a reduced model (no
interactions) to determine if I can drop all interactions at the same time. 
This is analogous to an extra sum-of-squares F-test in ANOVA, but instead
using MANOVA.  Is there a command in R that does this?  If not, is there a
command that calculates residual sum-of-squares and cross-product matrices? 
Thanks.

--
View this message in context: 
http://r.789695.n4.nabble.com/MANOVA-and-Extra-Sums-of-Squares-Tests-tp4470077p4470077.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] 3D Black-Scholes Graph Help!

2012-03-13 Thread Berend Hasselman

On 13-03-2012, at 21:40, David Winsemius wrote:

> 
> On Mar 13, 2012, at 4:24 PM, Anna Dunietz wrote:
> 
>> Hello all!
>> 
>> I would like to create a 3d plot, with the option price explained by the 
>> underlying price and time.  Unfortunately, I can't quite get it to work.  I 
>> would very much appreciate your help!
>> 
> 
> The usual problem with lattice calls that you "can't quite get ... to work" 
> is failure to read the R-FAQ ( or the help(lattice) page) where it is 
> explained that you need to print() the function result. At the moment my 
> R-machine is tied up with a long process but try wrapping print() around that 
> wireframe call.
> 
There is an error message in the actual plot.

Error using packet 1.
NAs not allowed in subscripted assignments.

Berend



Re: [R] 3D Black-Scholes Graph Help!

2012-03-13 Thread David Winsemius


On Mar 13, 2012, at 4:24 PM, Anna Dunietz wrote:


Hello all!

I would like to create a 3d plot, with the option price explained by  
the underlying price and time.  Unfortunately, I can't quite get it  
to work.  I would very much appreciate your help!




The usual problem with lattice calls that you "can't quite get ... to  
work" is failure to read the R-FAQ ( or the help(lattice) page) where  
it is explained that you need to print() the function result. At the  
moment my R-machine is tied up with a long process but try wrapping  
print() around that wireframe call.
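
To illustrate the point about print(), here is a minimal sketch. The helper name plot_to_pdf and the toy data are made up for this example. Lattice functions return an object rather than drawing, and that object is only printed automatically at top level, so inside a function or a for loop it has to be printed explicitly.

```r
library(lattice)

# Hypothetical helper: writes one lattice plot per data frame into a single PDF.
plot_to_pdf <- function(data_list, file) {
  pdf(file, width = 10, height = 8)
  on.exit(dev.off())          # close the device even if a plot fails
  for (d in data_list) {
    p <- xyplot(y ~ x, data = d)
    print(p)                  # without print(), nothing is drawn here
  }
  invisible(file)
}

# Toy data, just to have something to plot.
toy <- list(data.frame(x = 1:10, y = (1:10)^2),
            data.frame(x = 1:10, y = sqrt(1:10)))
out_file <- file.path(tempdir(), "lattice-demo.pdf")
plot_to_pdf(toy, out_file)
```

The same applies to wireframe(): wrapping the call in print() is what actually sends the plot to the device.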


--
David.


Thanks,
Anna

# Black-Scholes Option Graph
library(lattice)


blackscholes <- function(s, k, r=.1, t=5, sigma=.9,call=TRUE) {
   #calculate call/put option
   d1 <- (log(s/k)+(r+sigma^2/2)*t)/(sigma*sqrt(t))
   d2 <- d1 - sigma * sqrt(t)
   ifelse(call==TRUE,s*pnorm(d1) - k*exp(-r*t)*pnorm(d2),k*exp(-r*t) * pnorm(-d2) - s*pnorm(-d1))

   }

plotbs <- function(price){
   #create
   s<-seq(0,price,len=price/2)
   k<-s

   t<-0:5
   sigma<-seq(0,0.9,by=.1)

   #expand information
   OptionPrice<-matrix(nrow=length(s),ncol=length(k))
   for(i in 1:length(s)) OptionPrice[i,]<-mapply(blackscholes,s[i],k)
   grid <- expand.grid(list(Time=t, UnderlyingPrice=s))

   #plot
   wireframe(OptionPrice~Time*UnderlyingPrice,data=grid,main="3D Option",
 drape=T,col.regions=heat.colors(100),scales = list(arrows=FALSE),)

   }



David Winsemius, MD
West Hartford, CT



[R] 3D Black-Scholes Graph Help!

2012-03-13 Thread Anna Dunietz

Hello all!

I would like to create a 3d plot, with the option price explained by  
the underlying price and time.  Unfortunately, I can't quite get it to  
work.  I would very much appreciate your help!


Thanks,
Anna

# Black-Scholes Option Graph
library(lattice)


blackscholes <- function(s, k, r=.1, t=5, sigma=.9,call=TRUE) {
#calculate call/put option
d1 <- (log(s/k)+(r+sigma^2/2)*t)/(sigma*sqrt(t))
d2 <- d1 - sigma * sqrt(t)
ifelse(call==TRUE,s*pnorm(d1) - k*exp(-r*t)*pnorm(d2),k*exp(-r*t) * pnorm(-d2) - s*pnorm(-d1))

}

plotbs <- function(price){
#create
s<-seq(0,price,len=price/2)
k<-s

t<-0:5
sigma<-seq(0,0.9,by=.1)

#expand information
OptionPrice<-matrix(nrow=length(s),ncol=length(k))
for(i in 1:length(s)) OptionPrice[i,]<-mapply(blackscholes,s[i],k)
grid <- expand.grid(list(Time=t, UnderlyingPrice=s))

#plot
wireframe(OptionPrice~Time*UnderlyingPrice,data=grid,main="3D Option",
  drape=T,col.regions=heat.colors(100),scales = list(arrows=FALSE),)

}



Re: [R] beginner's loop issue

2012-03-13 Thread Paul Johnson
On Tue, Mar 13, 2012 at 11:27 AM, aledanda  wrote:
> Dear All,
>
> I hope you don't mind helping me with this small issue. I haven't been using
> R in years and I'm trying to fill in a matrix
> with the output of a function (I'm probably using the Matlab logic here and
> it's not working).
> Here is my code:
>
> for (i in 1:length(input)){
>  out[i,1:3] <- MyFunction(input[i,1],input[i,2], input[i,3])
>    out[i,4:6] <- MyFunction(input[i,5],input[i,7], input[i,6])
>      out[i,7:9] <- MyFunction(input[i,8],input[i,10], input[i,9])
> }
>
> 'input' is a matrix
>> dim(input)
> [1] 46 10
>
> and each row corresponds to a different subject.
> The error I get here is
>
> /Error in out[i, 1:3] <- get.vaTer(input[i, 2], input[i, 4], input[i, 3],  :
>  object 'out' not found/

out has to exist first, as previous commenter said.

Furthermore, suggestions:

Consider making MyFunction accept a vector of 3 arguments, rather than
separate arguments.

Consider making out 3 columns, as in

out <- matrix(0, nrow=N, ncol=3)
for(i ...){
out[i,1:3] <- MyFunction(input[i,1:3])
out[i,1:3] <- MyFunction(input[i,4:6])
out[i,1:3] <- MyFunction(input[i,7:9])
}

If you could re-shape your input "thing" as a list with one element
that needs to go into MyFunction, this could get easier still:

lapply(input, MyFunction)

 or if input were an array with 3 columns, you could revise MyFunction
to accept a 3-vector.

apply(input, 1, MyFunction)

Hardly ever in R does one need to specify inputs as you have done in
your example.
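
A runnable sketch of the suggestions above. MyFunction here is a made-up stand-in that takes a 3-vector and returns a 3-vector, and the input matrix is random data with the 46 x 10 shape from the question; the column groupings are copied from the original loop. Note that length(input) on a matrix is the number of cells, not rows, so the loop runs over nrow(input) instead.

```r
# Hypothetical stand-in for the real function: 3 numbers in, 3 numbers out.
MyFunction <- function(v) c(sum(v), mean(v), max(v))

set.seed(1)
input <- matrix(rnorm(46 * 10), nrow = 46, ncol = 10)

# 'out' must exist before it can be filled; preallocate it.
out <- matrix(NA_real_, nrow = nrow(input), ncol = 9)
for (i in seq_len(nrow(input))) {
  out[i, 1:3] <- MyFunction(input[i, 1:3])
  out[i, 4:6] <- MyFunction(input[i, c(5, 7, 6)])
  out[i, 7:9] <- MyFunction(input[i, c(8, 10, 9)])
}

# Same result without an explicit loop. apply() over rows returns one
# column per row of the input, hence the t().
out2 <- cbind(t(apply(input[, 1:3],         1, MyFunction)),
              t(apply(input[, c(5, 7, 6)],  1, MyFunction)),
              t(apply(input[, c(8, 10, 9)], 1, MyFunction)))
```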
pj



-- 
Paul E. Johnson
Professor, Political Science    Assoc. Director
1541 Lilac Lane, Room 504     Center for Research Methods
University of Kansas               University of Kansas
http://pj.freefaculty.org            http://quant.ku.edu



Re: [R] size of graphs when using multiple figures by row

2012-03-13 Thread Jean V Adams
Personally, I find it easier to fix the overall size of the "page" and 
modify the margins in line size (mar and oma rather than mai and omi), 
until I get the plots the way I want them.  You don't specify what OS 
you're using, but in windows, I would use something like

windows(h=16, w=9)
par(mfrow=c(4, 2), oma=c(2, 2, 2, 2), mar=c(2, 2, 2, 2))

I'm not sure why you're writing
mtext("testen")
for each plot when you already specify
main="test"
in the hist() function.
Did you want an overall title on the page?
If so, you could try ...

hist(islands, freq=FALSE,col="blue",main="test 1") 
hist(islands, freq=FALSE,col="blue",main="test 2") 
hist(islands, freq=FALSE,col="blue",main="test 3") 
hist(islands, freq=FALSE,col="blue",main="test 4") 
hist(islands, freq=FALSE,col="blue",main="test 5") 
hist(islands, freq=FALSE,col="blue",main="test 6") 
hist(islands, freq=FALSE,col="blue",main="test 7") 
hist(islands, freq=FALSE,col="blue",main="test 8") 
mtext("testen", outer=TRUE)
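
If a specific plot size matters, one possible approach (a sketch, not the only way) is to work backwards: pick the plot-region size and the margins in inches, then compute the device size from them. The 2.3 inch target comes from the question below; the margin values and the use of a pdf() device are assumptions made for illustration.

```r
plot_in <- 2.3                   # wanted plot-region width and height, inches
mai <- c(0.6, 0.5, 0.4, 0.2)     # figure margins in inches: bottom, left, top, right
omi <- c(0.2, 0.2, 0.4, 0.2)     # outer margins in inches

# Device size = panels * (plot region + figure margins) + outer margins.
w <- 2 * (plot_in + mai[2] + mai[4]) + omi[2] + omi[4]
h <- 4 * (plot_in + mai[1] + mai[3]) + omi[1] + omi[3]

f <- file.path(tempdir(), "hist-grid.pdf")
pdf(f, width = w, height = h)
par(mfrow = c(4, 2))
par(mai = mai, omi = omi)
for (i in 1:8) {
  hist(islands, freq = FALSE, col = "blue", main = paste("test", i))
}
pin <- par("pin")                # plot-region size actually used, in inches
mtext("testen", outer = TRUE)
dev.off()
pin
```

Checking par("pin") after plotting is a quick way to confirm what size the plot region ended up with.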

Hope this helps.

Jean



Nerak wrote on 03/13/2012 04:24:42 AM:

> Hi all,
> I have a basic question concerning graphs in R. I'm using the par()
> function and I'm working with multiple figures by row (mfrow), but the
> height of my figures becomes compressed. I have 4 rows and 2 columns
> because I want to plot 8 histograms (freq = FALSE) on one page. I know I
> can adapt my margins with, for example, "oma" and "mai", but I don't know
> how to choose the size of the figure. I read something about pin, which is
> described as "the current plot dimensions, (width, height), in inches",
> but I don't know which values I should use, because I always get the
> message "plot region too large".
> 
> I would like to have, for example, figures of 5.8 by 5.8 cm (2.3 by 2.3
> inches, I think), but I don't know how to specify this. I'm getting lost
> with the par() function.
> 
> It is hard to give my data online for a reproducible example, but with, for
> example, this, I have the same problems:
> 
> par(mfrow=c(4,2),oma=c(2,2,2,2), mai=c(0.6, 0.6, 0.6, 0.6))
> hist(islands, freq=FALSE,col="blue",main="test")
> mtext("testen")
> hist(islands, freq=FALSE,col="blue",main="test")
> mtext("testen")
> hist(islands, freq=FALSE,col="blue",main="test")
> mtext("testen")
> hist(islands, freq=FALSE,col="blue",main="test")
> mtext("testen")
> hist(islands, freq=FALSE,col="blue",main="test")
> mtext("testen")
> hist(islands, freq=FALSE,col="blue",main="test")
> mtext("testen")
> hist(islands, freq=FALSE,col="blue",main="test")
> mtext("testen")
> hist(islands, freq=FALSE,col="blue",main="test")
> mtext("testen") 
> 
> (normally, I specify my breaks, ylab and xlab, xlim and y lim)
> 
> I hope someone can help me.
> Many thanks,
> Nerak


Re: [R] simulate an gradually increase of the number of subjects based on two variables

2012-03-13 Thread Paul Johnson
Suggestion below:

On Tue, Mar 13, 2012 at 1:24 PM, guillaume chaumet
 wrote:
> I forgot to mention that I have already tried to generate data based on the mean and
> sd of two variables.
>
> x=rnorm(20,1,5)+1:20
>
> y=rnorm(20,1,7)+41:60
>
> simu<-function(x,y,n) {
>    simu=vector("list",length=n)
>
>    for(i in 1:n) {
>        x=c(x,rnorm(1,mean(x),sd(x)))
>        y=c(y,rnorm(1,mean(y),sd(y)))
>        simu[[i]]$x<-x
>        simu[[i]]$y<-y
>
>
>    }
>
>    return(simu)
> }
>
> test=simu(x,y,60)
> lapply(test, function(x) cor.test(x$x,x$y))
>
> As you could see, the correlation is disappearing with increasing N.
> Perhaps, a bootstrap with lm or cor.test could solve my problem.
>

In this case, you should consider creating the LARGEST sample first,
and then remove cases to create the smaller samples.

The problem now is that you are drawing a completely fresh sample
every time, so you are getting not only the effect of sample size, but
also that extra randomness when case 1 is replaced every time.

 I am fairly confident (80%)  that if you approach it my way, the
mystery you see will start to clarify itself.  That is, draw the big
sample with the desired characteristic, and once you understand the
sampling distribution of cor for that big sample,  you will also
understand what happens when each large sample is reduced  by a few
cases.

BTW, if you were doing this on a truly massive scale, my way would run
much faster.  Allocate memory once, then don't need to manually delete
lines, just trim down the index on the rows.  (Same data access
concept as bootstrap).
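
A minimal sketch of what I mean, under made-up settings: the true correlation (0.5), the largest sample size (80) and the smallest (20) are all assumptions chosen for illustration. The big sample is drawn once, and the smaller samples are just its first n rows.

```r
set.seed(42)
n_max <- 80      # largest sample, drawn once
rho   <- 0.5     # assumed true correlation between x and y

# Draw the big sample with the desired correlation.
x <- rnorm(n_max)
y <- rho * x + sqrt(1 - rho^2) * rnorm(n_max)

# Smaller samples are nested subsets: only the row index changes.
sizes <- 20:n_max
cors <- sapply(sizes, function(n) cor(x[1:n], y[1:n]))

# How the estimate settles down as cases are added.
plot(sizes, cors, type = "l", xlab = "sample size", ylab = "cor(x, y)")
abline(h = rho, lty = 2)
```

Because each smaller sample is contained in the next one, the curve changes only through the cases that are added, not through a fresh draw of everything.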

pj



-- 
Paul E. Johnson
Professor, Political Science    Assoc. Director
1541 Lilac Lane, Room 504     Center for Research Methods
University of Kansas               University of Kansas
http://pj.freefaculty.org            http://quant.ku.edu



Re: [R] Using caegorical variables in package randomForest.

2012-03-13 Thread Liaw, Andy
The way to represent categorical variables is with factors.  See ?factor.  
randomForest() will handle factors appropriately, as most modeling functions in 
R.
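
A small sketch of the conversion, with a made-up data frame (the column names sex, region, age and target are hypothetical). Columns coded 0/1 or 1..k stay numeric until they are explicitly turned into factors.

```r
# Toy data: 'sex' and 'region' are categories stored as numbers.
d <- data.frame(sex    = c(0, 1, 1, 0, 1, 0, 1, 0),
                region = c(3, 1, 2, 3, 2, 1, 1, 2),
                age    = c(34, 51, 27, 45, 62, 38, 29, 55),
                target = c(0, 1, 1, 0, 1, 0, 0, 1))

# Convert the categorical columns (and the class label) to factors.
cat_cols <- c("sex", "region", "target")
d[cat_cols] <- lapply(d[cat_cols], factor)

str(d)   # sex, region and target now show as Factor; age stays num

# With the randomForest package installed, the model call would then be
# something like:
#   fit <- randomForest::randomForest(target ~ ., data = d)
```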

Andy 

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of abhishek
> Sent: Tuesday, March 13, 2012 8:11 AM
> To: r-help@r-project.org
> Subject: [R] Using caegorical variables in package randomForest.
> 
> Hello,
> 
> I am sorry if there are already posts that answer this question, but I
> tried to find them before making this post and did not find any relevant
> ones.
> 
> I am using the randomForest package for building a two-class classifier.
> There are categorical variables and numerical variables in my data.
> Different categorical variables have different numbers of categories, from
> 2 to 10. I am not sure how to represent the categorical data.
> For example, I am using 0 and 1 for variables that have only two
> categories, but I suspect the program is analysing the values as numerical.
> Do you have any idea how I can use the categorical variables for building
> a two-class classifier? I am using a factor consisting of 0 and 1 for the
> classification target.
> 
> Thank you for your ideas.
> 
> -
> abhishek
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Using-caegorical-variables-in-package-randomForest-tp4468923p4468923.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> 


Re: [R] reshaping a dataset for a network

2012-03-13 Thread William Dunlap
Is the following what you want?
  > a <- c(1,2,3,4,4,4,5,5)
  > b <- c(11,7,4,9,8,3,12,4)
  > split(b, a)
  $1
  [1] 11  

  $2
  [1] 7

  $3
  [1] 4

  $4
  [1] 9 8 3 

  $5
  [1] 12  4

Note that your df<-cbind(a,b) produces a matrix, not the data.frame
that your df suggests you want.  Use df<-data.frame(a,b) to make
a data.frame.  Then you could do with(df, split(b,a)) to operate on
the a and b in the data.frame df.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf
> Of Marco Guerzoni
> Sent: Tuesday, March 13, 2012 10:51 AM
> To: r-help@r-project.org
> Subject: [R] reshaping a dataset for a network
> 
> dear all,
> apologies for bothering you with a probably stupid question, but I really
> don't know how to proceed.
> 
> I have a dataset which looks like df
> 
> a <- c(1,2,3,4,4,4,5,5)
> b <- c(11,7,4,9,8,3,12,4)
> df <-cbind(a,b)
> 
> I would like to have one which looks like this:
> 
> a
> 1 11
> 2 7
> 3 4
> 4 9 8 3
> 5 12 4
> 
> a are the vertices of a network, b the edges. In the data the length of a is
> about 5
> 
> I read several posts about reshape, reshape2, split, ldply but I
> couldn't manage to do it. The problem seems to be that this is not a real
> panel.
> 
> Any help would be really appreciated,
> my best regards
> Marco
> 



[R] filter out some gene sets

2012-03-13 Thread elodie
I am working on the following database of gene sets.


database<-GSA.read.gmt( "C:/c5.all.v2.5.symbols.gmt")


I need to filter out the gene sets that contain less than 5 genes or 
more than 200 genes. I would appreciate help with that matter.

Thanks in advance




[R] reshaping a dataset for a network

2012-03-13 Thread Marco Guerzoni

dear all,
apologies for bothering you with a probably stupid question, but I really 
don't know how to proceed.


I have a dataset which looks like df

a <- c(1,2,3,4,4,4,5,5)
b <- c(11,7,4,9,8,3,12,4)
df <-cbind(a,b)

I would like to have one which looks like this:

a
1 11
2 7
3 4
4 9 8 3
5 12 4

a are the vertices of a network, b the edges. In the data the length of a is 
about 5


I read several posts about reshape, reshape2, split, ldply but I 
couldn't manage to do it. The problem seems to be that this is not a real 
panel.


Any help would be really appreciated,
my best regards
Marco



Re: [R] p-value of the pooled Z score

2012-03-13 Thread Thomas Lumley
On Wed, Mar 14, 2012 at 3:41 AM, cheba meier  wrote:
> Hello,
>
> I have to compute the pooled z-value and I would like to know which way is
> more appropriate
>
>
> b <- c( -0.205,1.040,0.087)
> s <- c(0.449,0.167,0.241)
> n <- c(310, 342, 348)
> z <- b/s
>
> Z <- sum(z)/sqrt(length(n))
> P <- 2*(1-pnorm(abs(Z)))
> P
>
> w <- sqrt(n)
> Zw <- sum(w * z)/sqrt(sum(w^2))
> Pw <- 1 - pchisq(Zw * Zw, 1)
> Pw
>

A.  Both give a valid test, neither test dominates the other, so in an
abstract statistical sense both are equally correct

B.  But you probably want the weighted z, because that is more
commonly used.  It's more commonly used because it has better power
when the difference being tested is approximately the same in each of
the component z scores

C. But in that case you would probably be better off with the
precision-weighted test, using the standard error information in
weighting.

  precision <- 1/(s*s)
  m.ave <- sum(b * precision)/sum(precision)
  prec.ave <- sum(precision)
  chisq.ave <- m.ave*m.ave*prec.ave
  1 - pchisq(chisq.ave,1)

D.  But in this example the sample sizes and standard errors are
sufficiently similar that it will not make much difference.
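
For comparison, here are the three tests side by side as a runnable sketch. It takes the estimates to be the vector b and the standard errors to be s from the question (that reading is an assumption), and uses two-sided p-values throughout.

```r
b <- c(-0.205, 1.040, 0.087)   # estimates
s <- c(0.449, 0.167, 0.241)    # standard errors
n <- c(310, 342, 348)          # sample sizes
z <- b / s

# A. Unweighted pooled z.
Z_unw <- sum(z) / sqrt(length(z))
P_unw <- 2 * pnorm(-abs(Z_unw))

# B. Pooled z weighted by sqrt(sample size).
w   <- sqrt(n)
Z_w <- sum(w * z) / sqrt(sum(w^2))
P_w <- 2 * pnorm(-abs(Z_w))

# C. Precision-weighted average of the estimates, tested against 0.
precision <- 1 / s^2
b_ave     <- sum(b * precision) / sum(precision)
chisq_ave <- b_ave^2 * sum(precision)
P_prec    <- pchisq(chisq_ave, df = 1, lower.tail = FALSE)

c(unweighted = P_unw, weighted = P_w, precision = P_prec)
```

The normal and chi-squared forms of the p-value are interchangeable here: 2*pnorm(-abs(Z)) equals 1 - pchisq(Z^2, 1).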


   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland



Re: [R] Idea/package to "linearize a curve" along the diagonal?

2012-03-13 Thread Emmanuel Levy
Dear David and Jeff,

> Only if you were going apply some sort of transformation that did not extend 
> globally

Exactly, this is why the LPCM package is great, as it assigns points
to parts of a curve.

I think I pretty much got what I need - it is not perfect yet but it
should be enough to give you an idea of what I was trying to achieve.

All the best,

Emmanuel


### Example ###
library(LPCM)
tmp=rnorm(2000)
X.1 = 5+tmp
Y.1 = 5+ (5*tmp+rnorm(2000))
tmp=rnorm(1000)
X.2 = 9+tmp
Y.2 = 40+ (1.5*tmp+rnorm(1000))
X.3 = 7+ 0.5*runif(500)
Y.3 = 15+20*runif(500)
Y = c(X.1,X.2,X.3)
X = c(Y.1,Y.2,Y.3)

lpc1 = lpc(cbind(X,Y), scaled=FALSE, h=c(1,1) , control=lpc.control(
boundary=1))
my.proj = lpc.spline(lpc1, project=TRUE, optimize=TRUE)

data = cbind( dist= my.proj$closest.pi, X1=lpc1$data[,1],
Y1=lpc1$data[,2], Xo=my.proj$closest.coords[,1],
Yo=my.proj$closest.coord[,2])
transfoData = matrix(apply(data, 1, function(x) { return( transfo(
(5+x[1])/10,x[2],x[3],x[4],x[5]))}), ncol=2, byrow=TRUE)

plot(transfoData)  ## This shows the result I'm looking for, not
                   ## perfect yet but it gives an idea.

###
### Moves a point from its position to the new "normalized" position
###
transfo = function(dist, X1, Y1, X0, Y0) {
   # First, the point needs to be rotated
   trans=newCoord(X1, Y1, X0, Y0) ;
   Xnew=X1+trans[1]
   Ynew=Y1+trans[2]

   # second it is taken on the diagonal.
   Xfinal=dist
   Yfinal=dist
   X.TransToDiag = Xfinal-X0
   Y.TransToDiag = Yfinal-Y0
   return( c(Xnew+X.TransToDiag, Ynew+Y.TransToDiag))
}

## Rotates a point X1,Y1 relative to Xo,Yo
## The new point is either at 3pi/4 or 7pi/4 i.e., 90 degrees left or
## right of the diagonal.
##
newCoord = function(X1,Y1, Xo=0, Yo=0){

   # First calculates the coordinates of the point relative to Xo,Yo
   Xr = X1-Xo
   Yr = Y1-Yo

   # Now calculates the new coordinates,
   # i.e.,
   # if V is the vector defined from Xo,Yo to X1,Y1,
   # the new coordinates are such that Xf, Yf are at angle TETA
   # by default TETA=3*pi/4 or 135 degrees
   To = atan2(Yr,Xr)

   # XXX This is not perfect but will do the job for now
   if(Yr > Xr){
   TETA=3*pi/4
   } else {
   TETA=7*pi/4
   }
   Xn = Xr * (cos(TETA)/cos(To))
   Yn = Yr * (sin(TETA)/sin(To))

   # Xn, Yn are the new coordinates relative to Xo, Yo
   # However for the translation I need absolute coordinates
   # These are given by Xo + Xn and Yo + Yn
   Xabs = Xo+Xn
   Yabs = Yo+Yn

   ## the translations that need to be applied to X1 and Y1 are thus:
   Xtrans = Xabs-X1
   Ytrans = Yabs-Y1
   return(c(Xtrans,Ytrans))
}
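
A side note on newCoord(): because Xr = r*cos(To) and Yr = r*sin(To), where r
is the distance from (Xo,Yo), the two ratio expressions reduce to r*cos(TETA)
and r*sin(TETA). The sketch below uses that form, which also avoids NaN when
cos(To) or sin(To) is zero. The name rotate_to_normal is hypothetical, and
unlike newCoord() it returns absolute coordinates rather than a translation.

```r
## Hypothetical helper: place each point at its original distance from
## (Xo, Yo), but at angle 3*pi/4 or 7*pi/4 (perpendicular to the diagonal).
rotate_to_normal <- function(X1, Y1, Xo = 0, Yo = 0) {
  Xr <- X1 - Xo                      # coordinates relative to (Xo, Yo)
  Yr <- Y1 - Yo
  r  <- sqrt(Xr^2 + Yr^2)            # distance is preserved by the rotation
  TETA <- ifelse(Yr > Xr, 3 * pi / 4, 7 * pi / 4)
  cbind(X = Xo + r * cos(TETA), Y = Yo + r * sin(TETA))
}

## Two unit-distance points, one above and one below the diagonal
p <- rotate_to_normal(c(0, 1), c(1, 0))
print(p)
```

The translation used in transfo() could then be recovered as p minus the
original coordinates, if that form is preferred.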







On 12 March 2012 20:58, David Winsemius  wrote:
>
> On Mar 12, 2012, at 3:07 PM, Emmanuel Levy wrote:
>
>> Hi Jeff,
>>
>> Thanks for your reply and the example.
>>
>> I'm not sure if it could be applied to the problem I'm facing though,
>> for two reasons:
>>
>> (i) my understanding is that the inverse will associate a new Y
>> coordinate given an absolute X coordinate. However, in the case I'm
>> working on, the transformation that has to be applied depends on X
>> *and* on its position relative to the *normal* of the fitted curve.
>> This means, for instance, that both X and Y will change after
>> transformation.
>>
>> (ii) the fitted curve can be described by a spline, but I'm not sure
>> if inverse of such models can be inferred automatically (I don't know
>> anything about that).
>>
>> The procedure I envision is the following: treat the curve "segment by
>> segment", apply rotation+translation to each segment to bring it on
>> the
>> diagonal,
>
>
> That makes sense. Although the way I am imagining it would be to do a
> condition (on x) shift.
>
>
>> and apply the same transformation to all points
>> corresponding to the same segment (i.e., these are the points that are
>> close and within the "normal" area covered by the segment).
>>
>> Does this make sense?
>
>
> The first part sort of makes sense to me... maybe. You are thinking of some
> sort of local transformation that converts a curve to a straight line by
> rotation or deformation. Seems like a problem of finding a transformation of
> a scalar field. But you then want it extended outward to affect the regions
> at some distance from the curve. That's where might break down or at least
> becomes non-trivial. Because a region at a distance could be "in the sights"
> of the "normal" vector to the curve (not in the statistical sense but in the
> vector-field sense) of more than one segment of the curve. Only if you were
> going apply some sort of transformation that did not extend globally would
> you be able to do anything other than a y|x (y conditional on x) shift or
> expansion contraction
>
>> All the best,
>>
>> Emmanuel
>>
>>
>> On 12 March 2012 02:15, Jeff Newmiller  wrote:
>>>
>>> It is possible that I do not see what you mean, but it seems like the
>>> follo

Re: [R] simulate an gradually increase of the number of subjects based on two variables

2012-03-13 Thread guillaume chaumet
I forgot to mention that I have already tried to generate data based on the
mean and sd of two variables.

x=rnorm(20,1,5)+1:20

y=rnorm(20,1,7)+41:60

simu<-function(x,y,n) {
simu=vector("list",length=n)

for(i in 1:n) {

x=c(x,rnorm(1,mean(x),sd(x)))
y=c(y,rnorm(1,mean(y),sd(y)))

simu[[i]]$x<-x
simu[[i]]$y<-y


}

return(simu)
}

test=simu(x,y,60)
lapply(test, function(x) cor.test(x$x,x$y))

As you can see, the correlation disappears as N increases.
Perhaps a bootstrap with lm or cor.test could solve my problem.
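
One possible reading of why the correlation fades: in simu() each new x and
each new y is drawn independently, so the appended points carry no x-y
dependence. A minimal sketch of a paired bootstrap, which resamples whole rows
so every added subject keeps its own (x, y) pair. grow_jointly is a
hypothetical helper name, not part of any package.

```r
set.seed(1)
x <- rnorm(20, 1, 5) + 1:20
y <- rnorm(20, 1, 7) + 41:60
obs <- cbind(x, y)

## Hypothetical helper: element k holds the original data plus k rows
## resampled (with replacement) from the observed pairs.
grow_jointly <- function(obs, n) {
  lapply(seq_len(n), function(k) {
    idx <- sample(nrow(obs), k, replace = TRUE)
    rbind(obs, obs[idx, , drop = FALSE])
  })
}

sims <- grow_jointly(obs, 60)
cors <- sapply(sims, function(m) cor(m[, "x"], m[, "y"]))
range(cors)   # compare with cor(x, y) on the original 20 subjects
```

Whether resampled subjects are an acceptable stand-in for new ones depends on
the purpose of the simulation; this only shows that pairing preserves the
dependence structure.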



2012/3/13 guillaume chaumet 

> Dear R list,
> I have a population with two groups. I want to simulate an gradually
> increase of the number of subjects for group 1 based on mean and sd of two
> variables (correlated).
> Bootstrap ?
> Sample ?
> Simulation ? (
>
> I just search some clues.
> Thank you
>
> Guillaume
>
>
>



[R] ess-tracebug to open a file

2012-03-13 Thread Feng Li

Dear all,

First I would like to thank the ESS people for all their hard work. I am 
watching the project closely and witnessing the improvements day by day.


Besides I found a strange situation using `ess-tracebug'. Please tell me 
if I am wrong or this is a bug.


Start Emacs with "emacs -Q" and load ESS and enable ess-tracebug.

Create a file named `testFun1.R' with the following contents

testFun1<-function(x)
  {
y <- x+2
browser()
return(y)
  }

testFun1(2)


Then start R under the same directory and run the command

source("testFun1.R")


which will show the following information

source("testFun1.R")

Called from: testFun1(2)
Browse[1]> debug at testFun1.R#5: return(y)


Then if I left click `testFun1.R#5' I get this error

Wrong type argument: listp, [cl-struct-compilation--message (nil 5 
(("testFun1.R" nil) nil (5 #1)) nil nil) 2 nil]


Here is some system information that might be useful

Emacs version

GNU Emacs 24.0.94.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.2.3) of 
2012-02-27 on nova


ESS version

SVN trunk@4690


Linux version

3.2.0-2-amd64 (Debian 3.2.9-1) (debian-ker...@lists.debian.org) (gcc version 
4.6.3 (Debian 4.6.3-1) ) #1 SMP Sun Mar 4 22:48:17 UTC 2012



Best regards,

Feng

--
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/



Re: [R] Converting factor data into Date-time format

2012-03-13 Thread R. Michael Weylandt
No problem.

A pro-tip for future posts: the dput() function creates a plain text
representation of the data in question which is great for email and is
nicely copy-and-pasteable. It wasn't so much a thing here, but for
large or complicated data sets, the regular console printout doesn't
always reveal all the details in play. (This can be particularly nasty
when dealing with time objects)
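
A small sketch of the dput() round trip, using made-up values in the same
format as the thread's data (the object names x, txt and x2 are only for
illustration):

```r
## A factor like the one read from the .csv file
x <- factor(c("20/02/2012 00:10", "20/02/2012 00:20", "01/03/2012 00:00"))

## dput() prints R code that rebuilds the object; capture it as text here
txt <- capture.output(dput(x))
cat(txt, sep = "\n")

## Anyone pasting that text into their session gets the same object back
x2 <- eval(parse(text = txt))
identical(x, x2)
```

The printed structure(...) call shows the integer codes and the levels
separately, which the ordinary console printout hides.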

Best,

Michael Weylandt


On Tue, Mar 13, 2012 at 1:28 PM, Haojie Yan  wrote:
> Hi Michael!!!
>
> As a first time R-help user I just wanted to say THANKS A MILLION!!! for
> your prompt and very helpful reply!!!
>
> Still can not believe this annoying issue can be resolved that quickly!!!
>
> Brilliant-will visit this forum soon!!
> HJ
>
> On Tue, Mar 13, 2012 at 5:09 PM, R. Michael Weylandt
>  wrote:
>>
>> Just a little typo: see below.
>>
>> On Tue, Mar 13, 2012 at 1:00 PM, Haojie Yan  wrote:
>> > Dear Michael,
>> >
>> > Thanks a lot for your hints.
>> >
>> > I have just had a try as below but still got back some error messages as
>> > shown:
>> >
>> > The object containing the 'date_time' data is named
>> > 'INTERVAL_END_TIME' and
>> > and wanted to plot it against another variable 'CHANNEL_01' (data type:
>> > numerical).
>> >
>> >
>> > As you suggested I did the following...
>> >
>> > (1) Converting 'INTERVAL_END_TIME' into charactor first and name it
>> > 'DATE_TIME_1';
>> > (2) Converting  'DATE_TIME_1' into "POSIXlt" "POSIXt" type data
>> >
>> > But, seems it doesnt work well and I got 'NA's for all of them...
>> >
>> > Any thoughts??
>> >
>> > Many thanks again!
>> > HJ
>> >
>> >
>> >> INTERVAL_END_TIME[1:10]
>> >  [1] 20/02/2012 00:10 20/02/2012 00:20 20/02/2012 00:30 20/02/2012 00:40
>> > 20/02/2012 00:50
>> >  [6] 20/02/2012 01:00 20/02/2012 01:10 20/02/2012 01:20 20/02/2012 01:30
>> > 20/02/2012 01:40
>> > 1584 Levels: 01/03/2012 00:00 01/03/2012 00:10 01/03/2012 00:20 ...
>> > 29/02/2012 23:50
>> >
>> >> class(INTERVAL_END_TIME)
>> > [1] "factor"
>> >
>> >> DATE_TIME_1<-as.character(INTERVAL_END_TIME)
>> >
>> >> DATE_TIME_1[1:10]
>> >  [1] "20/02/2012 00:10" "20/02/2012 00:20" "20/02/2012 00:30"
>> > "20/02/2012
>> > 00:40"
>> >  [5] "20/02/2012 00:50" "20/02/2012 01:00" "20/02/2012 01:10"
>> > "20/02/2012
>> > 01:20"
>> >  [9] "20/02/2012 01:30" "20/02/2012 01:40"
>> >
>> >> class(DATE_TIME_1)
>> > [1] "character"
>> >
>> >> DATE_TIME_2<-as.POSIXlt(DATE_TIME_1,format="%d/%m/%y %H:%M")
>>
>> This needs to be a capital "Y" as in my original post.
>>
>> e.g.,
>>
>> x <- c("20/02/2012 00:10", "20/02/2012 00:20", "20/02/2012 00:30",
>> "20/02/2012 00:40")
>>
>> as.POSIXct(x, format = "%d/%m/%y %H:%M") # No good :-(
>> as.POSIXct(x, format = "%d/%m/%Y %H:%M") # Good!
>>
>>
>>
>> Michael
>>
>> >
>> >> DATE_TIME_2[1:10]
>> >  [1] NA NA NA NA NA NA NA NA NA NA
>> >
>> >> class(DATE_TIME_2)
>> > [1] "POSIXlt" "POSIXt"
>> >
>> >> plot(DATE_TIME_2,CHANNEL_01)
>> > Error in plot.window(...) : need finite 'xlim' values
>> > In addition: Warning messages:
>> > 1: In min(x) : no non-missing arguments to min; returning Inf
>> > 2: In max(x) : no non-missing arguments to max; returning -Inf
>> >>
>> >
>> > On Tue, Mar 13, 2012 at 4:30 PM, R. Michael Weylandt
>> >  wrote:
>> >>
>> >> as.POSIXct(as.character(FACTORHERE), format = "%d/%m/%Y %H:%M")
>> >>
>> >> Michael
>> >>
>> >> On Tue, Mar 13, 2012 at 12:20 PM, Haojie Yan 
>> >> wrote:
>> >> > Dear R-user,
>> >> >
>> >> > I have read a dataset from .csv file into R. This dataset includes
>> >> > one
>> >> > column containing some data in 'date and time' format, e.g.
>> >> > 'dd/mm/
>> >> > hh:mm'.
>> >> >
>> >> > These data were automatically read and saved as 'factor' in R. When I
>> >> > was
>> >> > trying to produce some plots (such as time series) with the above
>> >> > 'date
>> >> > and
>> >> > time' on x-axis,  it caused some disodering problem, e.g. 1st of
>> >> > March
>> >> > 2012
>> >> > is in front of 10th of Feb. 2012 (if the data is from 10th Feb. 2012
>> >> > to
>> >> > 1st
>> >> > of March 2012). I understand that I might have to convert them from
>> >> > 'factor' to 'date' first, so I tried using 'as.date'. But this method
>> >> > seems
>> >> > only work for data in format of  'd/m/y' and no further option that
>> >> > allows
>> >> > me to add hours and minutes.
>> >> >
>> >> > I checked online for other methods such as 'as.POSIX' and 'strptime'
>> >> > but
>> >> > none of them seem to offer me a quick solution.
>> >> >
>> >> > Please note that the data I received is recorded every 10 minutes so
>> >> > they
>> >> > are saved in the form of  'dd/mm/ hh:mm', e.g. I only have data
>> >> > measured up to 'minute' NOT to  'second'. Are there any direct
>> >> > solution
>> >> > that I can solve this issue??
>> >> >
>> >> >
>> >> > Many thanks in advance!
>> >> > HJ
>> >> >
>> >> >        [[alternative HTML version deleted]]
>> >> >
>> >>
>> >> > __
>> >> > R-help@r-project.org mailing list
>> >> > https://stat

[R] Error : package is not installed for 'arch=x64'

2012-03-13 Thread Li, Yan
HI All,

I got the error : package  is not installed for 'arch=x64' when building my own 
package for 64bit R. How can I configure the arch ? The 'R CMD config' does not 
work. Thank you very much!

The detailed error is :

** testing if installed package can be loaded
Error : package 'xxx' is not installed for 'arch=x64'
Error: loading failed
Execution halted
ERROR: loading failed

Best,
Yan



Re: [R] reproducible sample() in mclapply

2012-03-13 Thread David Winsemius


On Mar 13, 2012, at 12:51 PM, Lik Wee Lee wrote:


Hi,

Using the multicore package and calling sample() in mclapply,
how do I get the results to be reproducible?
I know the random number is seeded by process id and so is
different for each run.


You might want to look up the thread in last month's section of the
R-devel mailing list archives entitled "portable parallel seeds project:
request for critiques".


--

David Winsemius, MD
West Hartford, CT



Re: [R] reproducible sample() in mclapply

2012-03-13 Thread Berend Hasselman

On 13-03-2012, at 17:51, Lik Wee Lee wrote:

> Hi,
> 
> Using the multicore package and calling sample() in mclapply,
> how do I get the results to be reproducible?
> I know the random number is seeded by process id and so is
> different for each run.

Don't know about those two packages.
Have a look at

?set.seed

Berend
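
A sketch of one way set.seed() can be combined with forked workers, assuming
the 'parallel' package (shipped with R since 2.14.0) is used in place of
'multicore', and a Unix-alike system where forking is available. The wrapper
name draw() is hypothetical.

```r
library(parallel)

## The L'Ecuyer-CMRG generator gives each forked child its own stream,
## derived from the seed set in the parent process.
RNGkind("L'Ecuyer-CMRG")

draw <- function() {
  set.seed(2012)
  mclapply(1:4, function(i) sample(100, 3),
           mc.cores = 2, mc.set.seed = TRUE)  # mc.cores > 1 needs fork (not Windows)
}

run1 <- draw()
run2 <- draw()
identical(run1, run2)
```

With the default generator the children are seeded differently on every run,
which matches the behaviour described in the question.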



Re: [R] what does "rlm" do if it fails to converge within iteration limits?

2012-03-13 Thread Martin Maechler
> Michael  
> on Mon, 12 Mar 2012 13:19:19 -0500 writes:

> The problem is: by default shouldn't it use "Huber's"?
> And it should be convex problem no?

> so when I do rlm(y~x) which is a single-beta fitting problem,

> shouldn't it always converge?

“In theory, theory and practice are the same. In practice, they are not.”
― Albert Einstein
 [according to http://www.goodreads.com/quotes/show/66864 ]

Theory says that convergence happens in an infinite number of
iterations, but then theory also says convergence means that the
coefficients don't change any more ;-)

[... etc.]

> 

> Psi functions are supplied for the Huber, Hampel and Tukey bisquare
> proposals as psi.huber, psi.hampel and psi.bisquare. Huber's corresponds to
> a convex optimization problem and gives a unique solution (up to
> collinearity). The other two will have multiple local minima, and a good
> starting point is desirable.

which also mentions "up to collinearity" (theory).
"Practice" would add "near-collinearity" and many other
practical border line issues that can happen. 

As maintainer of the  robustbase  package,
I'm slightly intrigued by your example.
==> 
1) Can you provide it in reproducible form:  dput(data)  "cut & paste"
   into your e-mail if small;  otherwise made available for download as
   mydata.rda after  save(., file="mydata.rda") ?

2) What's the result of using  lmrob()  {package 'robustbase'}
   instead of  rlm()  {package 'MASS'} ?

Best regards,
Martin Maechler, ETH Zurich
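
A minimal sketch of checking convergence rather than relying on the warning,
using simulated data (the data and the object name fit are made up for
illustration). It assumes only that the fitted rlm object carries a
'converged' component, as documented in ?rlm.

```r
library(MASS)

set.seed(42)
x <- 1:50
y <- 2 * x + rnorm(50)
y[c(5, 25)] <- y[c(5, 25)] + 40      # two gross outliers

## Default psi is Huber; maxit raised from the default 20 as suggested
fit <- rlm(y ~ x, maxit = 40)

fit$converged   # logical: did the IWLS iterations converge?
coef(fit)
```

If 'converged' is FALSE the coefficients are simply those of the last
iteration, so they should be treated with caution.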


> On Fri, Mar 9, 2012 at 1:21 PM, Berend Hasselman  wrote:

>> 
>> On 09-03-2012, at 20:00, Michael wrote:
>> 
>> > Hi all,
>> >
>> > In using "rlm" I've got a bunch of warnings... "failed to converge in 
20
>> > steps", etc.
>> >
>> > My question is:
>> >
>> > what are the results then after the failure?
>> >
>> 
>> They haven't converged. So inaccurate. Maybe your model is badly
>> formulated or ill conditioned.
>> 
>> > Will "rlm" automatically downgrade back to "lm" upon failure?
>> >
>> Help says nothing about that so most likely no.
>> 
>> Why don't you try and raise maxit? Use maxit=40 in the call of rlm. And
>> see what happens.
>> 
>> Berend
>> 
>> 
>> 



Re: [R] Error " subscript out of bounds"

2012-03-13 Thread Houhou Li
Thank you very much, David. I should have realized myself that operator
precedence caused the problem. Sorry about this :-) I solved the problem.

--- On Tue, 3/13/12, David Winsemius  wrote:


From: David Winsemius 
Subject: Re: [R] Error " subscript out of bounds"
To: "Houhou Li" 
Cc: r-help@r-project.org
Date: Tuesday, March 13, 2012, 9:45 AM



On Mar 13, 2012, at 12:18 PM, Houhou Li wrote:

> Hello, R-users,
> 
> I have a datafile with 37313 records and each record has 5 different 
> measurements on the same variables. The format looks like this: treeID, VIG0, 
> VIG1, VIG2, VIG3, VIG4
> I was trying to convert the one row record to 5 rows record with format like 
> this (treeID, MEASUREMENT, VIGOR). My code like this:
> 
> treeMeas<-matrix(data=0,nrow=(length(tree1$indivTree)*5), ncol=3)
> colnames(treeMeas)<-c("indivTree", "meas", "vigor")
> for(i in 1:length(tree1$indivTree))
> {
>   treeMeas[(i-1)*5+1:(i*5),1]<-tree1$indivTree[i]

You need to review operator precedence (and probably the R-FAQ where I know 
that this is also reviewed):

(i-1)*5+1:(i*5)  parses as ((i-1)*5) added to 1:(i*5)

> i=37313; length( (i-1)*5+1:(i*5))
[1] 186565

>   treeMeas[(i-1)*5+1:(i*5),2]<-c(0:4)
>   treeMeas[(i-1)*5+1:(i*5),3]<-c(tree1$VIG0[i], tree1$VIG1[i], tree1$VIG2[i], 
>tree1$VIG3[i], tree1$VIG4[i])

Wouldn't this be a whole lot easier with 'reshape' (the base function)? Or 
'melt' from either the reshape or the reshape2 package?

--David

>   }
> 
> When I run the code, I always got error message like this " Error in 
> treeMeas[(i - 1) * 5 + 1:(i * 5), 1] <- tree1$indivTree[i] :   subscript out 
> of bounds". I couldn't figure out why subscript out of bounds. Is this 
> because the matrix is too big (186565 by 3)? Any one can help? Thank you very 
> much.
> 
> Yuzhen
> 
>     [[alternative HTML version deleted]]
> 

David Winsemius, MD
West Hartford, CT




Re: [R] Converting factor data into Date-time format

2012-03-13 Thread R. Michael Weylandt
Just a little typo: see below.

On Tue, Mar 13, 2012 at 1:00 PM, Haojie Yan  wrote:
> Dear Michael,
>
> Thanks a lot for your hints.
>
> I have just had a try as below but still got back some error messages as
> shown:
>
> The object containing the 'date_time' data is named 'INTERVAL_END_TIME' and
> and wanted to plot it against another variable 'CHANNEL_01' (data type:
> numerical).
>
>
> As you suggested I did the following...
>
> (1) Converting 'INTERVAL_END_TIME' into charactor first and name it
> 'DATE_TIME_1';
> (2) Converting  'DATE_TIME_1' into "POSIXlt" "POSIXt" type data
>
> But, seems it doesnt work well and I got 'NA's for all of them...
>
> Any thoughts??
>
> Many thanks again!
> HJ
>
>
>> INTERVAL_END_TIME[1:10]
>  [1] 20/02/2012 00:10 20/02/2012 00:20 20/02/2012 00:30 20/02/2012 00:40
> 20/02/2012 00:50
>  [6] 20/02/2012 01:00 20/02/2012 01:10 20/02/2012 01:20 20/02/2012 01:30
> 20/02/2012 01:40
> 1584 Levels: 01/03/2012 00:00 01/03/2012 00:10 01/03/2012 00:20 ...
> 29/02/2012 23:50
>
>> class(INTERVAL_END_TIME)
> [1] "factor"
>
>> DATE_TIME_1<-as.character(INTERVAL_END_TIME)
>
>> DATE_TIME_1[1:10]
>  [1] "20/02/2012 00:10" "20/02/2012 00:20" "20/02/2012 00:30" "20/02/2012
> 00:40"
>  [5] "20/02/2012 00:50" "20/02/2012 01:00" "20/02/2012 01:10" "20/02/2012
> 01:20"
>  [9] "20/02/2012 01:30" "20/02/2012 01:40"
>
>> class(DATE_TIME_1)
> [1] "character"
>
>> DATE_TIME_2<-as.POSIXlt(DATE_TIME_1,format="%d/%m/%y %H:%M")

This needs to be a capital "Y" as in my original post.

e.g.,

x <- c("20/02/2012 00:10", "20/02/2012 00:20", "20/02/2012 00:30",
"20/02/2012 00:40")

as.POSIXct(x, format = "%d/%m/%y %H:%M") # No good :-(
as.POSIXct(x, format = "%d/%m/%Y %H:%M") # Good!
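
Why the lowercase version fails, as far as I understand it: "%y" consumes only
a two-digit year, so for "2012" it reads "20" and then cannot match the
following space against "12". A small self-contained check (tz = "UTC" is an
assumption added here to keep the result independent of the local time zone):

```r
x <- c("20/02/2012 00:10", "20/02/2012 00:20", "20/02/2012 00:30",
       "20/02/2012 00:40")

bad  <- as.POSIXct(x, format = "%d/%m/%y %H:%M", tz = "UTC")
good <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")

all(is.na(bad))                        # every element failed to parse
as.numeric(diff(good), units = "mins") # spacing between readings
```

Checking for NA right after the conversion catches this kind of format slip
before it surfaces as a plot.window() error.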



Michael

>
>> DATE_TIME_2[1:10]
>  [1] NA NA NA NA NA NA NA NA NA NA
>
>> class(DATE_TIME_2)
> [1] "POSIXlt" "POSIXt"
>
>> plot(DATE_TIME_2,CHANNEL_01)
> Error in plot.window(...) : need finite 'xlim' values
> In addition: Warning messages:
> 1: In min(x) : no non-missing arguments to min; returning Inf
> 2: In max(x) : no non-missing arguments to max; returning -Inf
>>
>
> On Tue, Mar 13, 2012 at 4:30 PM, R. Michael Weylandt
>  wrote:
>>
>> as.POSIXct(as.character(FACTORHERE), format = "%d/%m/%Y %H:%M")
>>
>> Michael
>>
>> On Tue, Mar 13, 2012 at 12:20 PM, Haojie Yan 
>> wrote:
>> > Dear R-user,
>> >
>> > I have read a dataset from .csv file into R. This dataset includes one
>> > column containing some data in 'date and time' format, e.g. 'dd/mm/
>> > hh:mm'.
>> >
>> > These data were automatically read and saved as 'factor' in R. When I
>> > was
>> > trying to produce some plots (such as time series) with the above 'date
>> > and
>> > time' on x-axis,  it caused some disodering problem, e.g. 1st of March
>> > 2012
>> > is in front of 10th of Feb. 2012 (if the data is from 10th Feb. 2012 to
>> > 1st
>> > of March 2012). I understand that I might have to convert them from
>> > 'factor' to 'date' first, so I tried using 'as.date'. But this method
>> > seems
>> > only work for data in format of  'd/m/y' and no further option that
>> > allows
>> > me to add hours and minutes.
>> >
>> > I checked online for other methods such as 'as.POSIX' and 'strptime' but
>> > none of them seem to offer me a quick solution.
>> >
>> > Please note that the data I received is recorded every 10 minutes so
>> > they
>> > are saved in the form of  'dd/mm/ hh:mm', e.g. I only have data
>> > measured up to 'minute' NOT to  'second'. Are there any direct solution
>> > that I can solve this issue??
>> >
>> >
>> > Many thanks in advance!
>> > HJ
>> >
>> >        [[alternative HTML version deleted]]
>> >
>>
>
>



Re: [R] beginner's loop issue

2012-03-13 Thread R. Michael Weylandt
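Yes, the short answer is that you need to define out before running
the loop. The most effective way to do so is to set up a matrix
with exactly the right dimensions (if you know them up front); something
like out <- matrix(NA, nrow = nrow(input), ncol = 9). Note nrow(input)
rather than length(input): for a matrix, length() counts every element
(460 here), so the loop should likewise run over 1:nrow(input).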
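
A runnable sketch of the preallocation, with a made-up stand-in for
MyFunction (the real one was not posted; this one just returns three numbers)
and a dummy 46 x 10 input matrix:

```r
## Hypothetical stand-in: any function returning a length-3 vector will do
MyFunction <- function(p, q, r) c(p + q, q + r, p + r)

input <- matrix(seq_len(46 * 10), nrow = 46, ncol = 10)

## Preallocate one row per subject, then fill row by row
out <- matrix(NA_real_, nrow = nrow(input), ncol = 9)
for (i in seq_len(nrow(input))) {
  out[i, 1:3] <- MyFunction(input[i, 1], input[i, 2],  input[i, 3])
  out[i, 4:6] <- MyFunction(input[i, 5], input[i, 7],  input[i, 6])
  out[i, 7:9] <- MyFunction(input[i, 8], input[i, 10], input[i, 9])
}
dim(out)
```

seq_len(nrow(input)) is used instead of 1:nrow(input) only because it also
behaves sensibly for a zero-row matrix.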

Michael



On Tue, Mar 13, 2012 at 12:27 PM, aledanda  wrote:
> Dear All,
>
> I hope you don't mind helping me with this small issue. I haven't been using
> R in years and I'm trying to fill in a matrix
> with the output of a function (I'm probably using the Matlab logic here and
> it's not working).
> Here is my code:
>
> for (i in 1:length(input)){
>  out[i,1:3] <- MyFunction(input[i,1],input[i,2], input[i,3])
>    out[i,4:6] <- MyFunction(input[i,5],input[i,7], input[i,6])
>      out[i,7:9] <- MyFunction(input[i,8],input[i,10], input[i,9])
> }
>
> 'input' is a matrix
>> dim(input)
> [1] 46 10
>
> and each raw corresponds to a different subject.
> The error I get here is
>
> /Error in out[i, 1:3] <- get.vaTer(input[i, 2], input[i, 4], input[i, 3],  :
>  object 'out' not found/
>
> So I wonder, what's wrong in the assignment to the variable out?
> Should I define the variable before the loop?
>
> Thanks for your help
> Best
>
> Ale
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/beginner-s-loop-issue-tp4469514p4469514.html
> Sent from the R help mailing list archive at Nabble.com.
>



[R] reproducible sample() in mclapply

2012-03-13 Thread Lik Wee Lee
Hi,

Using the multicore package and calling sample() in mclapply,
how do I get the results to be reproducible?
I know the random number is seeded by process id and so is
different for each run.

Thanks,
Lik



[R] beginner's loop issue

2012-03-13 Thread aledanda
Dear All,

I hope you don't mind helping me with this small issue. I haven't been using
R in years and I'm trying to fill in a matrix 
with the output of a function (I'm probably using the Matlab logic here and
it's not working). 
Here is my code:

for (i in 1:length(input)){
  out[i,1:3] <- MyFunction(input[i,1],input[i,2], input[i,3]) 
out[i,4:6] <- MyFunction(input[i,5],input[i,7], input[i,6]) 
  out[i,7:9] <- MyFunction(input[i,8],input[i,10], input[i,9]) 
}

'input' is a matrix 
> dim(input)
[1] 46 10

and each raw corresponds to a different subject. 
The error I get here is

/Error in out[i, 1:3] <- get.vaTer(input[i, 2], input[i, 4], input[i, 3],  : 
  object 'out' not found/

So I wonder, what's wrong in the assignment to the variable out? 
Should I define the variable before the loop?

Thanks for your help
Best

Ale

--
View this message in context: 
http://r.789695.n4.nabble.com/beginner-s-loop-issue-tp4469514p4469514.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Converting factor data into Date-time format

2012-03-13 Thread Gabor Grothendieck
On Tue, Mar 13, 2012 at 12:20 PM, Haojie Yan  wrote:
> Dear R-user,
>
> I have read a dataset from .csv file into R. This dataset includes one
> column containing some data in 'date and time' format, e.g. 'dd/mm/
> hh:mm'.
>
> These data were automatically read and saved as 'factor' in R. When I was
> trying to produce some plots (such as time series) with the above 'date and
> time' on x-axis,  it caused some disodering problem, e.g. 1st of March 2012
> is in front of 10th of Feb. 2012 (if the data is from 10th Feb. 2012 to 1st
> of March 2012). I understand that I might have to convert them from
> 'factor' to 'date' first, so I tried using 'as.date'. But this method seems
> only work for data in format of  'd/m/y' and no further option that allows
> me to add hours and minutes.
>
> I checked online for other methods such as 'as.POSIX' and 'strptime' but
> none of them seem to offer me a quick solution.
>
> Please note that the data I received is recorded every 10 minutes so they
> are saved in the form of  'dd/mm/ hh:mm', e.g. I only have data
> measured up to 'minute' NOT to  'second'. Are there any direct solution
> that I can solve this issue??

See Example 6 in the document Reading Data in zoo:
http://cran.r-project.org/package=zoo

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Converting factor data into Date-time format

2012-03-13 Thread Joshua Wiley
This is just a little comment to supplement Michael's excellent
solution.  If there are even a few (e.g., 5 each) repeated values,
this:

as.POSIXct(as.character(levels(x)), format = "%d/%m/%Y %H:%M")[x]

will be substantially faster, with the speed gains strongly associated
with the number of replicates in your factor (where 'x' is your factor;
note it is used twice!!).  A little timing example:

x <- factor(rep(paste("01/01/2012 10:", 0:59, sep = ''), each = 5))
t1 <- t2 <- NULL

system.time(replicate(500, {t1 <<- as.POSIXct(as.character(x), format
= "%d/%m/%Y %H:%M")}))
system.time(replicate(500, {t2 <<- as.POSIXct(as.character(levels(x)),
format = "%d/%m/%Y %H:%M")[x]}))

all.equal(t1, t2)

## on my machine
## > system.time(replicate(500, {t1 <<- as.POSIXct(as.character(x),
##     format = "%d/%m/%Y %H:%M")}))
##    user  system elapsed
##    3.01    0.01    3.03
## > system.time(replicate(500, {t2 <<-
##     as.POSIXct(as.character(levels(x)), format = "%d/%m/%Y %H:%M")[x]}))
##    user  system elapsed
##    0.67    0.00    0.67
## > all.equal(t1, t2)
## [1] TRUE

On Tue, Mar 13, 2012 at 9:30 AM, R. Michael Weylandt
 wrote:
> as.POSIXct(as.character(FACTORHERE), format = "%d/%m/%Y %H:%M")
>
> Michael
>
> On Tue, Mar 13, 2012 at 12:20 PM, Haojie Yan  wrote:
>> Dear R-user,
>>
>> I have read a dataset from .csv file into R. This dataset includes one
>> column containing some data in 'date and time' format, e.g. 'dd/mm/
>> hh:mm'.
>>
>> These data were automatically read and saved as 'factor' in R. When I was
>> trying to produce some plots (such as time series) with the above 'date and
>> time' on x-axis,  it caused some disodering problem, e.g. 1st of March 2012
>> is in front of 10th of Feb. 2012 (if the data is from 10th Feb. 2012 to 1st
>> of March 2012). I understand that I might have to convert them from
>> 'factor' to 'date' first, so I tried using 'as.date'. But this method seems
>> only work for data in format of  'd/m/y' and no further option that allows
>> me to add hours and minutes.
>>
>> I checked online for other methods such as 'as.POSIX' and 'strptime' but
>> none of them seem to offer me a quick solution.
>>
>> Please note that the data I received is recorded every 10 minutes so they
>> are saved in the form of  'dd/mm/ hh:mm', e.g. I only have data
>> measured up to 'minute' NOT to  'second'. Are there any direct solution
>> that I can solve this issue??
>>
>>
>> Many thanks in advance!
>> HJ
>>
>>        [[alternative HTML version deleted]]
>>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/



Re: [R] Error " subscript out of bounds"

2012-03-13 Thread David Winsemius


On Mar 13, 2012, at 12:18 PM, Houhou Li wrote:


Hello, R-users,

I have a datafile with 37313 records and each record has 5 different  
measurements on the same variables. The format looks like this:  
treeID, VIG0, VIG1, VIG2, VIG3, VIG4
I was trying to convert the one row record to 5 rows record with  
format like this (treeID, MEASUREMENT, VIGOR). My code like this:


treeMeas<-matrix(data=0,nrow=(length(tree1$indivTree)*5), ncol=3)
colnames(treeMeas)<-c("indivTree", "meas", "vigor")
for(i in 1:length(tree1$indivTree))
{
  treeMeas[(i-1)*5+1:(i*5),1]<-tree1$indivTree[i]


You need to review operator precedence (and probably the R-FAQ where I  
know that this is also reviewed):


(i-1)*5+1:(i*5)  parses as ((i-1)*5) added to 1:(i*5)

> i=37313; length( (i-1)*5+1:(i*5))
[1] 186565
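
A minimal sketch of the precedence point, using a hypothetical i of 3 instead of the full data (the names idx_asked and idx_meant are only for illustration). Since ':' binds more tightly than '*' and '+', the parentheses have to wrap the whole start value:

```r
i <- 3                                   # hypothetical: the third tree

# What the loop actually evaluates: 10 + (1:15), i.e. 15 indices from 11 to 25
idx_asked <- (i - 1) * 5 + 1:(i * 5)

# What was presumably intended: the five rows belonging to tree i
idx_meant <- ((i - 1) * 5 + 1):(i * 5)

range(idx_asked)
idx_meant
```

With three trees the matrix would have 15 rows, so the first form already reaches past the last row, which is where "subscript out of bounds" comes from.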


  treeMeas[(i-1)*5+1:(i*5),2]<-c(0:4)
  treeMeas[(i-1)*5+1:(i*5),3]<-c(tree1$VIG0[i], tree1$VIG1[i],  
tree1$VIG2[i], tree1$VIG3[i], tree1$VIG4[i])


Wouldn't this be a whole lot easier with 'reshape' (the base  
function)? Or 'melt' from either the reshape package or the reshape2 package?
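
A rough sketch of the base 'reshape' route, on a made-up two-tree data frame that only mimics the column names from the question (the values are invented, and treeLong is a hypothetical name):

```r
# Toy stand-in for tree1: one row per tree, five vigor measurements
tree1 <- data.frame(indivTree = c("a", "b"),
                    VIG0 = c(1, 2), VIG1 = c(3, 4), VIG2 = c(5, 6),
                    VIG3 = c(7, 8), VIG4 = c(9, 10))

# Wide to long: one row per tree and measurement occasion
treeLong <- reshape(tree1, direction = "long",
                    varying = paste0("VIG", 0:4),
                    v.names = "vigor",
                    timevar = "meas", times = 0:4,
                    idvar = "indivTree")

# Sort so that the five rows of each tree sit together
treeLong <- treeLong[order(treeLong$indivTree, treeLong$meas), ]
treeLong
```

This avoids the explicit loop and the index arithmetic altogether.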


--
David


  }

When I run the code, I always got error message like this " Error in  
treeMeas[(i - 1) * 5 + 1:(i * 5), 1] <- tree1$indivTree[i] :
subscript out of bounds". I couldn't figure out why subscript out of  
bounds. Is this because the matrix is too big (186565 by 3)? Any one  
can help? Thank you very much.


Yuzhen

[[alternative HTML version deleted]]



David Winsemius, MD
West Hartford, CT



Re: [R] Faceted bar plot shows wrong counts (ggplot2)

2012-03-13 Thread Helios de Rosario
Michael,

Thanks for the pointer to the discussion in the ggplot list. It seems
that the reason of this behaviour of facet_grid() is already known and
being discussed by the developers of ggplot2.

facet_grid() reduces the original data frame with unique() before
applying the stats.  If the data frame has any other column that
prevents duplicated rows, counts are correctly computed.

E.g.

diamonds25 <- droplevels(diamonds[1:25,]) # Keep all columns

# Everything else as before:
base  <-  ggplot(diamonds25,  aes(fill  =  cut))  +
 geom_bar(position  =  "dodge")  +
 opts(legend.position  =  "none")
 base  +  aes(x  =  cut)  +
 facet_grid(.  ~  color)


Helios

>>> On 12/03/2012 at 20:59, "R. Michael Weylandt" wrote:
> You get the "good" behavior with
> 
> base + aes(x = cut) + facet_wrap(~ color, ncol = 5)
> 
> so this seems buggy to me.
> 
> If someone here doesn't step forward with more insight, I'd forward it
> to the ggplot list to see if one of the developers there can give an
> explanation or possibly make the official call that it's a bug.
> 
> There was another report of a possible bug in facet_grid() today that
> could be related:
>
> https://groups.google.com/group/ggplot2/browse_thread/thread/5213ac35da6b36d4
> 
> Michael
> 
> On Mon, Mar 12, 2012 at 7:16 AM, Helios de Rosario
>  wrote:
>> I have encountered a problem with faceted bar plots. I have tried to
>> create something like the example explained in the ggplot2 book (see pp.
>> 126-128):
>>
>> library(ggplot2)
>> mpg4  <-  subset(mpg,  manufacturer  %in%
>> c("audi",  "volkswagen",  "jeep"))
>> mpg4$manufacturer  <-  as.character(mpg4$manufacturer)
>> mpg4$model  <-  as.character(mpg4$model)
>>
>> base  <-  ggplot(mpg4,  aes(fill  =  model))  +
>> geom_bar(position  =  "dodge")  +
>> opts(legend.position  =  "none")
>> base  +  aes(x  =  model)  +
>> facet_grid(.  ~  manufacturer)
>>
>> That example works fine; the bar heights are just the same as the
>> counts in the table:
>>
>> table(mpg4[,1:2])
>>             model
>> manufacturer a4 a4 quattro a6 quattro grand cherokee 4wd gti jetta new beetle
>>   audi        7          8          3                  0   0     0          0
>>   jeep        0          0          0                  8   0     0          0
>>   volkswagen  0          0          0                  0   5     9          6
>>             model
>> manufacturer passat
>>   audi            0
>>   jeep            0
>>
>> But in other cases this does not occur. For instance, take a small
>> subset of data(diamonds):
>>
>> diamonds25 <- droplevels(diamonds[1:25,2:3])
>> table(diamonds25)
>>            color
>> cut         E F H I J
>>   Fair      1 0 0 0 0
>>   Good      1 0 0 1 4
>>   Very Good 1 0 3 1 4
>>   Premium   3 1 0 1 0
>>   Ideal     1 0 0 1 2
>>
>> And change the variables mapped in the previous plot:
>>
>> base  <-  ggplot(diamonds25,  aes(fill  =  cut))  +
>> geom_bar(position  =  "dodge")  +
>> opts(legend.position  =  "none")
>> base  +  aes(x  =  cut)  +
>> facet_grid(.  ~  color)
>>
>> I see all bars with height = 1.
>> I have observed this problem (wrong bar heights, but not always = 1)
>> in other cases when all counts are very small or zero.
>> What's wrong here?
>>
>> Regards,
>> Helios
>>
>> sessionInfo()
>> R version 2.14.2 (2012-02-29)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252
>> [3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
>> [5] LC_TIME=Spanish_Spain.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] ggplot2_0.9.0
>>
>> loaded via a namespace (and not attached):
>>  [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.1       grid_2.14.2
>>  [5] MASS_7.3-17        memoise_0.1        munsell_0.3        plyr_1.7.1
>>  [9] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.0
>> [13] stringr_0.6
>>
>>
>>
>> INSTITUTO DE BIOMECÁNICA DE VALENCIA
>> Universidad Politécnica de Valencia ● Edificio 9C
>> Camino de Vera s/n ● 46022 VALENCIA (ESPAÑA)
>> Tel. +34 96 387 91 60 ● Fax +34 96 387 91 69
>> www.ibv.org 
>>
>> Before printing this e-mail, please consider whether it is really necessary.
>> In compliance with Organic Law 15/1999 on the Protection of Personal
>> Data, we inform you that this message contains confidential information
>> intended for the exclusive use of the addressee indicated above. If you
>> are not the intended addressee, please note that receiving it does not
>> authorize you to disclose or reproduce it by any means; you must destroy
>> it immediately and we ask you to notify the sender.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help 
>> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html 
>> and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to find best model of time series?

2012-03-13 Thread R. Michael Weylandt
Take a look at

example(HoltWinters)

Michael
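
To make that pointer a little more concrete, here is a small sketch using the first twelve values from the question. It assumes the series has no trend or seasonal component (hence beta = FALSE and gamma = FALSE, which reduces HoltWinters() to simple exponential smoothing); whether that assumption holds for the real data is exactly what would need checking:

```r
# First twelve values from the post, treated as a plain non-seasonal series
x <- ts(c(-0.15264004, 0.056076439, -0.07276116, -0.00917326,
          -0.02069089, -0.00416232, -0.07225855, -0.02654577,
          -0.06131410, -0.09380202, 0.057414014, -0.05239976))

# Simple exponential smoothing: no trend (beta) and no seasonality (gamma)
fit <- HoltWinters(x, beta = FALSE, gamma = FALSE)

fit$alpha                       # estimated smoothing parameter
fc <- predict(fit, n.ahead = 5) # forecasts for the next five steps
fc
```

Comparing fit$SSE across candidate settings (with or without trend, for example) is one crude way to rank models; it is not a substitute for looking at the plots.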

On Tue, Mar 13, 2012 at 11:06 AM, sagarnikam123  wrote:
> i have data in one file below like  & (i have such type of file =200,each
> file have below type of data)
>>t
> -0.15264004
> 0.056076439
> -0.07276116
> -0.00917326
> -0.02069089
> -0.00416232
> -0.07225855
> -0.02654577
> -0.06131410
> -0.09380202
> 0.057414014
> -0.05239976
> 0.014397612
> 0.016145161
> -0.00670587
> 0.018696335
> 0.036943654
> -0.02450233
> 0.031161705
> 0.006513503
> -0.02892329
> -0.00831519
> -0.00877744
> -0.00634399
> -0.02612019
> -0.02531800
> -0.01435533
> 0.011148840
> -0.01893775
> 0.029859128
> 0.029878797
> -0.00125987
> 0.031404385
> 0.035127606
> -0.00191775
> 0.059797202
> -0.03268047
> -0.06026960
> -0.02216465
> -0.08145612
> -0.02772806
> -0.03171683
> -0.02842562
> -0.11807898
> -0.01457311
> -0.12612482
> 0.409631265
> -0.06375234
>
>>plot.ts(t)
>
> i am new to time series,i get plot,but don't know which pattern this is?
> tell me how to determine type of time series(seasonal,trend,irregular,etc)
> & from 200 such file/plot ,which model should i select as best model for
> forecasting future events
>
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/how-to-find-best-model-of-time-series-tp4469296p4469296.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Sum results in a matrix

2012-03-13 Thread R. Michael Weylandt
res3 + t(res3)

Michael
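
A short sketch of what that one-liner does, with the table from the question typed in as a plain matrix (the object names both and pairs are only illustrative). Note that adding the transpose also doubles the diagonal, so the sketch restores it afterwards:

```r
# The counts from the post, rows = source, columns = destination
res3 <- matrix(c( 0, 10,  0,  0, 0, 0,  0,
                 11,  0,  0,  0, 0, 0,  0,
                  0,  0, 18, 15, 0, 0,  0,
                  0,  0, 15, 11, 0, 0,  0,
                  0,  0,  0,  0, 1, 0,  0,
                  0,  0,  0,  0, 0, 1,  0,
                  0,  0,  0,  0, 0, 0, 18),
               nrow = 7, byrow = TRUE,
               dimnames = list(as.character(1:7), as.character(1:7)))

both <- res3 + t(res3)     # [i, j] now holds count(i -> j) + count(j -> i)
diag(both) <- diag(res3)   # undo the doubling of same-to-same counts

both["1", "2"]             # 10 + 11
both["3", "4"]             # 15 + 15
```

Each unordered pair then appears twice (above and below the diagonal); both[upper.tri(both)] picks out one copy of each.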

On Tue, Mar 13, 2012 at 8:15 AM, RMSOPS  wrote:
> Hello,
>     With the following code get the results array
> res3<-table(df$v_source,df$v_destine)
>
>     1  2  3  4  5  6  7
>  1  0 10  0  0  0  0  0
>  2 11  0  0  0  0  0  0
>  3  0  0 18 15  0  0  0
>  4  0  0 15 11  0  0  0
>  5  0  0  0  0  1  0  0
>  6  0  0  0  0  0  1  0
>  7  0  0  0  0  0  0 18
>
>
> my idea was to create a new table of results from res3 but making the sum of
> the results where the position of the origin and destination were reversed.
>
>    for example
>         [1,2]= 10
>          [2,1]=11
>
>  New Table Final.
>  points         result
>  [1 and 2]    21
>
> thanks
>
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Sum-results-in-a-matrix-tp4468936p4468936.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Converting factor data into Date-time format

2012-03-13 Thread R. Michael Weylandt
as.POSIXct(as.character(FACTORHERE), format = "%d/%m/%Y %H:%M")

Michael
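
A small worked example of that call, on three made-up time stamps in the 'dd/mm/yyyy hh:mm' layout (the values and the choice of tz = "UTC" are assumptions for illustration only):

```r
# Made-up stamps, stored as a factor the way read.csv may have left them
stamps <- factor(c("01/03/2012 00:10", "10/02/2012 23:50", "10/02/2012 00:00"))

# Go through character first; converting the factor directly would use
# the internal integer codes, not the labels
when <- as.POSIXct(as.character(stamps),
                   format = "%d/%m/%Y %H:%M", tz = "UTC")

sorted <- sort(when)                 # now sorts chronologically
format(sorted, "%Y-%m-%d %H:%M")
```

Once the column is POSIXct, plot() places the points in time order on the x-axis instead of in factor-level order.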

On Tue, Mar 13, 2012 at 12:20 PM, Haojie Yan  wrote:
> Dear R-user,
>
> I have read a dataset from a .csv file into R. This dataset includes one
> column containing some data in 'date and time' format, e.g. 'dd/mm/yyyy
> hh:mm'.
>
> These data were automatically read and saved as 'factor' in R. When I was
> trying to produce some plots (such as time series) with the above 'date and
> time' on the x-axis, it caused some ordering problems, e.g. 1st of March 2012
> is in front of 10th of Feb. 2012 (if the data is from 10th Feb. 2012 to 1st
> of March 2012). I understand that I might have to convert them from
> 'factor' to 'date' first, so I tried using 'as.date'. But this method seems
> to work only for data in the format 'd/m/y', with no further option that
> allows me to add hours and minutes.
>
> I checked online for other methods such as 'as.POSIX' and 'strptime' but
> none of them seem to offer me a quick solution.
>
> Please note that the data I received is recorded every 10 minutes so they
> are saved in the form of 'dd/mm/yyyy hh:mm', i.e. I only have data
> measured up to 'minute' NOT to 'second'. Is there any direct way
> to solve this issue?
>
>
> Many thanks in advance!
> HJ
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] Converting factor data into Date-time format

2012-03-13 Thread Haojie Yan
Dear R-user,

I have read a dataset from a .csv file into R. This dataset includes one
column containing some data in 'date and time' format, e.g. 'dd/mm/yyyy
hh:mm'.

These data were automatically read and saved as 'factor' in R. When I was
trying to produce some plots (such as time series) with the above 'date and
time' on the x-axis, it caused some ordering problems, e.g. 1st of March 2012
is in front of 10th of Feb. 2012 (if the data is from 10th Feb. 2012 to 1st
of March 2012). I understand that I might have to convert them from
'factor' to 'date' first, so I tried using 'as.date'. But this method seems
to work only for data in the format 'd/m/y', with no further option that
allows me to add hours and minutes.

I checked online for other methods such as 'as.POSIX' and 'strptime' but
none of them seem to offer me a quick solution.

Please note that the data I received is recorded every 10 minutes so they
are saved in the form of 'dd/mm/yyyy hh:mm', i.e. I only have data
measured up to 'minute' NOT to 'second'. Is there any direct way
to solve this issue?


Many thanks in advance!
HJ

[[alternative HTML version deleted]]



[R] Error " subscript out of bounds"

2012-03-13 Thread Houhou Li
Hello, R-users,
 
I have a datafile with 37313 records and each record has 5 different 
measurements on the same variables. The format looks like this: treeID, VIG0, 
VIG1, VIG2, VIG3, VIG4
I was trying to convert each one-row record to five rows with a format like 
this (treeID, MEASUREMENT, VIGOR). My code looks like this: 
 
treeMeas<-matrix(data=0,nrow=(length(tree1$indivTree)*5), ncol=3)
colnames(treeMeas)<-c("indivTree", "meas", "vigor")
for(i in 1:length(tree1$indivTree))
{ 
  treeMeas[(i-1)*5+1:(i*5),1]<-tree1$indivTree[i]
  treeMeas[(i-1)*5+1:(i*5),2]<-c(0:4)
  treeMeas[(i-1)*5+1:(i*5),3]<-c(tree1$VIG0[i], tree1$VIG1[i], tree1$VIG2[i], 
tree1$VIG3[i], tree1$VIG4[i])
  }
 
When I run the code, I always get an error message like this: " Error in 
treeMeas[(i - 1) * 5 + 1:(i * 5), 1] <- tree1$indivTree[i] :   subscript out of 
bounds". I couldn't figure out why the subscript is out of bounds. Is this 
because the matrix is too big (186565 by 3)? Can anyone help? Thank you very much.
 
Yuzhen
 
[[alternative HTML version deleted]]



Re: [R] R-help-es have reached 500 members!

2012-03-13 Thread Igor Sosa Mayor
:)

congratulations!

On Tue, Mar 13, 2012 at 08:59:56AM -0600, Kjetil Halvorsen wrote:
> This posting is only to celebrate that R-help-es (R-help for Spanish
> Speakers) have reached 500
> members!
> 
> (and to thank Patricia for doing the bulk of admin work).
> 
> Kjetil
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
:: Igor Sosa Mayor   :: joseleopoldo1...@gmail.com ::
:: GnuPG: 0x69804897 :: http://www.gnupg.org/  ::



Re: [R] Standard errors GLM

2012-03-13 Thread Rubén Roa
You have a conceptual problem, as pointed out by previous helpers.
You don't have a standard error for the first level of your categorical 
variable because that level's effect is not estimated.
It is being used as a reference level against which the other levels of that 
categorical variable are being estimated (the default in R).
This is one way by which statisticians include categorical predictors into the 
regression framework, originally meant for relations between continuous 
quantitative variables.
You might want to read about regression, factors, and contrasts.
This paper about the issue is available online:
M.J. Davis, 2010. Contrast coding in multiple regression analysis: strengths, 
weaknesses and utility of popular coding structures. Journal of Data Science 
8:61-73.
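
A small simulated illustration of the reference-level idea (the data, the group names a/b/c and the object names are all invented for this sketch). With R's default treatment contrasts the first level is absorbed into the intercept, so it gets no row of its own in the coefficient table:

```r
set.seed(1)
g <- factor(rep(c("a", "b", "c"), each = 20))        # three-level factor
y <- rpois(60, lambda = c(2, 4, 8)[as.integer(g)])   # simulated counts

fit <- glm(y ~ g, family = poisson)
coef(summary(fit))          # rows: (Intercept), gb, gc; no row for "a"

# The intercept is the log of the mean count in the reference level "a"
exp(coef(fit)[["(Intercept)"]])
mean(y[g == "a"])

# Choosing another reference level changes which level is "missing"
fit_b <- glm(y ~ relevel(g, ref = "b"), family = poisson)
rownames(coef(summary(fit_b)))
```

The fitted values are the same under either coding; only the meaning of the individual coefficients changes.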
HTH
Ruben

-Original message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
behalf of D_Tomas
Sent: Tuesday, 13 March 2012 14:39
To: r-help@r-project.org
Subject: [R] Standard errors GLM

Dear userRs, 

when applied the summary function to a glm fit (e.g Poisson) the parameter 
table provides the categorical variables assuming that the first level estimate 
(in alphabetical order) is 0. 

What is the standard error for that variable then? 

Are the standard errors calculated assuming a normal distribution?

Many thanks, 

 

--
View this message in context: 
http://r.789695.n4.nabble.com/Standard-errors-GLM-tp4469086p4469086.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Matrix Another table

2012-03-13 Thread RMSOPS
Hello,
  
This solved my problem: table(testdata$source, testdata$destine) 

   Thanks 

--
View this message in context: 
http://r.789695.n4.nabble.com/Re-Matrix-Another-table-tp4469376p4469384.html
Sent from the R help mailing list archive at Nabble.com.



[R] how to find best model of time series?

2012-03-13 Thread sagarnikam123
I have data in one file like the one below (I have 200 such files, each
with the same type of data):
>t
-0.15264004
0.056076439
-0.07276116
-0.00917326
-0.02069089
-0.00416232
-0.07225855
-0.02654577
-0.06131410
-0.09380202
0.057414014
-0.05239976
0.014397612
0.016145161
-0.00670587
0.018696335
0.036943654
-0.02450233
0.031161705
0.006513503
-0.02892329
-0.00831519
-0.00877744
-0.00634399
-0.02612019
-0.02531800
-0.01435533
0.011148840
-0.01893775
0.029859128
0.029878797
-0.00125987
0.031404385
0.035127606
-0.00191775
0.059797202
-0.03268047
-0.06026960
-0.02216465
-0.08145612
-0.02772806
-0.03171683
-0.02842562
-0.11807898
-0.01457311
-0.12612482
0.409631265
-0.06375234

>plot.ts(t)

I am new to time series. I get a plot, but I don't know which pattern this is.
Please tell me how to determine the type of time series (seasonal, trend,
irregular, etc.) and, from 200 such files/plots, which model I should select
as the best model for forecasting future events.




--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-find-best-model-of-time-series-tp4469296p4469296.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Matrix Results

2012-03-13 Thread RMSOPS
Hello


Error: could not find function sqldf:

Hello, I'm using RStudio and tried to install the sqldf package through its
package installer.
But when I run the code it gives the following error.


install.packages("sqldf")
library("RSQLite") 
require(sqldf)
 x <- read.fwf(textConnection("4 - 4   56
+ 4 - 3   61
+ 3 - 3   300
+ 3 - 327
+ 3 - 3   33
+ 3 - 3   87
+ 3 - 4  49
+ 4 - 4  71
+ 4 - 3 121
+ 3 - 4 138
+ 4 - 3  15"), width = c(7,8) , header = FALSE, as.is = TRUE)
closeAllConnections()
 sqldf("
+ select V1
+ , count(*) as Freq
+ , min(V2) as Min
+ , max(V2) as Max
+ , median(V2) as Median
+ from x
+ group by V1
+ ")


ERROR: lazy loading failed for package ‘sqldf’
* removing ‘/home/ricardosousa/R/x86_64-pc-linux-gnu-library/2.13/sqldf’
Warning in install.packages :
  installation of package 'sqldf' had non-zero exit status

The downloaded packages are in
‘/tmp/RtmpS53jrJ/downloaded_packages’



--
View this message in context: 
http://r.789695.n4.nabble.com/Matrix-Results-tp4468642p4469239.html
Sent from the R help mailing list archive at Nabble.com.



[R] ROC Analysis

2012-03-13 Thread Camille Leclerc
Hi everybody,

I have a data set with a value and a status (positive or negative case) and
I want to do a ROC analysis. With the ROCR package, I got the ROC curve
(True Positive Fraction [tpf] against 1 - True Negative Fraction [1-tnf]).

http://r.789695.n4.nabble.com/file/n4469203/01.png 

But now I want a new graphic which shows the sum of the true positive fraction
and the true negative fraction for each value in my data set (tpf + tnf
against the values). 

http://r.789695.n4.nabble.com/file/n4469203/02.png 

Any ideas would be welcome!
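
One possible way to get such a curve in base R, sketched on six invented observations (value, status, cutoffs, tpf and tnf are hypothetical names, and "positive when value >= cutoff" is an assumed decision rule, not something stated in the post):

```r
# Invented data: a measured value and a 0/1 status per case
value  <- c(0.10, 0.40, 0.35, 0.80, 0.70, 0.20)
status <- c(0,    0,    1,    1,    1,    0)

cutoffs <- sort(unique(value))

# True positive fraction: positives called positive (value >= cutoff)
tpf <- sapply(cutoffs, function(cut) mean(value[status == 1] >= cut))
# True negative fraction: negatives called negative (value < cutoff)
tnf <- sapply(cutoffs, function(cut) mean(value[status == 0] < cut))

data.frame(cutoff = cutoffs, tpf = tpf, tnf = tnf, sum = tpf + tnf)

# The requested graphic:
# plot(cutoffs, tpf + tnf, type = "b", xlab = "value", ylab = "tpf + tnf")
```

The cutoff that maximises tpf + tnf is the same one that maximises Youden's index (tpf + tnf - 1).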

Thank you very much for all help,
Camille Leclerc

--
Camille Leclerc, Master student
Lab ESE, UMR CNRS 8079
Univ Paris-Sud
Bat 362
F-91405  Orsay Cedex FRANCE



--
View this message in context: 
http://r.789695.n4.nabble.com/ROC-Analysis-tp4469203p4469203.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Standard errors GLM

2012-03-13 Thread David Winsemius


On Mar 13, 2012, at 9:38 AM, D_Tomas wrote:


Dear userRs,

when applied the summary function to a glm fit (e.g Poisson) the  
parameter

table provides the categorical variables assuming that the first level
estimate (in alphabetical order) is 0.


Not really. It returns an estimate for the contrast of two Poisson  
parameters which have support on the real line. This is not really the  
correct list for fixing your misconceptions about GLMs. Your  
misconceptions are more of a conceptual character rather than an R  
coding problem. Maybe you should post follow-ups to:  
stats.stackexchange.com




What is the standard error for that variable then?


It (meaning I assume the coefficient estimate) is not a variable, at  
least not in the sense of being a data element.




Are the standard errors calculated assuming a normal distribution?


The standard errors are simply the square roots of the diagonals of  
the variance-covariance matrix (estimated from the deviations on the  
specified scale of the data from a best fit in a modeling framework).  
The assumption one makes when turning this into a confidence interval  
is that _parameters_ are approximately normally distributed using a  
glm method. You do not necessarily need to accept this method. The  
'confint' function in MASS will return CI's based on the profile  
likelihood.
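
A brief sketch of that computation, using the small Poisson example data from the ?glm help page (the names se and wald are only for illustration):

```r
# Data from the example section of ?glm
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit <- glm(counts ~ outcome + treatment, family = poisson())

# Standard errors: square roots of the diagonal of the covariance matrix
se <- sqrt(diag(vcov(fit)))
cbind(se, summary_se = coef(summary(fit))[, "Std. Error"])

# Wald intervals: estimate +/- normal quantile * standard error
wald <- cbind(lower = coef(fit) - qnorm(0.975) * se,
              upper = coef(fit) + qnorm(0.975) * se)
wald

# confint(fit) would give profile-likelihood intervals instead
```

The normal approximation enters only in the last step, when the standard errors are turned into intervals or z tests.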




Many thanks,



--
View this message in context: 
http://r.789695.n4.nabble.com/Standard-errors-GLM-tp4469086p4469086.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] Matrix Another table

2012-03-13 Thread Sarah Goslee
Hi,

On Tue, Mar 13, 2012 at 7:51 AM, RMSOPS  wrote:
> I have next table
> source destine
> 3 3
> 7 7
> 6 6
> 3 4
>  4 4
>  4 3
> 3 3
> 3 3
>  3 3
>  3 3
> 3 4
>  4 4
>  4 3
>  3 4
>  4 3

It is so much easier if you use dput to provide reproducible data, as
the posting guide asks. There's no way for us to tell, for instance,
that you expect those to be factors with levels 1:7.

Here's how:

> dput(testdata)
structure(list(source = structure(c(3L, 7L, 6L, 3L, 4L, 4L, 3L,
3L, 3L, 3L, 3L, 4L, 4L, 3L, 4L), .Label = c("1", "2", "3", "4",
"5", "6", "7"), class = "factor"), destine = structure(c(3L,
7L, 6L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 3L, 4L, 3L), .Label = c("1",
"2", "3", "4", "5", "6", "7"), class = "factor")), .Names = c("source",
"destine"), row.names = c(NA, -15L), class = "data.frame")

Also, df is a base function, and thus a bad thing to name your data frame.

>  I'm trying to create an array with the number of occurrences between the
> source and destination.
> id_ap<-levels(factor(df$v_source))
> num_AP<-length(levels(factor(df$v_source)))
> mat<-matrix(data=NA,nrow=num_AP,ncol=num_AP,
> byrow=TRUE,dimnames=list(id_ap,id_ap))
>    1  2  3  4  5  6  7
> 1 NA NA NA NA NA NA NA
> 2 NA NA NA NA NA NA NA
> 3 NA NA NA  4 NA NA NA
> 4 NA NA NA NA NA NA NA
> 5 NA NA NA NA NA NA NA
> 6 NA NA NA NA NA NA NA
> 7 NA NA NA NA NA NA NA

Why not just use table?

> table(testdata$source, testdata$destine)

    1 2 3 4 5 6 7
  1 0 0 0 0 0 0 0
  2 0 0 0 0 0 0 0
  3 0 0 5 3 0 0 0
  4 0 0 3 2 0 0 0
  5 0 0 0 0 0 0 0
  6 0 0 0 0 0 1 0
  7 0 0 0 0 0 0 1
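
For anyone starting from plain numbers rather than the dput() output, a sketch that sets the levels explicitly so that empty rows and columns are kept (src, dst and tab are hypothetical names; the values are the fifteen pairs from the data above):

```r
src <- c(3, 7, 6, 3, 4, 4, 3, 3, 3, 3, 3, 4, 4, 3, 4)
dst <- c(3, 7, 6, 4, 4, 3, 3, 3, 3, 3, 4, 4, 3, 4, 3)

# Fixing levels = 1:7 keeps rows/columns for values that never occur
tab <- table(source  = factor(src, levels = 1:7),
             destine = factor(dst, levels = 1:7))
tab

tab["3", "4"]   # number of 3 -> 4 transitions
```

Without the explicit levels, table() would only show the values that actually appear (3, 4, 6 and 7).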


> what better way to count the occurrences of the table, for example 3-4 = 4
> times

That I can't tell you, since there are only three occurrences of 3-4 in
your data. If you want that value to be equal to 4, you'll need to
provide more information about your problem.

Sarah


-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] Standard errors GLM

2012-03-13 Thread Joshua Wiley
Hi,

See inline.

On Tue, Mar 13, 2012 at 6:38 AM, D_Tomas  wrote:
> Dear userRs,
>
> when applied the summary function to a glm fit (e.g Poisson) the parameter
> table provides the categorical variables assuming that the first level
> estimate (in alphabetical order) is 0.
>
> What is the standard error for that variable then?

That is not a variable per se.  Say you have a 3 level factor, the
default coding is to create two 1/0 vectors, and the parameter
estimates and standard errors are for those 'dummy' vectors.
Information about the reference group is encoded, there is not an
explicit estimate for it with a standard error (notable exception when
there are no other variables, the intercept is essentially the
reference group, and the other estimates are deviations from that).
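
To see those 1/0 vectors directly, a tiny sketch with an invented three-level factor f (the default treatment contrasts are assumed):

```r
f <- factor(c("a", "b", "c", "a", "b", "c"))

# The design matrix a model formula like y ~ f would use
X <- model.matrix(~ f)
X

# Two dummy columns, fb and fc; rows for level "a" are 0 in both
colnames(X)
X[f == "a", c("fb", "fc")]
```

The coefficients reported for fb and fc are therefore differences from level "a" on the scale of the linear predictor.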

>
> Are the standard errors calculated assuming a normal distribution?

I am not completely sure I know what you mean by this.  It is assumed
that the z value (Estimate/Std. Error) is normally distributed, if
that answers your question?

Cheers,

Josh

>
> Many thanks,
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Standard-errors-GLM-tp4469086p4469086.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/



[R] R-help-es have reached 500 members!

2012-03-13 Thread Kjetil Halvorsen
This posting is only to celebrate that R-help-es (R-help for Spanish
Speakers) has reached 500 members!

(and to thank Patricia for doing the bulk of admin work).

Kjetil



[R] p-value of the pooled Z score

2012-03-13 Thread cheba meier
Hello,

I have to compute the pooled z-value and I would like to know which way is
more appropriate


b <- c( -0.205,1.040,0.087)
s <- c(0.449,0.167,0.241)
n <- c(310, 342, 348)
z <- b/s

Z <- sum(z)/sqrt(length(n))
P <- 2*(1-pnorm(abs(Z)))
P

w <- sqrt(n)
Zw <- sum(w * z)/sqrt(sum(w^2))
Pw <- 1 - pchisq(Zw * Zw, 1)
Pw
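
One observation that may help when comparing the two versions, offered as a sketch rather than a recommendation: the two p-value formulas are the same two-sided test (a squared standard normal is chi-square with 1 df), so the only real difference is the weighting of the z values. The helper names p_norm and p_chisq are made up for this check:

```r
b <- c(-0.205, 1.040, 0.087)
s <- c(0.449, 0.167, 0.241)
n <- c(310, 342, 348)
z <- b / s

Z  <- sum(z) / sqrt(length(z))        # unweighted pooled z
w  <- sqrt(n)
Zw <- sum(w * z) / sqrt(sum(w^2))     # pooled z weighted by sqrt(n)

p_norm  <- function(Z) 2 * (1 - pnorm(abs(Z)))   # two-sided normal p-value
p_chisq <- function(Z) 1 - pchisq(Z^2, df = 1)   # same test via chi-square

c(p_norm(Z),  p_chisq(Z))
c(p_norm(Zw), p_chisq(Zw))
```

For very small p-values, 2 * pnorm(abs(Z), lower.tail = FALSE) avoids the rounding loss of subtracting from 1.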


Many thanks in advance,
Cheba

[[alternative HTML version deleted]]



Re: [R] Definition of generic function for subclasses

2012-03-13 Thread Alexander
The definition of "simple" in "Keep your class hierarchy simple and relevant to
your actual problem" is quite difficult. For someone who has programmed the
classes etc., it is quite simple to understand the inheritance. But for
someone else, who has to maintain the code and the classes, it might
not be.

I know how to define classes, subclasses and generic functions (and I
also know the help function "?"). It is more about the concept. What is the
best way to write good code which is easy for others to understand, easy to
maintain, but doesn't repeat the same generic function for every single
subclass... 
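
For concreteness, one pattern that avoids repeating methods, sketched with S4 (an assumption, since the post does not say which class system is in use; the class names Base, Child, Special and the generic describe are invented): define the method once on the parent and let subclasses inherit it, overriding only where behaviour really differs.

```r
library(methods)

setClass("Base",    representation(name = "character"))
setClass("Child",   contains = "Base", representation(extra = "numeric"))
setClass("Special", contains = "Base")

setGeneric("describe", function(obj, ...) standardGeneric("describe"))

# Written once, on the parent class
setMethod("describe", "Base", function(obj, ...) paste("object", obj@name))

# Only the subclass that differs gets its own method, and it reuses
# the parent's work through callNextMethod()
setMethod("describe", "Special",
          function(obj, ...) paste(callNextMethod(), "(special)"))

d1 <- describe(new("Child",   name = "a", extra = 1))  # inherited from Base
d2 <- describe(new("Special", name = "b"))             # overridden
```

A maintainer then only has to read the parent's method plus the few explicit overrides.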

--
View this message in context: 
http://r.789695.n4.nabble.com/Definition-of-generic-function-for-subclasses-tp4468837p4469145.html
Sent from the R help mailing list archive at Nabble.com.



[R] Standard errors GLM

2012-03-13 Thread D_Tomas
Dear userRs, 

when the summary function is applied to a glm fit (e.g. Poisson), the parameter
table reports the categorical variables assuming that the first level
estimate (in alphabetical order) is 0. 

What is the standard error for that variable then? 

Are the standard errors calculated assuming a normal distribution?

Many thanks, 

 

--
View this message in context: 
http://r.789695.n4.nabble.com/Standard-errors-GLM-tp4469086p4469086.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] sort list

2012-03-13 Thread sybil kennelly
Thanks Josh. I'm quite new, just wondering re:factor levels?

In this example (shamelessly stolen from the internet):

schtyp

[1] 0 0 1 0 0 0 1 0 1 0 1 1 1 1 0 0 1 1 1 0

schtyp.f <- factor(schtyp, labels = c("private", "public"))

schtyp.f

[1] private private public private private private public private public
[10] private public public public public private private public public
[19] public private


Levels: private public



in my data i have a table:

       var1 var2 var3
cell1  x    x    x
cell2  x    x    x
cell3  x    x    x
cell4

.
.
.
.
cell100


and i have a subset of those cells that are interesting to me as a list of
data
list1 = ["cell1", "cell5", "cell19", "cell50", "cell70"]

is it possible to create (similar to above):

schtyp.f <- factor(schtyp, labels = c("special", "normal"))

so that when i plot this data, i can color the items in list1 as one
color (eg all the special cells are red), and the rest of the items as
a second color (eg all the other cells are black/blue)?

Syb
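
One way this could look in R, as a sketch with invented cell names (cells, grp and cols are hypothetical; the real row names of the table would take the place of cells):

```r
cells <- paste0("cell", 1:100)                        # stand-in for the row names
list1 <- c("cell1", "cell5", "cell19", "cell50", "cell70")

# Two-level factor: is each cell in the interesting subset or not?
grp <- factor(ifelse(cells %in% list1, "special", "normal"),
              levels = c("special", "normal"))
table(grp)

# Map the levels to plotting colours, one colour per cell
cols <- unname(c(special = "red", normal = "black")[as.character(grp)])

# e.g. plot(var1, var2, col = cols) with the real columns
```

The %in% test does the membership check, so there is no need to build the factor by hand.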



On Tue, Mar 13, 2012 at 11:48 AM, Joshua Wiley wrote:

> Hi Sybil,
>
> You cannot turn a list into a factor.  You could do:
>
> cell_data <-c('cell1','cell2')
> factor_list <- factor(cell_data)
>
> or if you already have a list, unlist() or as.vector() may convert it
> into a vector that you can then convert to a factor.
>
> Cheers,
>
> Josh
>
> On Tue, Mar 13, 2012 at 4:29 AM, sybil kennelly 
> wrote:
> > Hello can anyone help please?
> >
> > i read two words "cell1", "cell2" into a list. I want to turn this list
> > into a factor.
> >
> >> cell_data <-list(c('cell1','cell2'))
> >
> >
> >> cell_data
> > [[1]]
> > [1] "cell1" "cell2"
> >
> >
> >
> >> factor_list <- factor(cell_data)
> > Error in sort.list(y) : 'x' must be atomic for 'sort.list'
> > Have you called 'sort' on a list?
> >
> >
> >
> >> sort.list(cell_data)
> > Error in sort.list(cell_data) : 'x' must be atomic for 'sort.list'
> > Have you called 'sort' on a list?
> >
> >
> > Can anyone explain?
> >
> > Syb
> >
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
>



[R] Sum results in a matrix

2012-03-13 Thread RMSOPS
Hello,
 With the following code I get this table of results:
res3<-table(df$v_source,df$v_destine)

     1  2  3  4  5  6  7
  1  0 10  0  0  0  0  0
  2 11  0  0  0  0  0  0
  3  0  0 18 15  0  0  0
  4  0  0 15 11  0  0  0
  5  0  0  0  0  1  0  0
  6  0  0  0  0  0  1  0
  7  0  0  0  0  0  0 18


My idea is to create a new table of results from res3 in which the counts
for each pair of positions are summed regardless of direction, i.e. where
origin and destination are reversed.

For example:
 [1,2] = 10
 [2,1] = 11

 New final table:
 points     result
 [1 and 2]  21
 
thanks
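
One way the summing described above could be sketched, assuming a small
hand-typed stand-in for res3 (its upper-left 4 x 4 corner) and the made-up
names sym, idx and pair_sums. Adding the table to its own transpose puts
[i,j] + [j,i] in each cell; the upper triangle then lists each unordered pair
once. The diagonal (same origin and destination) is left out here.

```r
# Stand-in for the upper-left 4 x 4 corner of res3 from the post
res3 <- matrix(c( 0, 10,  0,  0,
                 11,  0,  0,  0,
                  0,  0, 18, 15,
                  0,  0, 15, 11),
               nrow = 4, byrow = TRUE,
               dimnames = list(as.character(1:4), as.character(1:4)))

# sym[i, j] holds res3[i, j] + res3[j, i]
sym <- res3 + t(res3)

# Row/column indices of the cells above the diagonal, one per unordered pair
idx <- which(upper.tri(sym), arr.ind = TRUE)

pair_sums <- data.frame(
  points = paste(rownames(sym)[idx[, "row"]], "and", colnames(sym)[idx[, "col"]]),
  result = sym[idx]
)

# Keep only the pairs that actually occur
pair_sums <- pair_sums[pair_sums$result > 0, ]
pair_sums
```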
   



--
View this message in context: 
http://r.789695.n4.nabble.com/Sum-results-in-a-matrix-tp4468936p4468936.html
Sent from the R help mailing list archive at Nabble.com.



[R] To generate a pmml for an "hclust" object, showing the error "Error in .Internal(inherits(x, what, which)) : 'x' is missing"

2012-03-13 Thread Yashwanth M.R
#
#

## Iris data ##

data(iris)

##
##

# Taken only 10-rows selecting all the variables ##

IRIS_DF <- data.frame(iris$Sepal.Length, iris$Sepal.Width,
iris$Petal.Length, iris$Petal.Width)
IRIS_DF_SR <- IRIS_DF[1:10,]

##
##

## DMC = Distance Matrix Computation ##

IRIS.DMC <- Dist(IRIS_DF_SR, method = "euclidean", nbproc = 2, diag = FALSE,
upper = FALSE)

##
##

# "hclust" object. HCMA = HCLUST.METHOD.AVERAGE ##

IRIS_HCMA <- hclust(IRIS.DMC, method = "average", members = NULL)

##
##

# To generate a pmml for an "hclust" object  ##

pmml(IRIS_HCMA)

pmml(IRIS_HCMA, model.name="HClust_Model", app.name="Rattle/PMML",
 description="Hierarchical cluster model", copyright=NULL,
transforms=NULL, dataset=NULL)

*Error in .Internal(inherits(x, what, which)) : 'x' is missing*

##
##

For both of the functions, it is generating the above error. Please help me
finding the solution.



--
View this message in context: 
http://r.789695.n4.nabble.com/To-generate-a-pmml-for-an-hclust-object-showing-the-error-Error-in-Internal-inherits-x-what-which-x--tp4468931p4468931.html
Sent from the R help mailing list archive at Nabble.com.



[R] Using caegorical variables in package randomForest.

2012-03-13 Thread abhishek
Hello,

I am sorry if there are already posts that answer this question; I looked
for them before posting but did not find anything relevant.

I am using the randomForest package to build a two-class classifier. My data
contain both categorical and numerical variables, and the categorical
variables have between 2 and 10 categories each. I am not sure how to
represent the categorical data.
For example, I am using 0 and 1 for variables that have only two categories,
but I suspect the program is treating these values as numerical. Do you have
any idea how I can use the categorical variables for building a two-class
classifier? I am using a factor consisting of 0 and 1 for the
classification target.

Thank you for your ideas.
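
A minimal sketch of the usual remedy, assuming a made-up data frame d with
hypothetical columns flag, colour, size and target: columns stored as numbers
are treated as numeric, so categorical columns are converted to factors
before fitting. The randomForest call is left commented out because it needs
the package installed; it is shown only to indicate where the converted data
would go.

```r
# Made-up example data: flag and colour are categories coded as numbers
d <- data.frame(
  flag   = c(0, 1, 1, 0, 1, 0),
  colour = c(1, 3, 2, 3, 1, 2),
  size   = c(1.2, 3.4, 2.2, 0.7, 5.1, 4.4),
  target = c(0, 1, 1, 0, 1, 0)
)

# Convert the categorical columns (and the class label) to factors
cat_cols <- c("flag", "colour", "target")
d[cat_cols] <- lapply(d[cat_cols], factor)

# Check how each column is now stored
str(d)

# With the package installed, the factors are then handled as categories:
# fit <- randomForest::randomForest(target ~ ., data = d)
```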

-
abhishek
--
View this message in context: 
http://r.789695.n4.nabble.com/Using-caegorical-variables-in-package-randomForest-tp4468923p4468923.html
Sent from the R help mailing list archive at Nabble.com.



[R] Matrix Another table

2012-03-13 Thread RMSOPS
I have the following table:

source destine
3 3
7 7
6 6
3 4
4 4
4 3
3 3
3 3
3 3
3 3
3 4
4 4
4 3
3 4
4 3

I'm trying to create an array with the number of occurrences between the
source and destination.

id_ap<-levels(factor(df$v_source))
num_AP<-length(levels(factor(df$v_source)))
mat<-matrix(data=NA,nrow=num_AP,ncol=num_AP,
byrow=TRUE,dimnames=list(id_ap,id_ap))

   1  2  3  4  5  6  7
1 NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA
3 NA NA NA  4 NA NA NA
4 NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA
7 NA NA NA NA NA NA NA

What is the best way to count the occurrences in the table? For example,
3-4 = 4 times.

 Thanks 



--
View this message in context: 
http://r.789695.n4.nabble.com/Matrix-Another-table-tp4468875p4468875.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Matrix Results

2012-03-13 Thread RMSOPS
Hello

This is the dataset that was sent for help. It has two more columns, Source
and Dest, which are the two parts of the position column POS.

POS     DIF   Source  Dest
4 - 4    56   4       4
4 - 3    61   4       3
3 - 3   300   3       3
3 - 3    27   3       3
3 - 3    33   3       3
3 - 3    87   3       3
3 - 4    49   3       3
4 - 4    71   4       3
4 - 3   121   4       3
3 - 4   138   3       3
4 - 3    15   4       3


--
View this message in context: 
http://r.789695.n4.nabble.com/Matrix-Results-tp4468642p4468900.html
Sent from the R help mailing list archive at Nabble.com.



[R] Matrix Counts

2012-03-13 Thread RMSOPS
I have the following table:

source  destine
3   3
7   7
6   6
3   4
4   4
4   3
3   3
3   3
3   3
3   3
3   4
4   4
4   3
3   4
4   3


I'm trying to create an array with the number of occurrences between the
source and destination.
id_ap<-levels(factor(df$v_source))
num_AP<-length(levels(factor(df$v_source)))
mat<-matrix(data=NA,nrow=num_AP,ncol=num_AP,
byrow=TRUE,dimnames=list(id_ap,id_ap))

   1  2  3  4  5  6  7
1 NA NA NA NA NA NA NA
2 NA NA NA NA NA NA NA
3 NA NA NA  4 NA NA NA
4 NA NA NA NA NA NA NA
5 NA NA NA NA NA NA NA
6 NA NA NA NA NA NA NA
7 NA NA NA NA NA NA NA

What is the best way to count the occurrences in the table? For example,
3-4 = 4 times.

Thanks
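
A possible sketch of the counting, assuming the fifteen rows posted above are
typed into a data frame df with columns v_source and v_destine, and assuming
the positions run from 1 to 7 as in the printed matrix. table() does the
counting directly; wrapping the columns in factor() with fixed levels keeps
the rows and columns for positions that never occur.

```r
# The fifteen source/destine rows from the post
df <- data.frame(
  v_source  = c(3, 7, 6, 3, 4, 4, 3, 3, 3, 3, 3, 4, 4, 3, 4),
  v_destine = c(3, 7, 6, 4, 4, 3, 3, 3, 3, 3, 4, 4, 3, 4, 3)
)

# Assumed full set of positions, so unused ones still appear as zero rows/columns
lev <- 1:7

mat <- table(source  = factor(df$v_source,  levels = lev),
             destine = factor(df$v_destine, levels = lev))
mat

# Count for one source/destination pair, indexed by level name
mat["3", "4"]
```

The result is a 7 x 7 table of counts with zeros instead of NA; for the rows
as typed here, the 3-4 cell counts the three rows whose source is 3 and whose
destine is 4.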
  


--
View this message in context: 
http://r.789695.n4.nabble.com/Matrix-Counts-tp4468867p4468867.html
Sent from the R help mailing list archive at Nabble.com.


[R] Visualising multiple response contingency tables

2012-03-13 Thread Marcos Pelenur
Dear R Help Community,

I have a question and an answer (based on reading this forum and online
research), but I thought I should share both since there is probably a much
better way to go about my solution. My question is specifically about how
to best visualise multiple response contingency tables. What I mean by
'multiple response' is that the total number of responses per row of a
contingency table will be greater than the total number of respondents. An
example of a multiple response table shown below (apologies if my
formatting is incorrect or silly, I'm a hardcore R newbie):

> f.tbl = structure(c(10, 15, 25, 45, 30, 50), .Dim = 2:3, .Dimnames = structure(list(
+     Sex = c("F", "M"), Responses = c("A", "B", "total subjects")),
+     .Names = c("Sex", "Responses")), class = "table")
> f.tbl
   Responses
Sex  A  B total subjects
  F 10 25             30
  M 15 45             50


The answer I have is to adjust my data and then use the mosaic() function
in package:vcd; however, I'm not sure that's the best way forward and I
don't have a very efficient way of getting there. I will present my
solution so you guys can take a look.

The fundamental problem is that because of the multiple response data, you
can't simply apply a normal Chi-square test to the contingency table.
There's a raft of approaches, but I've decided to use a simple technique
introduced by (A. Agresti, I. Liu, Modeling a categorical variable allowing
arbitrarily many category choices, Biometrics 55 (1999) 936-43.) and
refined by Thomas and Decady and Bilder and Loughin. In summary, the test
statistic (a modified Chi-square statistic) is calculated by summing up the
individual chi-square statistics for each of the c marginal r × 2 tables
relating the single response variable to the multiple response variable
(with df = c(r - 1)). Note that instead of using the row totals (total
number of responses), the test statistic is calculated with the total number
of subjects per row.

(phew, I hope that made sense :) ) Unfortunately, my google-research has
not revealed an easy way to transform my one data table into c separate
r × 2 tables for analysis. So I end up having to create the two different
tables myself, shown below (note that the Not-A/B columns are calculated as
the difference between the main data column (A/B) and the total number of
subjects listed above).

> g.mtrx=matrix(c(10,15,20,35),nrow=2)
> g.tbl=as.table(g.mtrx)
> dimnames(g.tbl)=list(Sex=c("F","M"),Responses=c("A","Not-A"))
> g.tbl
   Responses
Sex  A Not-A
  F 10    20
  M 15    35

> h.mtrx=matrix(c(25,45,5,5),nrow=2)
> h.tbl=as.table(h.mtrx)
> dimnames(h.tbl)=list(Sex=c("F","M"),Responses=c("B","Not-B"))
> h.tbl
   Responses
Sex  B Not-B
  F 25     5
  M 45     5


If I then perform the normal Chi-square test on each of the two tables
(chisq.test()) and then sum up the results, I get the answer I want.
Clearly this is cumbersome, which is why I do it in Excel at the moment (I
know shame on me). However, I really want to take advantage of the mosaic
function in vcd. So what I have to do at the moment is create the tables
above and use abind() (package:abind) to bring my two matrices together to
form a multidimensional matrix. Example:

> gh.abind = abind(g.mtrx,h.mtrx,along=3)
> dimnames(gh.abind)=list(Sex=c("F","M"),Responses=c("Yes","No"),Factors=c("A","B"))
> gh.abind
, , Factors = A

   Responses
Sex Yes No
  F  10 20
  M  15 35

, , Factors = B

   Responses
Sex Yes No
  F  25  5
  M  45  5

Now I can use the simple mosaic function to plot the combined matrix

> mosaic(gh.abind)

So that's it. I don't use any pearson-r shading in mosaic since I
don't think it would be appropriate to try and model my weird multiple
response tables (at the moment), but what I will do is look at the
odds-ratio table and then manually colour the mosaic cells with high
odds-ratios (greater than 2).

I am literally having to type all this by hand into R, and as you can
imagine, it gets cumbersome with large multi column tables (which I
have). Does anybody have any thoughts on my approach of using mosaic
for this sort of data? And if so, any insight on how I can be a bit
slicker with my R code?

All help is appreciated and I hope that this question wasn't too long
to read through.
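
One possible way to cut down the typing, sketched under the assumption that
the data are laid out like f.tbl above (one column per response item plus a
final "total subjects" column). The names arr, stat and dof are made up, and
correct = FALSE (no continuity correction) is an assumption about how the
per-item statistics should be computed, not something stated above.

```r
# The multiple-response table from above, rebuilt as a plain table
f.tbl <- as.table(matrix(c(10, 15, 25, 45, 30, 50), nrow = 2,
  dimnames = list(Sex = c("F", "M"),
                  Responses = c("A", "B", "total subjects"))))

totals <- f.tbl[, "total subjects"]
items  <- setdiff(colnames(f.tbl), "total subjects")

# Build the r x 2 x c array of Yes/No counts, one slice per response item
arr <- sapply(items,
              function(it) cbind(Yes = f.tbl[, it], No = totals - f.tbl[, it]),
              simplify = "array")
dimnames(arr) <- list(Sex = rownames(f.tbl),
                      Responses = c("Yes", "No"),
                      Factors = items)

# Sum the per-item chi-square statistics; df = c(r - 1) as described above
stat <- sum(apply(arr, 3, function(m) chisq.test(m, correct = FALSE)$statistic))
dof  <- length(items) * (nrow(f.tbl) - 1)
pval <- pchisq(stat, dof, lower.tail = FALSE)

# The same array can then be passed to the plot, e.g. vcd::mosaic(arr)
```

chisq.test() may warn about small expected counts for the B slice; the
statistic is still returned. Because the array is built from the column names,
the same lines should work unchanged for tables with more response items.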

All the best,
Marcos




-- 
PhD Engineering Candidate
University of Cambridge
Department of Engineering
Centre for Sustainable Development
mp...@cam.ac.uk 



Re: [R] Coding C++ in R. What is faster : Using bosst external libraries or R.h header file?

2012-03-13 Thread Ian Schiller
Thank you Michael,

Your advice is truly appreciated!

Ian

-Original Message-
From: R. Michael Weylandt [mailto:michael.weyla...@gmail.com] 
Sent: 13 mars 2012 09:44
To: Ian Schiller
Cc: r-help@r-project.org
Subject: Re: [R] Coding C++ in R. What is faster : Using bosst external 
libraries or R.h header file?

There will be ever-so-slight performance differences due to implementation 
differences (I believe R's functions are just a hair slower because they are 
more exact -- though that may be a comparison with GSL I'm thinking of), but my 
advice would be to use the RNGs that come with R. They are the best in the 
business and you'll have the benefit of them being auto-upgraded with each new 
R release (as well as getting better support from the R lists).

What you really should look into is the Rcpp project. It provides nice wrappers 
to R's RNG functions and makes the whole porting process worlds easier: e.g., 
http://dirk.eddelbuettel.com/blog/2011/07/14/

Hope this helps,

Michael

On Tue, Mar 13, 2012 at 9:33 AM, Ian Schiller  
wrote:
> Hi everyone,
>
> I have built an R package and for the sake of speed I have decided to rewrite 
> some part of the code in C++.  In my original R code I use the pnorm, qnorm, 
> rnorm, pgamma, dgamma, rgamma, rbeta and runif function.  First I was 
> thinking in going with the boost libraries, but I noticed the functions 
> described above are available within the R.h header file (or is it Rmath.h?).
>
> So my question is the following.  Would my code be faster if I install the 
> appropriate boost libraries (distributions) or if I stick with R.h's 
> functions?
>
> Thanks!
>
>
> **
> 
> IAN SCHILLER, M.Sc.
>
> Statistical research assistant,
> Division of Clinical Epidemiology, McGill University Health Center
>
> Assistant de recherche en statistiques, Département d'Épidémiologie 
> Clinique, Centre Universitaire de Santé Mcgill
>
> Tel: 514 934 1934 ext. 36925
> Email: 
> ian.schil...@clinepi.mcgill.ca
>
>



Re: [R] Coding C++ in R. What is faster : Using bosst external libraries or R.h header file?

2012-03-13 Thread R. Michael Weylandt
There will be ever-so-slight performance differences due to
implementation differences (I believe R's functions are just a hair
slower because they are more exact -- though that may be a comparison
with GSL I'm thinking of), but my advice would be to use the RNGs that
come with R. They are the best in the business and you'll have the
benefit of them being auto-upgraded with each new R release (as well
as getting better support from the R lists).

What you really should look into is the Rcpp project. It provides nice
wrappers to R's RNG functions and makes the whole porting process
worlds easier: e.g., http://dirk.eddelbuettel.com/blog/2011/07/14/

Hope this helps,

Michael

On Tue, Mar 13, 2012 at 9:33 AM, Ian Schiller
 wrote:
> Hi everyone,
>
> I have built an R package and for the sake of speed I have decided to rewrite 
> some part of the code in C++.  In my original R code I use the pnorm, qnorm, 
> rnorm, pgamma, dgamma, rgamma, rbeta and runif function.  First I was 
> thinking in going with the boost libraries, but I noticed the functions 
> described above are available within the R.h header file (or is it Rmath.h?).
>
> So my question is the following.  Would my code be faster if I install the 
> appropriate boost libraries (distributions) or if I stick with R.h's 
> functions?
>
> Thanks!
>
>
> **
> IAN SCHILLER, M.Sc.
>
> Statistical research assistant,
> Division of Clinical Epidemiology, McGill University Health Center
>
> Assistant de recherche en statistiques,
> Département d'Épidémiologie Clinique, Centre Universitaire de Santé Mcgill
>
> Tel: 514 934 1934 ext. 36925
> Email: ian.schil...@clinepi.mcgill.ca
>
>



Re: [R] Matrix Results

2012-03-13 Thread jim holtman
Try this:

> require(sqldf)
> x <- read.fwf(textConnection("4 - 4   56
+ 4 - 3   61
+ 3 - 3  300
+ 3 - 3   27
+ 3 - 3   33
+ 3 - 3   87
+ 3 - 4   49
+ 4 - 4   71
+ 4 - 3  121
+ 3 - 4  138
+ 4 - 3   15"), width = c(7,8) , header = FALSE, as.is = TRUE)
> closeAllConnections()
> sqldf("
+ select V1
+ , count(*) as Freq
+ , min(V2) as Min
+ , max(V2) as Max
+ , median(V2) as Median
+ from x
+ group by V1
+ ")
   V1 Freq Min Max Median
1 3 - 3  4  27 300   60.0
2 3 - 4  2  49 138   93.5
3 4 - 3  3  15 121   61.0
4 4 - 4  2  56  71   63.5
>


On Tue, Mar 13, 2012 at 5:45 AM, RMSOPS  wrote:
> Hello
>
>   I am developing a small program that to calculate the maximum, minimum
> and average.
>
>    The data.frame v is
>
> POS       DIF
> 4 - 4       56
> 4 - 3       61
> 3 - 3       300
> 3 - 3        27
> 3 - 3       33
> 3 - 3       87
> 3 - 4      49
> 4 - 4      71
> 4 - 3     121
> 3 - 4     138
> 4 - 3      15
>
>
> When execute  res<-table(df$v) gives this
>
> Var1    Freq
> 4 - 4      2
> 4 - 3      3
> 3 - 3     3
> 3 - 4     2
> 4 - 4    1
>
> If possible, my idea is that the result often present in addition to the
> minimum, maximum and average, all in a single array, for example.
>   what is the quickest way to do this
>
> Var1    Freq  Min    Max Med
> 4 - 4      2      56         71  
> 4 - 3      3      15       121
> 3 - 3     3              .
> 3 - 4     2     ...            ...
> 4 - 4    1    ...             
>
> Thanks
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Matrix-Results-tp4468642p4468642.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] Definition of generic function for subclasses

2012-03-13 Thread Martin Morgan

On 03/13/2012 04:34 AM, Alexander wrote:

> Hi,
> I am working on a project, which contains S4 classes and subclasses. Lets
> assume the following organisation:
> A: S4 Class
> B,C: inherit from A
> D,E,F,G: inherit from B
> H,I: inherit from C
>
> I want to define now a generic function, which returns me the name of the
> class. I can now write the function with "A" in the signature. Is there any

?class for this function

> reason to write the function for B,C, D,E,... with "B","C","D","E", .. in
> the signature ? I can imagine that it can become a mess if for example I

Implement the method at the lowest (highest?) possible level in the
hierarchy -- so on "A" only for the example of class.

> write a function for "A" and a special function for "I" (lets assume, I is
> the only exception).


If 'I' is somehow special then write a method for it alone

  setClass("A"); setClass("C", "A"); setClass("I", "C")

  setGeneric("cls", function(x, ...) standardGeneric("cls"))
  setMethod(cls, "A", function(x, ...) as.vector(class(x)))
  setMethod(cls, "I", function(x, ...) tolower(as.vector(class(x))))

and then

> cls(new("C"))
[1] "C"
> cls(new("I"))
[1] "i"
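
A further sketch of the same idea, one level deeper. The class names A, B, D
mirror part of the hierarchy in the question, while the generic describe and
the use of callNextMethod() are illustrative assumptions, not part of the
reply above. The method written for "A" is inherited by every subclass, and
the one exceptional class only adds what is different and delegates the rest.

```r
library(methods)

# Part of the hierarchy from the question: D inherits from B, B from A
setClass("A", representation("VIRTUAL"))
setClass("B", contains = "A")
setClass("D", contains = "B")

# Hypothetical generic, defined once
setGeneric("describe", function(x, ...) standardGeneric("describe"))

# General behaviour lives on the top-level class and is inherited
setMethod("describe", "A", function(x, ...) paste("object of class", class(x)))

# Only the exceptional class gets its own method; it reuses the inherited one
setMethod("describe", "D", function(x, ...) paste("special:", callNextMethod()))

describe(new("B"))   # uses the method inherited from "A"
describe(new("D"))   # uses the method for "D", which calls the one for "A"
```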


> Have you any best practice to share?


Keep your class hierarchy simple and relevant to your actual problem.

Martin



> Thanks
> Alexander
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Definition-of-generic-function-for-subclasses-tp4468837p4468837.html
> Sent from the R help mailing list archive at Nabble.com.




--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793



[R] Coding C++ in R. What is faster : Using bosst external libraries or R.h header file?

2012-03-13 Thread Ian Schiller
Hi everyone,

I have built an R package and for the sake of speed I have decided to rewrite 
some parts of the code in C++.  In my original R code I use the pnorm, qnorm, 
rnorm, pgamma, dgamma, rgamma, rbeta and runif functions.  At first I was thinking 
of going with the boost libraries, but I noticed the functions described above 
are available within the R.h header file (or is it Rmath.h?).

So my question is the following.  Would my code be faster if I installed the 
appropriate boost libraries (distributions) or if I stuck with R.h's functions?

Thanks!


**
IAN SCHILLER, M.Sc.

Statistical research assistant,
Division of Clinical Epidemiology, McGill University Health Center

Assistant de recherche en statistiques,
Département d'Épidémiologie Clinique, Centre Universitaire de Santé Mcgill

Tel: 514 934 1934 ext. 36925
Email: ian.schil...@clinepi.mcgill.ca



Re: [R] File data to data.frame

2012-03-13 Thread jim holtman
try this:

> x <- readLines(textConnection("
+ "))
>
> closeAllConnections()
>
> # process & parse the data
> for (i in x){
+ if (grepl("^username", i)) username <- sub(".*=(.*)", '\\1', i)
+ if (grepl("^password", i)){
+ password <- sub(".*=(.*)", "\\1", i)
+ cat("found: user =", username, '  password =', password, '\n')
+ }
+ }
found: user = user   password = pass
found: user = user1   password = pass1
>
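
Since the original question asked for a data.frame, here is a minimal sketch
of one way to get there. The vector x below is a hypothetical stand-in for
the lines of the file (its real contents are not shown above), and the code
assumes every record has exactly one username= line and one password= line,
in the same order.

```r
# Hypothetical lines, standing in for readLines("yourfile") on the real data
x <- c("GET", "username=user",  "password=pass",
       "GET", "username=user1", "password=pass1")

# Pick out the matching lines and strip the "key=" prefix
users     <- sub("^username=", "", grep("^username=", x, value = TRUE))
passwords <- sub("^password=", "", grep("^password=", x, value = TRUE))

# One row per record
creds <- data.frame(username = users, password = passwords,
                    stringsAsFactors = FALSE)
creds
```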


On Tue, Mar 13, 2012 at 6:09 AM, b4d  wrote:
> Hi, I am playing around with some data and I would like to get data that is
> stored in a file like this:
>
>  GET
> username=user
> password=pass
>>
>  GET
> username=user1
> password=pass1
>>
> ...
>
> to the data.frame structure, how can I do that directly in R, currently I am
> doing parse with bash, but I would like to centralize the procedure and
> learn something new.
>
> Thanks
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/File-data-to-data-frame-tp4468671p4468671.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] sort list

2012-03-13 Thread Joshua Wiley
On Tue, Mar 13, 2012 at 5:15 AM, sybil kennelly  wrote:
> Thanks Josh. I'm quite new, just wondering re:factor levels?
>
> In this example (shamelessly stolen from the internet):
>
> schtyp
>
> [1] 0 0 1 0 0 0 1 0 1 0 1 1 1 1 0 0 1 1 1 0
>
> schtyp.f <- factor(schtyp, labels = c("private", "public"))
>
> schtyp.f
>
> [1] private private public private private private public private public
> [10] private public public public public private private public public
>
> [19] public private
>
>
> Levels: private public
>
>
>
> in my data i have a table:
>
> var1var2 var3
> cell1x   x x
> cell2x   x x
> cell3x   x x
>
> cell4
>
> .
> .
> .
> .
> cell100
>
>
> and i have a subset of those cells that are interesting to me as a list of
> data
> list1 = ["cell1, "cell5",cell19", "cell50", "cell70"]
>
> is it possible to create (similar to above):
>
> schtyp.f <- factor(schtyp, labels = c("special", "normal"))

Sure.  Again, probably better to have cells of interest in a vector,
not a list a la:

list1 <- c("cell1", "cell5", "cell19", "cell50", "cell70")

your_data$mycells <- factor(your_data$cells %in% list1,
  levels = c(TRUE, FALSE), labels = c("Special", "NotSpecial"))

basically compares the cells to those in your list and returns
TRUE/FALSE, which is then converted to a factor, labeled, and stored.
If you are just starting, some background reading will help.  Here are
some suggestions:

1) Go here: http://www.burns-stat.com/pages/tutorials.html and read
the tutorials for R -- Beginning (this should not take more than 1
day).
2) Sit down and read:
http://cran.r-project.org/doc/manuals/R-intro.pdf through Appendix A
(for now you can probably skip the rest of the appendices).  That will
probably take another entire day or so.
3) Head back to Patrick Burn's website:
http://www.burns-stat.com/pages/tutorials.html and read the
intermediate guide, The R Inferno (1-3 days depending if you can read
for 8 hours straight or not)
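
To connect the factor idea above to the colouring asked about in the
question, here is a small sketch. The data frame your_data with columns
cells, var1 and var2 is made up (the real table is not shown), as are the
names cols and point_col; the red/black choice follows the question.

```r
# Made-up stand-in for the real table: 100 cells with two measured variables
your_data <- data.frame(cells = paste0("cell", 1:100),
                        var1  = seq_len(100),
                        var2  = seq_len(100) %% 7)

list1 <- c("cell1", "cell5", "cell19", "cell50", "cell70")

# TRUE/FALSE membership turned into a labelled two-level factor
your_data$mycells <- factor(your_data$cells %in% list1,
                            levels = c(TRUE, FALSE),
                            labels = c("Special", "NotSpecial"))

# Look up one colour per row by factor label
cols <- c(Special = "red", NotSpecial = "black")
point_col <- cols[as.character(your_data$mycells)]

plot(your_data$var1, your_data$var2, col = point_col, pch = 19)
```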

Cheers,

Josh

>
> so that when i plot this data, i can color the items in list1 as one color
> (eg all the special cells are red), and the rest of the items as a second
> color (eg all the other cells are black/blue)?
>
>
> Syb
>
>
>
> On Tue, Mar 13, 2012 at 11:48 AM, Joshua Wiley 
> wrote:
>>
>> Hi Sybil,
>>
>> You cannot turn a list into a factor.  You could do:
>>
>> cell_data <-c('cell1','cell2')
>> factor_list <- factor(cell_data)
>>
>> or if you already have a list, unlist() or as.vector() may convert it
>> into a vector that you can then convert to a factor.
>>
>> Cheers,
>>
>> Josh
>>
>> On Tue, Mar 13, 2012 at 4:29 AM, sybil kennelly 
>> wrote:
>> > Hello can anyone help please?
>> >
>> > i read two words "cell1", "cell2" into a list. I want to turn this list
>> > into a factor.
>> >
>> >> cell_data <-list(c('cell1','cell2'))
>> >
>> >
>> >> cell_data
>> > [[1]]
>> > [1] "cell1" "cell2"
>> >
>> >
>> >
>> >> factor_list <- factor(cell_data)
>> > Error in sort.list(y) : 'x' must be atomic for 'sort.list'
>> > Have you called 'sort' on a list?
>> >
>> >
>> >
>> >> sort.list(cell_data)
>> > Error in sort.list(cell_data) : 'x' must be atomic for 'sort.list'
>> > Have you called 'sort' on a list?
>> >
>> >
>> > Can anyone explain?
>> >
>> > Syb
>> >
>>
>>
>>
>> --
>> Joshua Wiley
>> Ph.D. Student, Health Psychology
>> Programmer Analyst II, Statistical Consulting Group
>> University of California, Los Angeles
>> https://joshuawiley.com/
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/



Re: [R] replacing values in Array

2012-03-13 Thread Gerrit Eichner

Hello, "uday",

e.g.,

(data - 1) %/% 3 + 1

would do the job in your very specific situation, but take a look at

?findInterval

for possibly more interesting (because more general) solutions.
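
Both suggestions can be sketched on the data from the question. The break
points c(1, 4, 7, 10) and the names grp_arith and grp_interval are chosen
here for illustration. The original replace() calls changed nothing because
each one returns a modified copy that was never assigned back to data.

```r
data <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

# Integer division: values 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4
grp_arith <- (data - 1) %/% 3 + 1

# More general: give the lower edge of each group explicitly
grp_interval <- findInterval(data, c(1, 4, 7, 10))

grp_arith
grp_interval
```

findInterval() also copes with groups of unequal width, which the arithmetic
version does not.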

Hth  --  Gerrit

On Tue, 13 Mar 2012, uday wrote:


I want to replace some values in my Array
e.g
data<- c(1,2,3,4,5,6,7,8,9,10,11,12)
I would like to replace 1,2,3 by 1 , 4,5,6 by 2, 7,8,9 by 3 and 10,11,12 by
4
I am expecting out put
data<- 1 1 1  2 2 2 3 3 3 4 4 4

I have tried replace function
replace(data, (data==1 |data==2| data==3),1)
replace(data, (data==4|data==5| data==6),2)
replace(data, (data==7|data==8| data==9),3)
replace(data, (data==10 |data==11| data==12),4)
but it changes only for individual operation it does not apply to whole
array.
So how I should replace values of whole array ?



--
View this message in context: 
http://r.789695.n4.nabble.com/replacing-values-in-Array-tp4468739p4468739.html
Sent from the R help mailing list archive at Nabble.com.






Re: [R] sort list

2012-03-13 Thread Joshua Wiley
Hi Sybil,

You cannot turn a list into a factor.  You could do:

cell_data <-c('cell1','cell2')
factor_list <- factor(cell_data)

or if you already have a list, unlist() or as.vector() may convert it
into a vector that you can then convert to a factor.
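
A short sketch of the second route, using the list from the question (the
name cell_vec is made up here). unlist() flattens the list into a character
vector, which factor() accepts.

```r
# The list from the original question
cell_data <- list(c("cell1", "cell2"))

# Flatten the list to an atomic character vector, then build the factor
cell_vec    <- unlist(cell_data)
factor_list <- factor(cell_vec)

factor_list
levels(factor_list)
```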

Cheers,

Josh

On Tue, Mar 13, 2012 at 4:29 AM, sybil kennelly  wrote:
> Hello can anyone help please?
>
> i read two words "cell1", "cell2" into a list. I want to turn this list
> into a factor.
>
>> cell_data <-list(c('cell1','cell2'))
>
>
>> cell_data
> [[1]]
> [1] "cell1" "cell2"
>
>
>
>> factor_list <- factor(cell_data)
> Error in sort.list(y) : 'x' must be atomic for 'sort.list'
> Have you called 'sort' on a list?
>
>
>
>> sort.list(cell_data)
> Error in sort.list(cell_data) : 'x' must be atomic for 'sort.list'
> Have you called 'sort' on a list?
>
>
> Can anyone explain?
>
> Syb



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/



Re: [R] Matrix Results

2012-03-13 Thread Petr PIKAL
Hi

> 
> Hello
>
> I am developing a small program to calculate the maximum, minimum
> and average.
>
> The data.frame v is
>
> POS     DIF
> 4 - 4    56
> 4 - 3    61
> 3 - 3   300
> 3 - 3    27
> 3 - 3    33
> 3 - 3    87
> 3 - 4    49
> 4 - 4    71
> 4 - 3   121
> 3 - 4   138
> 4 - 3    15
>
>
> When I execute res<-table(df$v) it gives this
>
> Var1    Freq
> 4 - 4      2
> 4 - 3      3
> 3 - 3      3
> 3 - 4      2
> 4 - 4      1
>
> If possible, my idea is for the result to present, in addition to the
> frequency, the minimum, maximum and average, all in a single array, for
> example. What is the quickest way to do this?
>
> Var1    Freq  Min  Max  Med
> 4 - 4      2   56   71
> 4 - 3      3   15  121
> 3 - 3      3  ...
> 3 - 4      2  ...
> 4 - 4      1  ...

?aggregate
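
A minimal sketch of what that could look like, assuming the garbled row in the posted data reads "3 - 3" with DIF 27 and that the labels are typed consistently (the names res and summary_table are only illustrative):

```r
# Data as posted; the row shown as "3 - 327" is assumed to be "3 - 3" / 27
v <- data.frame(
  POS = c("4 - 4", "4 - 3", "3 - 3", "3 - 3", "3 - 3", "3 - 3",
          "3 - 4", "4 - 4", "4 - 3", "3 - 4", "4 - 3"),
  DIF = c(56, 61, 300, 27, 33, 87, 49, 71, 121, 138, 15)
)

# One pass per group: count, minimum, maximum and mean of DIF
res <- aggregate(DIF ~ POS, data = v,
                 FUN = function(x) c(Freq = length(x), Min = min(x),
                                     Max = max(x), Mean = mean(x)))

# aggregate() stores the four statistics as a matrix column;
# data.frame() spreads that matrix into ordinary columns
summary_table <- data.frame(POS = res$POS, res$DIF)
print(summary_table)
```

A duplicated label in the table() output, as in the post, usually points to labels that differ only in whitespace, so it may be worth cleaning POS before aggregating.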

Regards
Petr


> 
> Thanks
> 
> 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/Matrix-
> Results-tp4468642p4468642.html
> Sent from the R help mailing list archive at Nabble.com.


[R] Definition of generic function for subclasses

2012-03-13 Thread Alexander
Hi,
I am working on a project that contains S4 classes and subclasses. Let's
assume the following organisation:
A: S4 Class
B,C: inherit from A
D,E,F,G: inherit from B
H,I: inherit from C

I now want to define a generic function that returns the name of the
class. I can write the method with "A" in the signature. Is there any
reason to also write methods for B, C, D, E, ... with "B", "C", "D", "E", ... in
the signature? I can imagine that it could become a mess if, for example, I
write a method for "A" and a special method for "I" (let's assume I is
the only exception).

Do you have any best practices to share?
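
To make the question concrete, here is a minimal sketch of the dispatch behaviour involved, using only part of the hierarchy above. The generic name describeClass and the slot id are hypothetical, chosen for illustration; a method defined for "A" is inherited by every subclass that has no method of its own, and a method defined for "I" overrides it for that class only.

```r
library(methods)

# Part of the hierarchy from the post: A <- C <- H, I
setClass("A", representation(id = "character"))
setClass("C", contains = "A")
setClass("H", contains = "C")
setClass("I", contains = "C")

# Hypothetical generic that reports the class name
setGeneric("describeClass", function(obj) standardGeneric("describeClass"))

# One method on the root class covers all subclasses by inheritance
setMethod("describeClass", "A", function(obj) as.character(class(obj)))

# A single exception, defined only where the behaviour differs
setMethod("describeClass", "I", function(obj) "I (special case)")

print(describeClass(new("H")))
print(describeClass(new("I")))
```

In this sketch no methods are needed for "C" or "H"; whether that stays manageable in a larger hierarchy is a design question the code does not settle.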

Thanks
Alexander

--
View this message in context: 
http://r.789695.n4.nabble.com/Definition-of-generic-function-for-subclasses-tp4468837p4468837.html
Sent from the R help mailing list archive at Nabble.com.



[R] sort list

2012-03-13 Thread sybil kennelly
Hello, can anyone help please?

I read two words, "cell1" and "cell2", into a list. I want to turn this list
into a factor.

> cell_data <-list(c('cell1','cell2'))


> cell_data
[[1]]
[1] "cell1" "cell2"



> factor_list <- factor(cell_data)
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?



> sort.list(cell_data)
Error in sort.list(cell_data) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?


Can anyone explain?

Syb



[R] replacing values in Array

2012-03-13 Thread uday
I want to replace some values in my array,
e.g.
data<- c(1,2,3,4,5,6,7,8,9,10,11,12)
I would like to replace 1,2,3 by 1, 4,5,6 by 2, 7,8,9 by 3 and 10,11,12 by
4.
I am expecting the output
data<- 1 1 1  2 2 2 3 3 3 4 4 4

I have tried the replace function
replace(data, (data==1 |data==2| data==3),1)
replace(data, (data==4|data==5| data==6),2)
replace(data, (data==7|data==8| data==9),3)
replace(data, (data==10 |data==11| data==12),4)
but each call changes only its own values; the changes are not applied to the
whole array.
So how should I replace values across the whole array?
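
A minimal sketch of why the calls above do not accumulate, and two ways around it. replace() returns a modified copy and leaves data untouched, so each result has to be assigned. The names recoded and recoded2 are only illustrative.

```r
data <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

# Assign each result so the next call builds on the previous one;
# the conditions test the original 'data' so recoded values are not hit twice
recoded <- data
recoded <- replace(recoded, data %in% 1:3, 1)
recoded <- replace(recoded, data %in% 4:6, 2)
recoded <- replace(recoded, data %in% 7:9, 3)
recoded <- replace(recoded, data %in% 10:12, 4)

# The same grouping in one step: findInterval() reports which of the
# intervals [1,4), [4,7), [7,10), [10,Inf) each value falls into
recoded2 <- findInterval(data, c(1, 4, 7, 10))

print(recoded)
print(recoded2)
```

The findInterval() version assumes the groups are contiguous ranges, as they are in this example.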



--
View this message in context: 
http://r.789695.n4.nabble.com/replacing-values-in-Array-tp4468739p4468739.html
Sent from the R help mailing list archive at Nabble.com.



[R] size of graphs when using multiple figures by row

2012-03-13 Thread Nerak
Hi all,
I have a basic question concerning graphs in R.  I’m using the par()
function and I’m working with multiple figures by row (mfrow), but the
height of my figures becomes compressed. I have 4 rows and 2 columns because
I want to plot 8 histograms (freq = FALSE) on one page. I know I can adapt my
margins with, for example, “oma” and “mai”, but I don’t know how to choose the
size of the figure. I read something about pin, which should set
‘the current plot dimensions, (width, height), in inches’, but I don’t know
which values I should use, because I always get the message ‘plot region too
large’.

I would like to have, for example, figures of 5.8 by 5.8 cm (about 2.3 by 2.3
inches, I think), but I don’t know how to specify this. I'm getting lost
with the par() function.

Hard to give my data online for a reproducible example but with for example
this, I have the same problems:

par(mfrow=c(4,2),oma=c(2,2,2,2), mai=c(0.6, 0.6, 0.6, 0.6))
# the same histogram and margin text, repeated for each of the 8 panels
for (i in 1:8) {
  hist(islands, freq=FALSE,col="blue",main="test")
  mtext("testen")
}

(normally, I specify my breaks, ylab and xlab, xlim and y lim)
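
One possible approach, sketched under assumptions: instead of forcing pin onto a device that is too small, open a device large enough that each plot region comes out at the wanted size. The outer margin of 0.2 inches (set through omi rather than oma), the PDF device and the temporary file name are assumptions made for this sketch.

```r
panel_in  <- 2.3   # wanted plot region per panel, in inches (about 5.8 cm)
margin_in <- 0.6   # inner margin on every side, as in mai above
outer_in  <- 0.2   # outer margin in inches (assumed value, set via omi)
n_row <- 4
n_col <- 2

# device size = plot regions + their inner margins + the outer margins
dev_width  <- n_col * (panel_in + 2 * margin_in) + 2 * outer_in
dev_height <- n_row * (panel_in + 2 * margin_in) + 2 * outer_in

out_file <- tempfile(fileext = ".pdf")   # hypothetical output file
pdf(out_file, width = dev_width, height = dev_height)
par(mfrow = c(n_row, n_col), omi = rep(outer_in, 4), mai = rep(margin_in, 4))
for (i in seq_len(n_row * n_col)) {
  hist(islands, freq = FALSE, col = "blue", main = "test")
  mtext("testen")
}
panel_size <- par("pin")   # plot region of the current panel, in inches
dev.off()
print(panel_size)
```

With these numbers the device is 7.4 by 14.4 inches. On an on-screen window that cannot be made that tall, the panels would still be compressed, which is why the sketch writes to a file.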

I hope someone can help me.
Many thanks,
Nerak


--
View this message in context: 
http://r.789695.n4.nabble.com/size-of-graphs-when-using-multiple-figures-by-row-tp4468610p4468610.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Help with plot Grouped Bar Plot by using R

2012-03-13 Thread R_beginner_starter
Hi, Jim

Below is the code that I tried and the result I obtained:

br<-read.table("R_beginner_starter.dat",header=TRUE,sep="\t")
library(plotrix)
barp(t(br[,c(2,4)]))

The result generated:
http://r.789695.n4.nabble.com/file/n4468592/ScreenHunter_01_Mar._13_17.10.jpg

This is different from my desired output bar graph :(
Do you have any idea how to solve it?

Thanks for advice.

--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-plot-Grouped-Bar-Plot-by-using-R-tp4448762p4468592.html
Sent from the R help mailing list archive at Nabble.com.



[R] Matrix Results

2012-03-13 Thread RMSOPS
Hello

I am developing a small program to calculate the maximum, minimum
and average.

The data.frame v is

POS     DIF
4 - 4    56
4 - 3    61
3 - 3   300
3 - 3    27
3 - 3    33
3 - 3    87
3 - 4    49
4 - 4    71
4 - 3   121
3 - 4   138
4 - 3    15


When I execute res<-table(df$v) it gives this

Var1    Freq
4 - 4      2
4 - 3      3
3 - 3      3
3 - 4      2
4 - 4      1

If possible, my idea is for the result to present, in addition to the
frequency, the minimum, maximum and average, all in a single array, for
example. What is the quickest way to do this?

Var1    Freq  Min  Max  Med
4 - 4      2   56   71
4 - 3      3   15  121
3 - 3      3  ...
3 - 4      2  ...
4 - 4      1  ...

Thanks



--
View this message in context: 
http://r.789695.n4.nabble.com/Matrix-Results-tp4468642p4468642.html
Sent from the R help mailing list archive at Nabble.com.


[R] File data to data.frame

2012-03-13 Thread b4d
Hi, I am playing around with some data and I would like to get data that is
stored in a file like this:



...

into a data.frame structure. How can I do that directly in R? Currently I am
parsing the file with bash, but I would like to centralize the procedure and
learn something new.

Thanks

--
View this message in context: 
http://r.789695.n4.nabble.com/File-data-to-data-frame-tp4468671p4468671.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Manipulate strings reordering some elements

2012-03-13 Thread Igor Sosa Mayor
Berend:

1. 1000 thanks for your help. It works perfectly.
2. Many thanks for the analysis of the expression; I will try to
understand it. Perl is really not easy to read.

thanks again!

On Tue, Mar 13, 2012 at 11:51:49AM +0100, Berend Hasselman wrote:
> 
> On 13-03-2012, at 11:28, Igor Sosa Mayor wrote:
> 
> > many thanks, Berend.
> > 
> > It works well... but with a problem because i was not completely clear
> > in my first email.
> > 
> > It works with cases such as:
> > Franco (El)
> > Regueras (Las)
> > 
> > but not with other cases such as:
> > Fauces de San Andrés (Las)
> > 
> > any hints? or maybe, if it is very complicated, any short explanation of
> > the perl expression you wrote (as far as I can understand, the point is
> > this \\3 \\1...).
> 
> 
> Try this
> 
> gsub("([^\\(]+)(\\()(.*)(\\))","\\3 \\1", municipios, perl=TRUE)
> 
> ([^\\(]+)   look for a sequence of characters not (, this becomes the \1
> 
> (\\() match (
> 
> (.*)  match zero or more anything, this becomes \3
> 
> (\\))match closing )
> 
> All subexpressions surrounded by ()  for backreferencing in replacement 
> expression to work.
> 
> The result of the above expression will contain trailing blanks if there was 
> a (El) etc.
> You can get rid of those by using gsub("\\s+$","",x)
> 
> I'm in a bit of a hurry now, so I won't be able to answer further questions 
> for several hours.
> 
> Berend
> 

-- 
:: Igor Sosa Mayor   :: joseleopoldo1...@gmail.com ::
:: GnuPG: 0x69804897 :: http://www.gnupg.org/  ::



Re: [R] Manipulate strings reordering some elements

2012-03-13 Thread Berend Hasselman

On 13-03-2012, at 11:28, Igor Sosa Mayor wrote:

> many thanks, Berend.
> 
> It works well... but with a problem because i was not completely clear
> in my first email.
> 
> It works with cases such as:
> Franco (El)
> Regueras (Las)
> 
> but not with other cases such as:
> Fauces de San Andrés (Las)
> 
> any hints? or maybe, if it is very complicated, any short explanation of
> the perl expression you wrote (as far as I can understand, the point is
> this \\3 \\1...).


Try this

gsub("([^\\(]+)(\\()(.*)(\\))","\\3 \\1", municipios, perl=TRUE)

([^\\(]+)   matches a sequence of characters other than (; this becomes \1

(\\()       matches the opening (; this becomes \2

(.*)        matches zero or more of any character; this becomes \3

(\\))       matches the closing ); this becomes \4

Each subexpression is surrounded by () so that it can be backreferenced in the
replacement expression.

The result of the above expression will contain a trailing blank if there was
a (El) etc.
You can get rid of those by using gsub("\\s+$","",x)
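
Put together on a shortened version of the vector from the original post, the two steps look roughly like this. The last line is a variant that is not from this thread: it anchors the pattern and consumes the blank before the parenthesis, so no trimming is needed. The names reordered and variant are illustrative, and \u00e9 is just a portable way to write é.

```r
municipios <- c("Allande", "Franco (El)", "Regueras (Las)",
                "Fauces de San Andr\u00e9s (Las)", "Belmonte de Miranda")

# Step 1: move the article in parentheses to the front (\\3 before \\1)
reordered <- gsub("([^\\(]+)(\\()(.*)(\\))", "\\3 \\1", municipios, perl = TRUE)

# Step 2: drop the trailing blank left over from "Franco (El)" etc.
reordered <- gsub("\\s+$", "", reordered)
print(reordered)

# Variant (an assumption, not from the thread): one anchored pattern
variant <- sub("^(.*) \\((.*)\\)$", "\\2 \\1", municipios)
print(variant)
```

Names without a parenthesised article do not match either pattern and are returned unchanged.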

I'm in a bit of a hurry now, so I won't be able to answer further questions for 
several hours.

Berend



Re: [R] Manipulate strings reordering some elements

2012-03-13 Thread Igor Sosa Mayor
many thanks, Berend.

It works well... but with a problem because i was not completely clear
in my first email.

It works with cases such as:
Franco (El)
Regueras (Las)

but not with other cases such as:
Fauces de San Andrés (Las)

any hints? or maybe, if it is very complicated, any short explanation of
the perl expression you wrote (as far as I can understand, the point is
this \\3 \\1...).

thx again

On Tue, Mar 13, 2012 at 11:12:24AM +0100, Berend Hasselman wrote:
> 
> On 13-03-2012, at 10:42, Igor Sosa Mayor wrote:
> 
> > Hi R-Users,
> > 
> > I want to manipulate some strings in the following way. I have the
> > following vector with spanish municipalities:
> > 
> > municipios<-c("Allande", "Aller", "Amieva", "Avilés", "Belmonte de
> > Miranda", 
> > "Degaña", "Franco (El)", "Gijón", "Gozón", "Grado", "Grandas de Salime", 
> > "Quirós", "Regueras (Las)", "Ribadedeva", "Ribadesella", "Ribera de
> > Arriba")
> > 
> > The problem is: some names have an article ("Franco (El)", "Regueras
> > (Las)"). Others don't. I want to do the following conversion:
> > 
> > "Regueras (Las)"---> "Las Regueras"
> > 
> > That is: I want to loop through the names, look whether they have a
> > postponed article, extract and delete this article and put it in front
> > of the rest of the name.
> > 
> > Any hints? Thanks in advance.
> 
> 
> gsub("([^\\s]+)\\s*(\\()(.*)(\\))","\\3 \\1", municipios, perl=TRUE)
> 
> 
> Berend
> 

-- 
:: Igor Sosa Mayor   :: joseleopoldo1...@gmail.com ::
:: GnuPG: 0x69804897 :: http://www.gnupg.org/  ::



Re: [R] customizing help, how to replace r.css of all packages

2012-03-13 Thread Joshua Wiley
Hi,

Probably yes, but I think this would be more easily handled through
the command prompt or some shell.  On the command prompt, I would look
at "xcopy".  xcopy /?  will show the arguments available.  You will
need to do a bit of work to get a list of all the directories to copy
your r.css file into, but this is all fairly straightforward in a
shell with commands like dir, find, etc., and then you can pipe it to
your copying command.
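
For completeness, a rough sketch of how the same thing might be done from within R with list.files() and file.copy(). The helper name replace_css is hypothetical, and the demonstration runs on a throw-away directory tree rather than a real R installation.

```r
# Hypothetical helper: overwrite every r.css / R.css below 'root' with 'custom_css'
replace_css <- function(custom_css, root) {
  targets <- list.files(root, pattern = "^r\\.css$", recursive = TRUE,
                        full.names = TRUE, ignore.case = TRUE)
  # copy one file at a time; returns TRUE/FALSE per target, named by its path
  vapply(targets,
         function(target) file.copy(custom_css, target, overwrite = TRUE),
         logical(1))
}

# Demonstration on a temporary tree that mimics library/<package>/html
root <- file.path(tempdir(), "library_demo")
for (pkg in c("pkgA", "pkgB")) {
  dir.create(file.path(root, pkg, "html"), recursive = TRUE, showWarnings = FALSE)
  writeLines("/* default */", file.path(root, pkg, "html", "R.css"))
}
custom_css <- file.path(tempdir(), "custom.css")
writeLines("/* custom */", custom_css)

result <- replace_css(custom_css, root)
print(result)
```

On a real installation the root would be the library directory (and the doc/html directory), writing there typically needs administrator rights, and a backup of the original files is advisable since package updates will restore the defaults anyway.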

My 2 cents.

Cheers,

Josh

On Tue, Mar 13, 2012 at 2:52 AM, SNV Krishna  wrote:
> Hi All,
>
>
>
> I would like to replace default r.css that is found in "C:\Program
> Files\R\R-2.14.2\doc\html" and "C:\Program Files\R\R-2.14.2\library\ name>\html" with a custom r.css file. I have some 71 packages installed and
> want to replace r.css across all packages.
>
>
>
> Windows explorer can aggregate the files into a single window based on
> search criteria, but couldn't replace them.
>
>
>
> Is it possible to do this through R-software?
>
>
>
> Thanks in advance for the input and help.
>
>
>
> Regards,
>
>
>
> S.N.V. Krishna



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/



Re: [R] 2 images on one plot

2012-03-13 Thread Petr PIKAL
Thanks Jim

At first glance it seems to do what I want. I must go through it more
thoroughly.

Petr

> 
> On 03/13/2012 03:07 AM, Petr PIKAL wrote:
> > Dear all
> >
> > with image I can plot only one set of values in one plot.
> >
> > Does anybody have any insight into how to put those 2 matrices into one
> > picture, so that one cell in the image shows both values from mat[1,1] and
> > mat2[1,1]?
> >
> > mat<-matrix(1:4, 2,2)
> > mat2<-matrix(4:1,2,2)
> > x<-1:2
> > y<-1:2
> > image(x, y, mat)
> > image(x, y, mat2)
> >
> > The only way I found is to mix x or y for both matrices, let's say
> >
> > xm<- sort(c(x,x+.5))
> > matm<- cbind(mat[,1], mat2[,1], mat[,2], mat2[,2])
> > image(xm,y,t(matm))
> >
> > which lacks elegance and is rather complicated when considering a matrix
> > with more rows and columns.
> >
> Hi Petr,
> I don't know whether this will be of any help, but it gets any two
> matrices in one plot:
>
> # interleave the columns (or rows) of two matrices of equal dimensions
> interdigitate<-function(x1,x2,columns=TRUE) {
>  dimx<-dim(x1)
>  if(columns) {
>   newx<-cbind(x1[,1],x2[,1])
>   for(column in 2:dimx[2]) newx<-cbind(newx,x1[,column],x2[,column])
>  }
>  else {
>   newx<-rbind(x1[1,],x2[1,])
>   for(row in 2:dimx[1]) newx<-rbind(newx,x1[row,],x2[row,])
>  }
>  return(newx)
> }
> mat1<-matrix(1:9,nrow=3)
> mat2<-matrix(11:19,nrow=3)
> library(plotrix)
> color2D.matplot(interdigitate(mat1,mat2))
> # columns=FALSE belongs to interdigitate(), not to color2D.matplot()
> color2D.matplot(interdigitate(mat1,mat2,columns=FALSE))
> 
> Obviously a bit of work is required to improve the elegance.
> 
> Jim
>
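
On the elegance point, one possible loop-free variant of the interleaving step, offered only as a sketch: bind the two matrices side by side (or on top of each other) and reorder with an index built by order(). The function name interleave is hypothetical; it is not part of plotrix.

```r
# Hypothetical loop-free counterpart to interdigitate() quoted above
interleave <- function(x1, x2, columns = TRUE) {
  stopifnot(identical(dim(x1), dim(x2)))
  if (columns) {
    # order(c(1,2,3,1,2,3)) gives 1,4,2,5,3,6: alternate x1 and x2 columns
    idx <- order(c(seq_len(ncol(x1)), seq_len(ncol(x2))))
    cbind(x1, x2)[, idx, drop = FALSE]
  } else {
    idx <- order(c(seq_len(nrow(x1)), seq_len(nrow(x2))))
    rbind(x1, x2)[idx, , drop = FALSE]
  }
}

mat1 <- matrix(1:9, nrow = 3)
mat2 <- matrix(11:19, nrow = 3)
by_col <- interleave(mat1, mat2)
by_row <- interleave(mat1, mat2, columns = FALSE)
print(by_col)
print(by_row)
```

The result can be passed to color2D.matplot() or image() in the same way as the output of interdigitate().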
