Re: [R] identical values not so identical? newbie help please!
Quite fascinating, if annoying. Nice example, Petr! It turns out my expected values are causing even more trouble because of this: I've even gotten negative chi-square values (calculated using Cressie and Read's formula)! So instead of kludging the error-measurement code, I think I'm going to have to round the actual expected values, like

  exp <- round(exp, digits=10)

Are there any ethical reservations to doing this?

--
View this message in context: http://r.789695.n4.nabble.com/identical-values-not-so-identical-newbie-help-please-tp3346078p3346880.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
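For the archives, a minimal sketch of how sub-epsilon noise in the expected values can produce a (mathematically impossible) negative power-divergence statistic, and how rounding restores an exact zero. The numbers here are hypothetical stand-ins for the problem tables:

```r
# The Cressie-Read power-divergence statistic (lambda = 2/3) is >= 0 in
# exact arithmetic, but floating-point noise in 'exp' can push it to a
# tiny (possibly negative) value when obs and exp should be identical.
cressie_read <- function(obs, exp, lambda = 2/3) {
  2 / (lambda * (lambda + 1)) * sum(obs * ((obs / exp)^lambda - 1))
}

obs <- c(118, 25, 57)
exp <- obs + c(2.842171e-14, 0, -1e-13)     # exp "should" equal obs exactly

cressie_read(obs, exp)                      # tiny nonzero artifact
cressie_read(obs, round(exp, digits = 10))  # exactly 0 after rounding
```

Rounding to 10 digits is far coarser than the ~1e-14 noise but far finer than any real difference between counts, so it removes only the artifact.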
Re: [R] identical values not so identical? newbie help please!
Aaah, it truly is wonderful, this technology! I guess I'm going to have to override it a bit though, along the lines of

  tae <- ifelse(isTRUE(all.equal(obs, exp)), 0, sum(abs(obs - exp)))

Do I like doing this? No. But short of reading the vast literature that exists on calculation precision - which would quite possibly result in me ending up using the same kludge as above - this is as satisfying a solution as I can hope for! Thanks again guys!

Maja

--
View this message in context: http://r.789695.n4.nabble.com/identical-values-not-so-identical-newbie-help-please-tp3346078p3346649.html
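A minimal sketch of why isTRUE() matters here: all.equal() returns TRUE for near-equality but a *character* description otherwise, so it should not be compared with == TRUE. The values are hypothetical stand-ins:

```r
# all.equal() vs identical(): tolerance-based vs exact comparison
obs <- matrix(118, 2, 2)
exp <- obs + 2.842171e-14            # invisible floating-point noise
all.equal(obs, exp)                  # TRUE (within default tolerance ~1.5e-8)
identical(obs, exp)                  # FALSE: the stored doubles differ
# the condition is a single value, so plain if/else is clearer than the
# vectorized ifelse():
tae <- if (isTRUE(all.equal(obs, exp))) 0 else sum(abs(obs - exp))
tae                                  # 0
```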
Re: [R] identical values not so identical? newbie help please!
Thanks Josh and Dan! I did figure it had something to do with the machine epsilon... But so what do I do now? I'm calculating the total absolute error over thousands of tables, e.g.:

  tae <- sum(abs(obs - exp))

Is there any easy way to keep these ignorable errors from showing up? And furthermore, why does this happen only sometimes? The two (2D) tables I attached are actually just one 'layer' in a 3D table, and only 2 out of about 400 layers had this happen - all the other ones are identical, perfectly! And out of 2000 3D tables, about 60 of which should have no error, only 10 actually show an error of zero; in the rest this same thing happens in a few layers.

OK, this is a bit messy for a real question. I mean, I can just round down all the errors that are under 1e-8 or something, but I'd much rather this not happen in the first place?

Thanks again to the two posters for bothering with me!

Maja.

--
View this message in context: http://r.789695.n4.nabble.com/identical-values-not-so-identical-newbie-help-please-tp3346078p3346516.html
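For later readers, a hedged sketch of the "round down the sub-1e-8 errors" idea mentioned above, with hypothetical values standing in for the real tables:

```r
# two ways to suppress sub-epsilon differences before summing errors
obs <- c(118, 42, 7)
exp <- c(118 + 2.842171e-14, 42, 7)   # hypothetical float noise
err <- abs(obs - exp)

sum(err)                      # tiny but nonzero
sum(err[err > 1e-8])          # threshold: treat anything below 1e-8 as 0
sum(round(err, digits = 8))   # or round the errors themselves to 8 digits
```

Both variants leave any genuine error (anything at or above 1e-8) untouched.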
[R] identical values not so identical? newbie help please!
Hi there! I'm not sure I can create a minimal example of my problem, so I'm linking to a minimal .RData file that has only two objects, obs and exp; each is a 6x9 matrix.

http://dl.dropbox.com/u/10364753/test.RData

(I hope linking to a Dropbox file is acceptable mailing list etiquette!) Here's what happens:

  > obs[1, 1]
  [1] 118
  > exp[1, 1]
  [1] 118
  > obs[1, 1] - exp[1, 1]
  [1] 2.842171e-14

Problem is, both obs and exp should be identical. They are the result of a saturated loglinear model, and I've run the same code across about 400 tables, all of which result in sum(obs - exp) = 0, except for this one. I can't figure it out!

Anyway, I need help understanding why 118 and 118 are not really the same. I appreciate some may be wary of downloading my .RData file (I'm on Ubuntu, if that's any consolation), but I don't know how else to ask this question!

Thanks!

Maja Z.

--
View this message in context: http://r.789695.n4.nabble.com/identical-values-not-so-identical-newbie-help-please-tp3346078p3346078.html
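A sketch of what is likely happening: exp[1, 1] carries floating-point noise of about 2.8e-14 that the default 7-significant-digit printing hides, so both values *display* as 118 while the stored doubles differ. The value below is a hypothetical stand-in for exp[1, 1]:

```r
# two doubles that print identically but are not identical
x <- 118
y <- 118 + 2.842171e-14    # about 2 ulps above 118
print(y)                   # displays as 118
sprintf("%.20f", y)        # reveals the trailing noise
x == y                     # FALSE
identical(x, y)            # FALSE: different bit patterns
isTRUE(all.equal(x, y))    # TRUE: equal up to numeric tolerance
```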
Re: [R] plotting functions of chi square
OK, for the record, this is not my homework, thanks for asking! Also, I am sure I could have phrased my question more eloquently, but (hence the newbie qualifier) I didn't. The code I posted was for the plot I want, only smoothed, i.e. not based on random sampling from the distribution.

Dennis: I tried that :) but your code divides the densities by df. I want the density of X^2/df.

Rookie: Same thing as David before - I know how to plot chi-squared densities with different dfs!

David: looks great! It's only the "playing around" that is off-putting... (sorry again for not explaining well, but illustrate I definitely did!)

Ben & William: Thank you! Jointly you managed to plot exactly what I wanted and show me why and how, so I can do it for more complicated functions! And just to prove you guys right, here's what I really wanted to plot - but refrained from mentioning in my original post: how, by the central limit theorem, for large df chi^2 approaches normality with a mean of df and variance of 2*df.

  d2chisq <- function(x, df) {
    dchisq(x * sqrt(2 * df) + df, df) * sqrt(2 * df)
  }
  plot(1, type = "n", xlab = "", ylab = "", xlim = c(-3, 3), ylim = c(0, 0.5))
  for (i in c(5, 10, 50, 100, 200, 500)) {
    curve(d2chisq(x, i), add = TRUE)
  }
  lines(seq(-3, 3, .1), dnorm(seq(-3, 3, .1), 0, 1), col = "red")

Not bad considering I had to look up the chain rule on Wikipedia ;)

Thanks again guys!

maja.

--
View this message in context: http://r.789695.n4.nabble.com/plotting-functions-of-chi-square-tp2329020p2329213.html
Re: [R] plotting functions of chi square
Thanks, but that wasn't what I was going for. Like I said, I know how to do a simple chi-square density plot with dchisq(). What I'm trying to do is chi-square / degrees of freedom - hence rchisq(10, i)/i. How do I do that with dchisq?

--
View this message in context: http://r.789695.n4.nabble.com/plotting-functions-of-chi-square-tp2329020p2329057.html
[R] plotting functions of chi square
Hi! This is going to be a real newbie question, but I can't figure it out. I'm trying to plot densities of various functions of chi-square. A simple chi-square plot I can do with dchisq(). But e.g. chi-square/degrees of freedom I only know how to do using density(rchisq()/df). For example:

  plot(1, type = "n", xlab = "", ylab = "", xlim = c(0, 2), ylim = c(0, 7))
  for (i in c(10, 50, 100, 200, 500)) {
    lines(density(rchisq(10, i)/i))
  }

But even with 100,000 samples the curves still aren't smooth. Surely there must be a more elegant way to do this?

Thanks!

Maja

--
View this message in context: http://r.789695.n4.nabble.com/plotting-functions-of-chi-square-tp2329020p2329020.html
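For the archives, the sampling-free approach the replies converge on can be sketched directly: if X ~ chi-squared(df), the change-of-variables formula gives Y = X/df the density f_Y(y) = df * f_X(y*df), so the curves come out perfectly smooth with no simulation (the helper name here is my own):

```r
# smooth density of chi-squared(df)/df via change of variables:
# if Y = X/df then f_Y(y) = df * f_X(y * df)
dchisq_over_df <- function(y, df) df * dchisq(y * df, df)

plot(1, type = "n", xlab = "", ylab = "", xlim = c(0, 2), ylim = c(0, 7))
for (i in c(10, 50, 100, 200, 500)) {
  curve(dchisq_over_df(x, i), add = TRUE)
}

# sanity check: it is a proper density
integrate(dchisq_over_df, 0, Inf, df = 50)$value   # ~1
```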
Re: [R] function to set log(0)=0 not working on tables or vectors
Peter, you're right about the error - I had R Commander open and used the terminal instead, which makes me miss the error messages. Not that I would have known how to solve it had I seen it :) And yes, ifelse does work. Not sure I understand what the difference is, but thanks!

David, I had no idea god could change the laws of mathematics. If that's true then you have to choose between believing in one or the other? Nice one!

Of course Ben is right: the convention is 0*log(0)=0, but for the purposes of programming it in an entropy function I only need to define log(0)=0. I apologize for not being more precise.

Thanks guys!

Maja.

Ben Bolker wrote:
>
> David Winsemius comcast.net> writes:
>
>> On Jan 17, 2010, at 8:17 PM, maiya wrote:
>>
>> > There must be a very basic thing I am not getting...
>> >
>> > I'm working with some entropy functions and the convention is to use
>> > log(0)=0.
>>
>> I suppose the outcome of that effort may depend on whether you have
>> assumed the needed godlike capacities to change the laws of
>> mathematics. But I suppose that as the Earth mother that might occur
>> to you. Go ahead, define a new mathematics.
>
> My guess is that the real intention here is
> to define 0*log(0) = 0 rather than log(0) = 0 --
> really the assertion is that lim(x -> 0) x log(x) = 0,
> which must be true for some reasonable limiting conditions.

--
View this message in context: http://n4.nabble.com/function-to-set-log-0-0-not-working-on-tables-or-vectors-tp1016278p1016724.html
Sent from the R help mailing list archive at Nabble.com.
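A small illustration of Ben's point, in case it helps later readers: the convention is really about the product x*log(x), and defining that term to be 0 at x = 0 gives a clean entropy function (the function names are my own):

```r
# entropy with the 0 * log(0) = 0 convention, i.e. the limit of x*log(x)
# as x -> 0 from above
xlogx <- function(x) ifelse(x == 0, 0, x * log(x))
entropy <- function(p) -sum(xlogx(p))   # p: vector of probabilities

entropy(c(0.5, 0.5, 0))                 # log(2); the empty cell contributes 0
```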
[R] function to set log(0)=0 not working on tables or vectors
There must be a very basic thing I am not getting... I'm working with some entropy functions, and the convention is to use log(0)=0. So I wrote a function:

  llog <- function(x) {
    if (x == 0) 0 else log(x)
  }

which seems to work fine for individual numbers, e.g.

  > llog(0/2)
  [1] 0

but if I try whole vectors or tables:

  p <- c(4, 3, 1, 0)
  q <- c(2, 2, 2, 2)
  llog(p/q)

I get this:

  [1]  0.6931472  0.4054651 -0.6931472       -Inf

What am I missing? Thanks!

Maja

--
View this message in context: http://n4.nabble.com/function-to-set-log-0-0-not-working-on-tables-or-vectors-tp1016278p1016278.html
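As the replies point out, `if` tests only a single (the first) element of a vector, whereas ifelse() is vectorized. A minimal sketch of the usual fix:

```r
# vectorized version: ifelse() handles whole vectors and tables element-wise
llog <- function(x) ifelse(x == 0, 0, log(x))

p <- c(4, 3, 1, 0)
q <- c(2, 2, 2, 2)
llog(p / q)   # 0.6931472 0.4054651 -0.6931472 0.0000000
```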
[R] count number of empty cells in a table/matrix/data.frame
Hi everyone! This is a ridiculously simple problem; I just can't seem to find the solution! All I need is something equivalent to

  sum(is.na(x))

but instead of counting missing values, I want to count empty cells (with a value of 0). A naive attempt with is.empty didn't work :)

Thanks!

Maja

Oh, and if the proposed solution would be to make all the empty cells into missing cells, that is not an option! There are over 20,000,000 cells in my table, and I don't think my computer is in the mood to store two such objects!

--
View this message in context: http://n4.nabble.com/count-number-of-empty-cells-in-a-table-matrix-data-frame-tp947740p947740.html
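For the archives, the parallel to sum(is.na(x)) is direct: `x == 0` gives a logical array and sum() counts the TRUEs. (It does create a temporary logical object of the same shape, but not a second NA-filled copy of the data.) A tiny sketch:

```r
# count cells equal to zero, analogous to sum(is.na(x))
x <- matrix(c(0, 3, 0, 7, 0, 1), nrow = 2)
sum(x == 0)                 # 3
sum(x == 0, na.rm = TRUE)   # safe variant if the table also contains NAs
```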
Re: [R] Error: cannot allocate vector of size...
Cool! Thanks for the sampling and ff tips! I think I've figured it out now using sampling... I'm getting a quad-core, 4GB RAM computer next week and will try it again using a 64-bit version :)

Thanks for your time!!!

Maja

tlumley wrote:
>
> On Tue, 10 Nov 2009, maiya wrote:
>
>> OK, it's the simple math that's confusing me :)
>>
>> So you're saying 2.4GB, while Windows sees the data as 700MB. Why is
>> that different?
>
> Your data are stored on disk as a text file (in CSV format, in fact), not
> as numbers. This can take up less space.
>
>> And let's say I could potentially live with e.g. 1/3 of the cases - that
>> would make it .8GB, which should be fine? But then my question is if
>> there is any way to sample the rows in read.table? Or what would be the
>> best way of importing a random third of my cases?
>
> A better solution is probably to read a subset of the columns at a time.
> The easiest way to do this is probably to read the data into a SQLite
> database with the 'sqldf' package, but another solution is to use the
> colClasses= argument to read.table() and specify "NULL" for the classes
> of the columns you don't want to read. There are other ways as well.
>
> It might even be faster to do the cross-tabulations in a database and
> read the resulting summaries into R to compute any statistics you need.
>
>> Thanks!
>>
>> M.
>>
>> jholtman wrote:
>>>
>>> A little simple math. You have 3M rows with 100 items on each row.
>>> If read in, this would be 300M items. If numeric, 8 bytes/item, this
>>> is 2.4GB. Given that you are probably using a 32-bit version of R,
>>> you are probably out of luck. A rule of thumb is that your largest
>>> object should consume at most 25% of your memory, since you will
>>> probably be making copies as part of your processing.
>>>
>>> Given that, if you want to read in 100 variables at a time, I would
>>> say your limit would be about 500K rows to be reasonable. So you have
>>> a choice: read in fewer rows, read in all 3M rows but at 20 columns
>>> per read, or put the data in a database and extract what you need.
>>> Unless you go to a 64-bit version of R you will probably not be able
>>> to have the whole file in memory at one time.
>>>
>>> On Tue, Nov 10, 2009 at 7:10 AM, maiya wrote:
>>>>
>>>> I'm trying to import a table into R; the file is about 700MB. Here's
>>>> my first try:
>>>>
>>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>>>
>>>> Error: cannot allocate vector of size 15.6 Mb
>>>> In addition: Warning messages:
>>>> 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>>>    Reached total allocation of 1535Mb: see help(memory.size)
>>>> 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>>>    Reached total allocation of 1535Mb: see help(memory.size)
>>>> 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>>>    Reached total allocation of 1535Mb: see help(memory.size)
>>>> 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>>>    Reached total allocation of 1535Mb: see help(memory.size)
>>>>
>>>> Then I tried
>>>>
>>>>> memory.limit(size=4095)
>>>>
>>>> and got
>>>>
>>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>>> Error: cannot allocate vector of size 11.3 Mb
>>>>
>>>> but no additional errors. Then, optimistically, to clear up the
>>>> workspace:
>>>>
>>>>> rm()
>>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>>> Error: cannot allocate vector of size 15.6 Mb
>>>>
>>>> Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb,
>>>> 11.3Mb? I'm working on WinXP with 2 GB of RAM. Help says the maximum
>>>> obtainable memory is usually 2Gb. Surely they mean GB?
>>>>
>>>> The file I'm importing has about 3 million cases with 100 variables
>>>> that I want to crosstabulate each with each. Is this completely
>>>> unrealistic?
>>>>
>>>> Thanks!
Re: [R] Error: cannot allocate vector of size...
OK, it's the simple math that's confusing me :)

So you're saying 2.4GB, while Windows sees the data as 700MB. Why is that different?

And let's say I could potentially live with e.g. 1/3 of the cases - that would make it .8GB, which should be fine? But then my question is if there is any way to sample the rows in read.table? Or what would be the best way of importing a random third of my cases?

Thanks!

M.

jholtman wrote:
>
> A little simple math. You have 3M rows with 100 items on each row.
> If read in, this would be 300M items. If numeric, 8 bytes/item, this
> is 2.4GB. Given that you are probably using a 32-bit version of R,
> you are probably out of luck. A rule of thumb is that your largest
> object should consume at most 25% of your memory, since you will
> probably be making copies as part of your processing.
>
> Given that, if you want to read in 100 variables at a time, I would
> say your limit would be about 500K rows to be reasonable. So you have
> a choice: read in fewer rows, read in all 3M rows but at 20 columns
> per read, or put the data in a database and extract what you need.
> Unless you go to a 64-bit version of R you will probably not be able
> to have the whole file in memory at one time.
>
> On Tue, Nov 10, 2009 at 7:10 AM, maiya wrote:
>>
>> I'm trying to import a table into R; the file is about 700MB. Here's my
>> first try:
>>
>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>
>> Error: cannot allocate vector of size 15.6 Mb
>> In addition: Warning messages:
>> 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>    Reached total allocation of 1535Mb: see help(memory.size)
>> 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>    Reached total allocation of 1535Mb: see help(memory.size)
>> 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>    Reached total allocation of 1535Mb: see help(memory.size)
>> 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>>    Reached total allocation of 1535Mb: see help(memory.size)
>>
>> Then I tried
>>
>>> memory.limit(size=4095)
>>
>> and got
>>
>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>> Error: cannot allocate vector of size 11.3 Mb
>>
>> but no additional errors. Then, optimistically, to clear up the workspace:
>>
>>> rm()
>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>> Error: cannot allocate vector of size 15.6 Mb
>>
>> Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb?
>> I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
>> memory is usually 2Gb. Surely they mean GB?
>>
>> The file I'm importing has about 3 million cases with 100 variables that
>> I want to crosstabulate each with each. Is this completely unrealistic?
>>
>> Thanks!
>>
>> Maja
>> --
>> View this message in context:
>> http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?

--
View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html
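On the side question of importing a random third of the rows: read.table() itself cannot sample, but one can stream the file in chunks and keep each line with probability 1/3, so the full table never sits in memory at once. A hedged sketch on a tiny stand-in file (the file name, chunk size, and column layout are made up):

```r
# stream a CSV in chunks, keeping each data row with probability 1/3
tf <- tempfile()
write.csv(data.frame(x = 1:1000, y = rnorm(1000)), tf, row.names = FALSE)

con <- file(tf, open = "r")
header <- readLines(con, n = 1)
keep <- list()
repeat {
  chunk <- readLines(con, n = 200)           # 200 lines per chunk
  if (length(chunk) == 0) break
  chunk <- chunk[runif(length(chunk)) < 1/3] # keep ~1/3 of the rows
  keep[[length(keep) + 1]] <- chunk
}
close(con)

DD <- read.csv(textConnection(c(header, unlist(keep))))
nrow(DD)   # roughly 1000/3
```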
[R] Error: cannot allocate vector of size...
I'm trying to import a table into R; the file is about 700MB. Here's my first try:

  > DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
  Error: cannot allocate vector of size 15.6 Mb
  In addition: Warning messages:
  1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
     Reached total allocation of 1535Mb: see help(memory.size)
  2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
     Reached total allocation of 1535Mb: see help(memory.size)
  3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
     Reached total allocation of 1535Mb: see help(memory.size)
  4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
     Reached total allocation of 1535Mb: see help(memory.size)

Then I tried

  > memory.limit(size=4095)

and got

  > DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
  Error: cannot allocate vector of size 11.3 Mb

but no additional errors. Then, optimistically, to clear up the workspace:

  > rm()
  > DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
  Error: cannot allocate vector of size 15.6 Mb

Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb? I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable memory is usually 2Gb. Surely they mean GB?

The file I'm importing has about 3 million cases with 100 variables that I want to crosstabulate each with each. Is this completely unrealistic?

Thanks!

Maja

--
View this message in context: http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
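A sketch of the colClasses route Thomas Lumley suggests in his reply, shown on a tiny stand-in file since the real file's column layout is not known here: columns given class "NULL" are skipped entirely, so only the wanted columns ever occupy memory.

```r
# read only selected columns by marking the rest as "NULL"
tf <- tempfile()
writeLines(c("a b c d",
             "1 2 3 4",
             "5 6 7 8"), tf)

DD <- read.table(tf, header = TRUE,
                 colClasses = c("integer", "NULL", "NULL", "integer"))
DD        # only columns a and d are kept
ncol(DD)  # 2
```

For the real file one would build the vector programmatically, e.g. `classes <- rep("NULL", 100); classes[wanted] <- "integer"`.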
Re: [R] stars (as fourfold plots) in plot (symbols don't work)
I was feeling pretty silly when I saw there was actually a locations parameter in stars, as well as axes etc. But now the problem is that the x and y axes in stars must be on the same scale, which unfortunately makes my data occupy only a very narrow band of the plot. I guess one option would be to scale one set of coordinates and then manually change the axis labels!? But I'll have a look at my.symbols first.

Thanks for the tip!

Maja

Greg Snow-2 wrote:
>
> Here are 2 useful paths (it's up to you to decide if either is the right
> path).
>
> The my.symbols function in the TeachingDemos package allows you to create
> your own functions to create the symbols.
>
> But in this case, you can just use the locations argument to the stars
> function:
>
>> stars(cbind(1, sqrt(test[,3]), 1, sqrt(test[,3]))/16,
> +   locations=test[,1:2],
> +   col.segments=c("gray90", "gray"), draw.segments=TRUE, scale=FALSE)
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> On Behalf Of maiya
>> Sent: Saturday, June 06, 2009 4:03 PM
>> To: r-help@r-project.org
>> Subject: [R] stars (as fourfold plots) in plot (symbols don't work)
>>
>> Hi!
>>
>> I have a dataset with three columns - the first two refer to x and y
>> coordinates, the last one are odds ratios. I'd like to plot the data
>> with x and y coordinates and the odds ratio shown as a fourfold plot,
>> which I prefer to do using the stars function.
>>
>> Unfortunately the stars option in symbols is not as cool as the stars
>> function on its own, and now I can't figure out how to do it!
>>
>> Here's an example code:
>>
>> # data
>> test <- cbind(c(1,2,3,4), c(1,2,3,4), c(2,4,8,16))
>> # this is what I want the star symbol to look like
>> stars(cbind(1, sqrt(test[1,3]), 1, sqrt(test[1,3])),
>>       col.segments=c("gray90", "gray"), draw.segments=TRUE, scale=FALSE)
>> # this is what happens when using stars in symbols
>> symbols(test[,1], test[,2],
>>         stars=cbind(1, sqrt(test[,3]), 1, sqrt(test[,3])))
>>
>> Can anyone set me on the right path please?
>>
>> Maja

--
View this message in context: http://www.nabble.com/stars-%28as-fourfold-plots%29-in-plot-%28symbols-don%27t-work%29-tp23905987p23933876.html
Re: [R] ridiculous behaviour printing to eps: labels all messed
Wow! Thank you for that, Ted - a wonderfully comprehensive explanation, and now everything makes perfect sense!!

Regarding your last point, I would love to hear other people's experience. I myself, as a complete newbie in both R and LaTeX, am perhaps not the best judge... But there are several graphics packages that can be used directly in LaTeX to do what you propose (the LaTeX Graphics Companion that I own is about 1000 pages' worth of material to help you not be able to make up your mind...). I have found PostScript the easiest and most intuitive, and you can write PostScript graphics directly in LaTeX using the pstricks package. So yes, you are right, I could just use the data from R directly (and I hope that when I become a dinosaur I will be able to create graphs just as beautiful as yours!). But there are R plots that I would rather not attempt to code myself, in particular mosaic plots, so I prefer to import them from R as eps files and then use psfrag to get the nice LaTeX typesetting for the labels, equations etc. to make it "fit" visually.

But then, with the sheer volume of options, figuring out the optimal combination for a particular application, or whether the learning curve is worth it, is always going to be a problem... I guess for all that evolution with nothing ever going extinct, we will each end up a very individual fossil.

Maja

--
View this message in context: http://www.nabble.com/ridiculous-behaviour-printing-to-eps%3A-labels-all-messed-up%21-tp23916638p23932656.html
Re: [R] ridiculous behaviour printing to eps: labels all messed up!
Solution!!

Peter, that seems to do the trick!

  dev.copy2eps(file="test.eps", useKerning=FALSE)

correctly places the labels without splitting them! The same also works with postscript(), of course.

I also found another thread where this was solved - sorry for duplicating threads:
http://www.nabble.com/postscript-printer-breaking-up-long-strings-td23322197.html

Apparently it is considered a feature of the printer! I don't understand the string-width calculation rationale, and I hope it doesn't cause other problems along the way, but it's looking good for now!

As for Zeljko... nice one :) Of course I thought of it, but I do have more than 26 labels, and furthermore this just annoyed me so much, I had to figure it out! ipak hvala (thanks anyway)!

Thanks guys!

Maja

--
View this message in context: http://www.nabble.com/ridiculous-behaviour-printing-to-eps%3A-labels-all-messed-up%21-tp23916638p23922203.html
[R] ridiculous behaviour printing to eps: labels all messed up!
OK, this is really weird! Here's an example code:

  t1 <- c(1,2,3,4)
  t2 <- c(4,2,4,2)
  plot(t1 ~ t2, xlab="exp1", ylab="exp2")
  dev.copy2eps(file="test.eps")

That all seems fine... until you look at the eps file created, where, for some weird reason, if you scroll down to the end, the code reads:

  /Font1 findfont 12 s
  0 setgray
  214.02 18.72 (e) 0 ta -0.360 (xp1) tb gr
  12.96 206.44 (e) 90 ta -0.360 (xp2) tb gr

Which means that the labels "exp1" and "exp2" get split up!?!? Visually that doesn't matter, but I use the labels to refer to them in LaTeX using psfrag, so I have to know exactly what they are called in the .eps file in order to reference them correctly. I've tried other labels, and the splitting up seems completely random, i.e. it doesn't seem to have anything to do with the length of the label etc.

I am completely lost here - can someone help me figure out what is going on?

Maja

--
View this message in context: http://www.nabble.com/ridiculous-behaviour-printing-to-eps%3A-labels-all-messed-up%21-tp23916638p23916638.html
[R] stars (as fourfold plots) in plot (symbols don't work)
Hi!

I have a dataset with three columns - the first two refer to x and y coordinates, the last one are odds ratios. I'd like to plot the data with x and y coordinates and the odds ratio shown as a fourfold plot, which I prefer to do using the stars function.

Unfortunately the stars option in symbols is not as cool as the stars function on its own, and now I can't figure out how to do it!

Here's an example code:

  # data
  test <- cbind(c(1,2,3,4), c(1,2,3,4), c(2,4,8,16))
  # this is what I want the star symbol to look like
  stars(cbind(1, sqrt(test[1,3]), 1, sqrt(test[1,3])),
        col.segments=c("gray90", "gray"), draw.segments=TRUE, scale=FALSE)
  # this is what happens when using stars in symbols
  symbols(test[,1], test[,2],
          stars=cbind(1, sqrt(test[,3]), 1, sqrt(test[,3])))

Can anyone set me on the right path please?

Maja

--
View this message in context: http://www.nabble.com/stars-%28as-fourfold-plots%29-in-plot-%28symbols-don%27t-work%29-tp23905987p23905987.html
Re: [R] indicator or deviation contrasts in log-linear modelling
I realise that in the case of loglin the parameters are calculated post festum from the cell frequencies; however, other programmes that use Newton-Raphson as opposed to IPF work the other way round, right? In which case one would expect the output of parameters to be limited to the particular contrast used. But since loglin uses IPF, I would have thought the choice of style of parameter to be output could be made...

Anyway, this is the line that interests me:

  > lm( as.vector( loglin(..., fit=TRUE)$fit ) ~ < your favored contrasts > )

only I'm not proficient enough in R to figure out the last term :( How would I go about this if my preferred contrast is setting the first categories as reference categories? I literally just need the equivalent of

  loglin(matrix(c(1,2,3,4), nrow=2), list(c(1,2)), param=TRUE)

which would give me the parameters under indicator contrasts. glm... well, I'd have to work on it.

Regarding the more general points:

ad 2) I would have thought that direct inspection of cell frequencies is precisely the wrong/misleading thing to do - the highest-order coefficients can be inspected directly in order to see the interaction without the (lower) marginal effects, or alternatively the table can be standardized to uniform margins for the same sort of inspection.

ad 3) And yes, I figured as much! I can't see how lower-order terms can be interpreted at all if higher-order interactions exist? I've seen it done, e.g. I've seen it claimed that in a standardized table the lower-order terms are all equal to zero, which is of course not true?

Thanks!

Maja

--
View this message in context: http://www.nabble.com/indicator-or-deviation-contrasts-in-log-linear-modelling-tp22090104p22093070.html
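For what it's worth, a sketch of the glm route for the tiny 2x2 example above (object names are mine): a saturated Poisson log-linear model fitted with R's default treatment contrasts is exactly the indicator coding described, with the first level of each factor as the zero reference category.

```r
# saturated Poisson log-linear model with treatment ("indicator") contrasts
tab <- matrix(c(1, 2, 3, 4), nrow = 2)
d <- as.data.frame(as.table(tab))   # columns Var1, Var2, Freq
fit <- glm(Freq ~ Var1 * Var2, family = poisson, data = d)
coef(fit)
# (Intercept) = log(1) = 0, Var1B = log(2), Var2B = log(3),
# Var1B:Var2B = log(4*1/(2*3)) = log(2/3): the log cross-product ratio
```

Deviation contrasts (what loglin reports) could be had from the same call by setting `contrasts = list(Var1 = "contr.sum", Var2 = "contr.sum")`.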
[R] indicator or deviation contrasts in log-linear modelling
I am fairly new to log-linear modelling, so as opposed to trying to fit models, I am still trying to figure out how it actually works - hence I am looking at the interpretation of parameters. Now it seems most people skip this part and go directly to measuring model fit, so I am finding very few references to actual parameters, and am of course clear on the fact that their choice is irrelevant to the actual model fit.

But here is my question: loglin uses deviation contrasts, so the coefficients in each term add up to zero. Another option is indicator contrasts, where a reference category is chosen in each term and set to zero, while the others are relative to it. My question is whether there is a log-linear command equivalent to loglin that uses this second "dummy coding" style of constraints (I know e.g. SPSS genlog does this).

I hope this is not too basic a question! And if anyone is up for answering the wider question of why log-linear parameters are not something to be looked at - which might just be my impression of the literature - feel free to comment!

Thanks for your help!

Maja

--
View this message in context: http://www.nabble.com/indicator-or-deviation-contrasts-in-log-linear-modelling-tp22090104p22090104.html
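For what it's worth, a sketch of one possible answer, assuming a Poisson glm() counts as the "equivalent command": glm() with R's default contr.treatment coding gives exactly the indicator-style parameters (first category of each factor as reference), while loglin(..., param=TRUE) reports the deviation-contrast ones - same fit, different parameterisation.

```r
# The same saturated model on a 2x2 example table under both codings:
tab <- matrix(c(40, 5, 30, 25), nrow = 2)
d   <- as.data.frame.table(tab)   # columns Var1, Var2, Freq

# Deviation contrasts: effects within each term sum to zero
p.dev <- loglin(tab, margin = list(c(1, 2)), param = TRUE)$param

# Indicator contrasts: first category of each factor set to zero
g <- glm(Freq ~ Var1 * Var2, family = poisson, data = d)
coef(g)
```

Since the model is saturated, both parameterisations reproduce the observed cell counts exactly; only the constraint placed on each term differs.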
Re: [R] disaggregate frequency table into flat file
Marc, it's the second "expansion" type transformation I was after, although your expand.dft looks quite complicated? Here's what I finally came up with - the bold lines correspond to what expand.dft does?

> orig <- matrix(c(40,5,30,25), c(2,2))
> orig
     [,1] [,2]
[1,]   40   30
[2,]    5   25
> flat <- as.data.frame.table(orig)
> ind <- rep(1:nrow(flat), times=flat$Freq)
> flat <- flat[ind, -3]
> sample <- matrix(table(flat[sample(1:length(ind), 10), ]), c(2,2))
> sample
     [,1] [,2]
[1,]    4    2
[2,]    1    3

So I get from the orig matrix to the sample matrix, expanding and contracting it back in between! It's just that I was hoping there was a more direct way of doing it!

Thanks!

maja

--
View this message in context: http://www.nabble.com/disaggregate-frequency-table-into-flat-file-tp17396040p17405966.html
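For the record, the round trip above can be collapsed into a few lines - a sketch, working with cell indices only rather than a full flat data frame. The factor(levels=) call keeps cells that happen to draw zero individuals from being dropped when tabulating back.

```r
orig <- matrix(c(40, 5, 30, 25), c(2, 2))

# One cell index per individual; sample 10 of the 100 without
# replacement, then tabulate straight back into the original shape:
ind  <- rep(seq_along(orig), times = orig)
samp <- table(factor(sample(ind, 10), levels = seq_along(orig)))
res  <- matrix(samp, nrow(orig))
res  # a 2x2 sample table summing to 10
```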
Re: [R] disaggregate frequency table into flat file
sorry, my mistake! the data frame should read:

orig <- as.data.frame.table(orig)
orig
  Var1 Var2 Freq
1    A    A   40
2    B    A    5
3    A    B   30
4    B    B   25

but basically I would simply like a sample of the original matrix (which is a frequency table/contingency table/crosstabulation).

hope this is clearer now!

maja

jholtman wrote:
>
> Not exactly clear what you are asking for. Your data.frame.table does not
> seem related to the original 'orig'. What exactly are you expecting as
> output?
>
> On Wed, May 21, 2008 at 10:16 PM, maiya <[EMAIL PROTECTED]> wrote:
>
>> i apologise for the trivialness of this post - but i've been searching the
>> forum without luck - probably simply because it's late and my brain is
>> starting to go..
>>
>> i have a frequency table as a matrix:
>>
>> orig <- matrix(c(40,5,30,25), c(2,2))
>> orig
>>      [,1] [,2]
>> [1,]   40   30
>> [2,]    5   25
>>
>> i basically need a random sample, say 10 from 100:
>>
>>      [,1] [,2]
>> [1,]    5    2
>> [2,]    0    3
>>
>> i got as far as
>>
>> orig <- as.data.frame.table(orig)
>> orig
>>   Var1 Var2 Freq
>> 1    A    A   10
>> 2    B    A    5
>> 3    A    B   30
>> 4    B    B   25
>>
>> and then perhaps
>>
>> individ <- rep(1:4, times=orig$Freq)
>>
>> which gives a vector of the 100 individuals in each of the 4 groups - cells,
>> but I'm
>> (a) stuck here and
>> (b) afraid this is a very round-about way of getting at what I want, i.e. I
>> can now sample(individ, 10), but then I'll have a heck of a time getting the
>> result back into the original matrix form
>>
>> sorry again, just please tell me the simple solution that I've missed?
>>
>> thanks!
>>
>> maja
>>
>> --
>> View this message in context:
>> http://www.nabble.com/disaggregate-frequency-table-into-flat-file-tp17396040p17396040.html
>> Sent from the R help mailing list archive at Nabble.com.
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?

--
View this message in context: http://www.nabble.com/disaggregate-frequency-table-into-flat-file-tp17396040p17403687.html
[R] disaggregate frequency table into flat file
i apologise for the trivialness of this post - but i've been searching the forum without luck - probably simply because it's late and my brain is starting to go..

i have a frequency table as a matrix:

orig <- matrix(c(40,5,30,25), c(2,2))
orig
     [,1] [,2]
[1,]   40   30
[2,]    5   25

i basically need a random sample, say 10 from 100:

     [,1] [,2]
[1,]    5    2
[2,]    0    3

i got as far as

orig <- as.data.frame.table(orig)
orig
  Var1 Var2 Freq
1    A    A   10
2    B    A    5
3    A    B   30
4    B    B   25

and then perhaps

individ <- rep(1:4, times=orig$Freq)

which gives a vector of the 100 individuals in each of the 4 groups - cells, but I'm
(a) stuck here and
(b) afraid this is a very round-about way of getting at what I want, i.e. I can now sample(individ, 10), but then I'll have a heck of a time getting the result back into the original matrix form

sorry again, just please tell me the simple solution that I've missed?

thanks!

maja

--
View this message in context: http://www.nabble.com/disaggregate-frequency-table-into-flat-file-tp17396040p17396040.html
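If sampling with replacement is acceptable - i.e. drawing a fresh sample of 10 from the table's cell probabilities rather than picking 10 of the 100 observed individuals without replacement - then rmultinom() skips the expansion step entirely. A sketch:

```r
orig <- matrix(c(40, 5, 30, 25), c(2, 2))

# One multinomial draw of size 10, with cell probabilities
# proportional to the observed frequencies:
samp <- matrix(rmultinom(1, size = 10, prob = orig / sum(orig)),
               nrow(orig))
samp  # a 2x2 table summing to 10
```

Which of the two sampling schemes is right depends on whether the 100 counts are the population being subsampled or an observed sample from an underlying distribution.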
[R] axis and tick widths decoupled (especially in rugs!)
Hi! (a complete newbie, but will not give up easily!)

I was wondering if there is any way to decouple the axis and tick mark widths? As I understand it, they are both controlled by the lwd setting and cannot be controlled independently? For example, I might want to create major and minor ticks, which I now know how to do by superimposing two axes with different at settings, but what if I also wanted the major ticks to be thicker? Or a different colour?

You might find this nitpicking, but I am particularly concerned about rug(), which passes to axis(), in that I cannot get a decent thick-lined rug without the horizontal line also becoming equally thick. Is there any way to do this without having to resort to segments?

Tnx!

--
View this message in context: http://www.nabble.com/axis-and-tick-widths-decoupled-%28especially-in-rugs%21%29-tp17068508p17068508.html
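For anyone finding this later: in R versions where axis() accepts the separate lwd.ticks and col.ticks arguments, the line and tick widths can be decoupled directly, and major/minor ticks follow from two axis() calls. A sketch (the at values are just illustrative):

```r
plot(1:10, axes = FALSE)
box()
# thick red major ticks on a thin axis line:
axis(1, at = seq(2, 10, by = 2), lwd = 1, lwd.ticks = 3, col.ticks = "red")
# thin minor ticks from a second, line-less axis call (lwd = 0):
axis(1, at = 1:10, labels = FALSE, lwd = 0, lwd.ticks = 1, tcl = -0.25)
axis(2)
```

The lwd = 0 trick in the second call suppresses the axis line entirely, so only the ticks are drawn, which is also the effect one would want from a rug.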