Re: [R] identical values not so identical? newbie help please!

2011-03-10 Thread maiya
Quite fascinating, if annoying. Nice example Petr!

Turns out my expected values are causing even more trouble because of this!
I've even gotten negative chi square values (calculated using Cressie and
Read's formula)! 

So instead of kludging the error measurement code, I think I'm going to have
to round the actual expected values. Like

exp <- round(exp, digits=10)

Are there any ethical reservations to doing this? 
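For what it's worth, a toy sketch (with made-up numbers) of what that rounding does: noise of order 1e-14 vanishes, while a genuine discrepancy survives.

```r
# hypothetical cell values: 118 plus floating-point noise, and one real half-unit error
exp <- c(118 + 2.842171e-14, 42.5)
round(exp, digits = 10)  # the noise rounds away to exactly 118; 42.5 is untouched
```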


--
View this message in context: 
http://r.789695.n4.nabble.com/identical-values-not-so-identical-newbie-help-please-tp3346078p3346880.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identical values not so identical? newbie help please!

2011-03-10 Thread maiya
Aaah, it truly is wonderful, this technology! 
I guess I'm going to have to override it a bit though..
Along the lines of

tae <- if (isTRUE(all.equal(obs, exp))) 0 else sum(abs(obs - exp))

Do I like doing this? No. But short of reading the vast literature that
exists on calculation precision - which would quite possibly result in me
ending up using the same kludge as above - this is as satisfying a solution
as I can hope for!

Thanks again guys!
Maja


--
View this message in context: 
http://r.789695.n4.nabble.com/identical-values-not-so-identical-newbie-help-please-tp3346078p3346649.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] identical values not so identical? newbie help please!

2011-03-10 Thread maiya
Thanks Josh and Dan!

I did figure it had something to do with the machine epsilon...

But so what do I do now? I'm calculating the total absolute error over
thousands of tables, e.g.:
tae <- sum(abs(obs - exp))
Is there an easy way to keep these negligible errors from showing up?

And furthermore, why does this happen only sometimes? The two (2D) tables I
attached are actually just one 'layer' in a 3D table. And only 2 out of
about 400 layers had this happen, all the other ones are identical -
perfectly! And out of 2000 3D tables, about 60 of which should have no
error, only 10 actually show an error of zero, and in the rest this same
thing happens in a few layers. 

OK, this is a bit messy for a real question. I mean I can just round down
all the errors that are under 1e-8 or something, but I'd much rather this
not happen in the first place?
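A sketch of that rounding-down idea, with the 1e-8 cutoff from the text (toy numbers; zapsmall is an alternative, but note it judges "small" relative to the largest element of the vector):

```r
obs <- c(118, 50)
exp <- c(118 + 2.842171e-14, 49)  # one noisy cell plus one real error of 1
d <- abs(obs - exp)
sum(d[d > 1e-8])     # threshold the cellwise errors: only the real error remains
sum(zapsmall(d, 8))  # zapsmall() zeroes values tiny relative to the largest
```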

Thanks again to the two posters for bothering with me!

Maja. 

--
View this message in context: 
http://r.789695.n4.nabble.com/identical-values-not-so-identical-newbie-help-please-tp3346078p3346516.html
Sent from the R help mailing list archive at Nabble.com.



[R] identical values not so identical? newbie help please!

2011-03-10 Thread maiya
Hi there! 
I'm not sure I can create a minimal example of my problem, so I'm linking to
a minimal .RData file that has only two objects, obs and exp, each a 6x9
matrix: http://dl.dropbox.com/u/10364753/test.RData
(I hope this is acceptable mailing list etiquette!)

Here's what happens:
> obs[1, 1]
[1] 118
> exp[1, 1]
[1] 118
> obs[1, 1]-exp[1, 1]
[1] 2.842171e-14

Problem is, both obs and exp should be identical. They are the result of a
saturated loglinear model, and I've run the same code across about 400
tables, all of which result in sum(obs-exp)=0, except for this one. I can't
figure it out?
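A minimal sketch (no download needed) of how two values can print as 118 yet differ: the default 7 significant digits hide noise of order 1e-14.

```r
x <- 118 + 2.842171e-14    # presumably what one of the "118" cells holds
x                          # prints 118 at the default precision
print(x, digits = 17)      # reveals the stored value is not exactly 118
x == 118                   # FALSE
isTRUE(all.equal(x, 118))  # TRUE: equal up to numerical tolerance
```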

Anyway, I need help understanding why 118 and 118 are not really the same. I
appreciate some may be wary of downloading my .RData file (I'm on Ubuntu if
that's any consolation), but I don't know how else to ask this question!

Thanks!

Maja Z.



--
View this message in context: 
http://r.789695.n4.nabble.com/identical-values-not-so-identical-newbie-help-please-tp3346078p3346078.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] plotting functions of chi square

2010-08-17 Thread maiya

OK, for the record, this is not my homework, thanks for asking!

Also, I am sure I could have phrased my question more eloquently, but (hence
the newbie qualifier) I didn't.  The code I posted was for the plot I want,
only smoothed, i.e. not based on random sampling from the distribution.

Dennis: I tried that :) but your code divides the densities by df. I want
the density of X^2/df
Rookie: Same thing as David before - I know how to plot chi squared
densities with different dfs!
David: looks great! It's only the "playing around" that is off-putting...
(sorry again for not explaining well, but illustrate I definitely did!)

Ben & William: Thank you! Jointly you managed to plot exactly what I wanted
and show me why and how so I can do it to more complicated functions!

And just to prove you guys right, here's what I really wanted to plot - but
refrained from mentioning in my original post: how by the central limit
theorem for large df chi^2 approaches normality with a mean of df and
variance of 2*df. 

d2chisq <- function(x,df) {
  dchisq(x*sqrt(2*df)+df,df)*sqrt(2*df)
}

plot(1, type="n",  xlab="", ylab="", xlim=c(-3,3), ylim=c(0,0.5))

for (i in c(5,10,50,100,200,500)){
curve(d2chisq(x,i),add=TRUE)
}
lines(seq(-3,3,.1),dnorm(seq(-3,3,.1),0,1 ), col="red")

Not bad considering I had to look up the chain rule on Wikipedia ;)

Thanks again guys!

maja. 





-- 
View this message in context: 
http://r.789695.n4.nabble.com/plotting-functions-of-chi-square-tp2329020p2329213.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] plotting functions of chi square

2010-08-17 Thread maiya

Thanks, but that wasn't what I was going for. Like I said, I know how to do a
simple chi-square density plot with dchisq(). 

What I'm trying to do is chi-square / degrees of freedom. Hence
rchisq(10, i)/i.

How do I do that with dchisq?
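For readers of the archive, the change-of-variables answer (a sketch; the helper name is mine): if X ~ chi-square(df), then Y = X/df has density f_Y(y) = df * dchisq(y * df, df), so the curves can be drawn exactly, with no random sampling.

```r
# exact density of chi-square(df)/df via change of variables
dchisq_over_df <- function(x, df) df * dchisq(x * df, df)

plot(1, type = "n", xlab = "", ylab = "", xlim = c(0, 2), ylim = c(0, 7))
for (i in c(10, 50, 100, 200, 500)) {
  curve(dchisq_over_df(x, i), add = TRUE)  # smooth curve, no sampling noise
}
```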
-- 
View this message in context: 
http://r.789695.n4.nabble.com/plotting-functions-of-chi-square-tp2329020p2329057.html
Sent from the R help mailing list archive at Nabble.com.



[R] plotting functions of chi square

2010-08-17 Thread maiya

Hi! This is going to be a real newbie question, but I can't figure it out. 

I'm trying to plot densities of various functions of chi-square. A simple
chi-square plot I can do with dchisq(). But e.g. chi.sq/degrees of freedom I
only know how to do using density(rchisq()/df). For example:

plot(1, type="n",  xlab="", ylab="", xlim=c(0,2), ylim=c(0,7))

for (i in c(10, 50, 100, 200, 500)) {
  lines(density(rchisq(1e5, i) / i))  # 100,000 draws per curve
}

But even with 100,000 samples the curves still aren't smooth. Surely there
must be a more elegant way to do this?

Thanks!

Maja
-- 
View this message in context: 
http://r.789695.n4.nabble.com/plotting-functions-of-chi-square-tp2329020p2329020.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] function to set log(0)=0 not working on tables or vectors

2010-01-18 Thread maiya

Peter, you're right about the error - I had R commander open and used the
terminal instead - this makes me miss the error messages. Not that I would
have known how to solve it had I seen it :)

And yes, ifelse does work. Not sure I understand what the difference is, but
thanks!
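The difference, in a sketch: `if` wants a single TRUE/FALSE and does not branch elementwise (older R silently used only the first element of a longer condition; recent R raises an error), while `ifelse` tests every element and returns a vector of the same length.

```r
x <- c(4, 0, 1) / 2
y <- ifelse(x == 0, 0, log(x))  # elementwise: log where nonzero, 0 at the zero
y                               # no -Inf in sight
```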

David, I had no idea god could change the laws of mathematics. If that's
true then you have to choose between believing in one or the other? Nice one!

Of course Ben is right, the convention is 0*log(0)=0 but for the purposes of
programming it in an entropy function I only need to define log(0)=0. I
apologize for not being more precise.

Thanks guys!

Maja. 



Ben Bolker wrote:
> 
> David Winsemius  comcast.net> writes:
> 
>> 
>> 
>> On Jan 17, 2010, at 8:17 PM, maiya wrote:
>> 
>> >
>> > There must be a very basic thing I am not getting...
>> >
>> > I'm working with some entropy functions and the convention is to use
>> > log(0)=0.
>> >
>> 
>> I suppose the outcome of that effort may depend on whether you have  
>> assumed the needed godlike capacities to change the laws of  
>> mathematics. But I suppose that as the Earth mother that might occur  
>> to you. Go ahead, define a new mathematics.
> 
>   My guess is that the real intention here is 
> to define 0*log(0) = 0 rather than log(0) = 0 -- 
> really the assertion is that lim(x -> 0) x log(x) = 0,
> which must be true for some reasonable limiting conditions.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://n4.nabble.com/function-to-set-log-0-0-not-working-on-tables-or-vectors-tp1016278p1016724.html
Sent from the R help mailing list archive at Nabble.com.



[R] function to set log(0)=0 not working on tables or vectors

2010-01-17 Thread maiya

There must be a very basic thing I am not getting...

I'm working with some entropy functions and the convention is to use
log(0)=0.

So I wrote a function:

llog<-function(x){
if (x ==0) 0 else log(x)
}

which seems to work fine for individual numbers e.g.

>llog(0/2)
[1] 0

but if I try whole vectors or tables:

p<-c(4,3,1,0)
q<-c(2,2,2,2)
llog(p/q)

I get this:

[1]  0.6931472  0.4054651 -0.6931472   -Inf


What am I missing?
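The fix that emerged in the replies above, sketched here: swap the scalar `if` for the vectorized `ifelse`.

```r
# vectorized version: each element is tested, so zeros map to 0
llog <- function(x) ifelse(x == 0, 0, log(x))

p <- c(4, 3, 1, 0)
q <- c(2, 2, 2, 2)
llog(p / q)  # the last element is now 0 rather than -Inf
```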

Thanks!

Maja
-- 
View this message in context: 
http://n4.nabble.com/function-to-set-log-0-0-not-working-on-tables-or-vectors-tp1016278p1016278.html
Sent from the R help mailing list archive at Nabble.com.



[R] count number of empty cells in a table/matrix/data.frame

2009-12-03 Thread maiya

Hi everyone!

This is a ridiculously simple problem, I just can't seem to find the
solution!

All I need is something equivalent to 

sum(is.na(x))

but instead of counting missing values, to count empty cells (with a value
of 0).

A naive attempt with is.empty didn't work :)

Thanks!

Maja

Oh, and if the proposed solution would be to make all the empty cells into
missing cells, that is not an option! There are over 20,000,000 cells in my
table, and I don't think my computer is in the mood to store two such
objects!
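For the record, a sketch of the usual answer: `==` is vectorized, so the zero count is the direct analogue of `sum(is.na(x))` (it does allocate one temporary logical mask the size of the table, but not a second copy of the data).

```r
x <- matrix(c(0, 1, 2, 0, 3, 0), nrow = 2)
sum(x == 0)  # counts the empty (zero) cells, here 3
```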
-- 
View this message in context: 
http://n4.nabble.com/count-number-of-empty-cells-in-a-table-matrix-data-frame-tp947740p947740.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Error: cannot allocate vector of size...

2009-11-10 Thread maiya

Cool! Thanks for the sampling and ff tips! I think I've figured it out now
using sampling...

I'm getting a quad-core, 4GB RAM computer next week, will try it again using
a 64 bit version :)

Thanks for your time!!!

Maja 



tlumley wrote:
> 
> On Tue, 10 Nov 2009, maiya wrote:
> 
>>
>> OK, it's the simple math that's confusing me :)
>>
>> So you're saying 2.4GB, while Windows sees the file as 700MB. Why is that
>> different?
> 
> Your data are stored on disk as a text file (in CSV format, in fact), not
> as numbers. This can take up less space.
> 
>> And lets say I could potentially live with e.g. 1/3 of the cases - that
>> would make it .8GB, which should be fine? But then my question is if
>> there
>> is any way to sample the rows in read.table? Or what would be the best
>> way
>> of importing a random third of my cases?
> 
> A better solution is probably to read a subset of the columns at a time. 
> The easiest way to do this is probably to read the data into a SQLite
> database with the 'sqldf' package, but another solution is to use the
> colClasses= argument to read.table() and specify "NULL" for the classes of
> the columns you don't want to read. There are other ways as well.
> 
> It might even be faster to do the cross-tabulations in a database and read
> the resulting summaries into R to compute any statistics you need.
> 
>> Thanks!
>>
>> M.
>>
>>
>>
>> jholtman wrote:
>>> 
>>> A little simple math.  You have 3M rows with 100 items on each row.
>>> If read in this would be 300M items.  If numeric, 8 bytes/item, this
>>> is 2.4GB.  Given that you are probably using a 32 bit version of R,
>>> you are probably out of luck.  A rule of thumb is that your largest
>>> object should consume at most 25% of your memory since you will
>>> probably be making copies as part of your processing.
>>> 
>>> Given that, if you want to read in 100 variables at a time, I would
>>> say your limit would be about 500K rows to be reasonable.  So you have
>>> a choice; read in fewer rows, read in all 3M rows but at 20 columns
>>> per read, put the data in a database and extract what you need.
>>> Unless you go to a 64-bit version of R you will probably not be able
>>> to have the whole file in memory at one time.
>>> 
>>> On Tue, Nov 10, 2009 at 7:10 AM, maiya  wrote:
>>>>
>>>> I'm trying to import a table into R the file is about 700MB. Here's my
>>>> first
>>>> try:
>>>>
>>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>>>
>>>> Error: cannot allocate vector of size 15.6 Mb
>>>> In addition: Warning messages:
>>>> 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>>>  :
>>>>  Reached total allocation of 1535Mb: see help(memory.size)
>>>> 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>>>  :
>>>>  Reached total allocation of 1535Mb: see help(memory.size)
>>>> 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>>>  :
>>>>  Reached total allocation of 1535Mb: see help(memory.size)
>>>> 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>>>  :
>>>>  Reached total allocation of 1535Mb: see help(memory.size)
>>>>
>>>> Then I tried
>>>>
>>>>> memory.limit(size=4095)
>>>>  and got
>>>>
>>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>>> Error: cannot allocate vector of size 11.3 Mb
>>>>
>>>> but no additional errors. Then optimistically to clear up the
>>>> workspace:
>>>>
>>>>> rm()
>>>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>>> Error: cannot allocate vector of size 15.6 Mb
>>>>
>>>> Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb,
>>>> 11.3Mb?
>>>> I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
>>>> memory is usually 2Gb. Surely they mean GB?
>>>>
>>>> The file I'm importing has about 3 million cases with 100 variables
>>>> that
>>>> I
>>>> want to crosstabulate each with each. Is this completely unrealistic?
>>>>
>>>> Thanks!
>>>>
>>>&

Re: [R] Error: cannot allocate vector of size...

2009-11-10 Thread maiya

OK, it's the simple math that's confusing me :)

So you're saying 2.4GB, while Windows sees the file as 700MB. Why is that
different?

And let's say I could potentially live with e.g. 1/3 of the cases - that
would make it 0.8GB, which should be fine? But then my question is whether
there is any way to sample the rows in read.table? Or what would be the best
way of importing a random third of my cases?
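One sketch of sampling rows at import time (a toy temp file stands in for the real data; note the raw lines still pass through memory once as text, which is cheaper than parsing everything but not free):

```r
# toy stand-in for the real file
tf <- tempfile(fileext = ".dat")
write.table(data.frame(a = 1:9, b = 91:99), tf, row.names = FALSE)

# keep the header, sample a third of the data lines, parse only those
all_lines <- readLines(tf)
header    <- all_lines[1]
body      <- all_lines[-1]
keep      <- sort(sample(length(body), length(body) %/% 3))
DD <- read.table(text = c(header, body[keep]), header = TRUE)
nrow(DD)  # 3 of the 9 rows survive
```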

Thanks!

M.



jholtman wrote:
> 
> A little simple math.  You have 3M rows with 100 items on each row.
> If read in this would be 300M items.  If numeric, 8 bytes/item, this
> is 2.4GB.  Given that you are probably using a 32 bit version of R,
> you are probably out of luck.  A rule of thumb is that your largest
> object should consume at most 25% of your memory since you will
> probably be making copies as part of your processing.
> 
> Given that, if you want to read in 100 variables at a time, I would
> say your limit would be about 500K rows to be reasonable.  So you have
> a choice; read in fewer rows, read in all 3M rows but at 20 columns
> per read, put the data in a database and extract what you need.
> Unless you go to a 64-bit version of R you will probably not be able
> to have the whole file in memory at one time.
> 
> On Tue, Nov 10, 2009 at 7:10 AM, maiya  wrote:
>>
>> I'm trying to import a table into R the file is about 700MB. Here's my
>> first
>> try:
>>
>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>>
>> Error: cannot allocate vector of size 15.6 Mb
>> In addition: Warning messages:
>> 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>  :
>>  Reached total allocation of 1535Mb: see help(memory.size)
>> 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>  :
>>  Reached total allocation of 1535Mb: see help(memory.size)
>> 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>  :
>>  Reached total allocation of 1535Mb: see help(memory.size)
>> 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
>>  :
>>  Reached total allocation of 1535Mb: see help(memory.size)
>>
>> Then I tried
>>
>>> memory.limit(size=4095)
>>  and got
>>
>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>> Error: cannot allocate vector of size 11.3 Mb
>>
>> but no additional errors. Then optimistically to clear up the workspace:
>>
>>> rm()
>>> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
>> Error: cannot allocate vector of size 15.6 Mb
>>
>> Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb?
>> I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
>> memory is usually 2Gb. Surely they mean GB?
>>
>> The file I'm importing has about 3 million cases with 100 variables that
>> I
>> want to crosstabulate each with each. Is this completely unrealistic?
>>
>> Thanks!
>>
>> Maja
>> --
>> View this message in context:
>> http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem that you are trying to solve?
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html
Sent from the R help mailing list archive at Nabble.com.



[R] Error: cannot allocate vector of size...

2009-11-10 Thread maiya

I'm trying to import a table into R the file is about 700MB. Here's my first
try:

> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)

Error: cannot allocate vector of size 15.6 Mb
In addition: Warning messages:
1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  Reached total allocation of 1535Mb: see help(memory.size)
2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  Reached total allocation of 1535Mb: see help(memory.size)
3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  Reached total allocation of 1535Mb: see help(memory.size)
4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  Reached total allocation of 1535Mb: see help(memory.size)

Then I tried 

> memory.limit(size=4095)
 and got 

> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
Error: cannot allocate vector of size 11.3 Mb

but no additional errors. Then optimistically to clear up the workspace:

> rm()
> DD<-read.table("01uklicsam-20070301.dat",header=TRUE)
Error: cannot allocate vector of size 15.6 Mb

Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb?
I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
memory is usually 2Gb. Surely they mean GB?

The file I'm importing has about 3 million cases with 100 variables that I
want to crosstabulate each with each. Is this completely unrealistic?
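A sketch of the column-subset route suggested in the replies (a toy temp file stands in for the real one): colClasses = "NULL" makes read.table skip a column entirely, so skipped columns never occupy memory.

```r
tf <- tempfile()
write.table(data.frame(a = 1:3, b = 4:6, c = 7:9), tf, row.names = FALSE)

# "NULL" drops the middle column during the read itself
DD <- read.table(tf, header = TRUE,
                 colClasses = c("integer", "NULL", "integer"))
names(DD)  # only "a" and "c" come in
```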

Thanks!

Maja
-- 
View this message in context: 
http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] stars (as fourfold plots) in plot (symbols don't work)

2009-06-08 Thread maiya

I was feeling pretty silly when I saw there was actually a locations
parameter in stars, as well as axes etc. 

But now the problem is that the x and y axes in stars must be on the same
scale! Which unfortunately makes my data occupy only a very narrow band of
the plot. I guess one option would be to scale one set of coordinates and
then manually change the axis labels!? But  I'll have a look at my.symbols
first.

Thanks for the tip!

Maja




Greg Snow-2 wrote:
> 
> Here are 2 useful paths (it's up to you to decide if either is the right
> path).
> 
> The my.symbols function in the TeachingDemos package allows you to create
> your own functions to create the symbols.
> 
> But in this case, you can just use the locations argument to the stars
> function:
> 
>> stars(cbind(1,sqrt(test[,3]), 1, sqrt(test[,3]))/16,
>> locations=test[,1:2],
> + col.segments=c("gray90", "gray"),draw.segments=TRUE, scale=FALSE)
> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
> 
> 
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
>> project.org] On Behalf Of maiya
>> Sent: Saturday, June 06, 2009 4:03 PM
>> To: r-help@r-project.org
>> Subject: [R] stars (as fourfold plots) in plot (symbols don't work)
>> 
>> 
>> Hi!
>> 
>> I have a dataset with three columns -the first two refer to x and y
>> coordinates, the last one are odds ratios.
>> I'd like to plot the data with x and y coordinates and the odds ratio
>> shown
>> as a fourfold plot, which I prefer to do using the stars function.
>> 
>> Unfortunately the stars option in symbols is not as cool as the stars
>> function on its own, and now i can't figure out how to do it!
>> 
>> here's an example code:
>> #data
>> test<-cbind(c(1,2,3,4), c(1,2,3,4), c(2,4,8,16))
>> #this is what I want the star symbol to look like
>> stars(cbind(1,sqrt(test[1,3]), 1, sqrt(test[1,3])),
>> col.segments=c("gray90", "gray"),draw.segments=TRUE, scale=FALSE)
>> #this is what happens when using stars in symbols
>> symbols(test[,1], test[,2], stars=cbind(1,sqrt(test[,3]), 1,
>> sqrt(test[,3])))
>> 
>> Can anyone set me on the right path please?
>> 
>> Maja
>> --
>> View this message in context: http://www.nabble.com/stars-%28as-
>> fourfold-plots%29-in-plot-%28symbols-don%27t-work%29-
>> tp23905987p23905987.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/stars-%28as-fourfold-plots%29-in-plot-%28symbols-don%27t-work%29-tp23905987p23933876.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] ridiculous behaviour printing to eps: labels all messed

2009-06-08 Thread maiya

Wow! Thank you for that Ted, a wonderfully comprehensive explanation and now
everything makes perfect sense!!

Regarding your last point, I would love to hear other people's experience. I
myself, as a complete newbie in both R and LaTeX, am perhaps not the best
judge... But there are several graphics packages that can be used directly
in LaTeX to do what you propose (the LaTeX Graphics Companion that I own is
about 1000 pages worth of material to help you not be able to make up your
mind..).

I have found postscript the easiest and most intuitive and you can write
postscript graphics directly in LaTeX using the pstricks package. So yes,
you are right, I could just use the data from R directly (and I hope that
when I become a dinosaur I will be able to create graphs just as beautiful
as yours!).

But there are R plots that I would rather not attempt to code myself, in
particular mosaic plots, so I prefer to import them from R as eps files and
then use psfrag to get the nice LaTeX typesetting for the labels, equations
etc. to make it "fit" visually.

But then with the sheer volume of options, figuring out the optimal
combination for a particular application, or whether the learning curve is
worth it, is always going to be a problem..
I guess for all that evolution with nothing ever going extinct, we will each
end up a very individual fossil.

Maja


-- 
View this message in context: 
http://www.nabble.com/ridiculous-behaviour-printing-to-eps%3A-labels-all-messed-up%21-tp23916638p23932656.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] ridiculous behaviour printing to eps: labels all messed up!

2009-06-08 Thread maiya

Solution!!

Peter, that seems to do the trick!

dev.copy2eps(file="test.eps", useKerning=FALSE)

correctly places the labels without splitting them!
the same also works with postscript() of course.

I also found another thread where this was solved:
http://www.nabble.com/postscript-printer-breaking-up-long-strings-td23322197.html
- sorry for duplicating threads. Apparently it is considered a feature
of the printer! I don't understand the string width calculation rationale,
and I hope it doesn't cause other problems along the way, but it's looking
good for now!

As for Zeljko... nice one :) of course I thought of it, but I do have more
than 26 labels, and furthermore this just annoyed me so much, I had to
figure it out! Thanks anyway!

Thanks guys!

Maja

-- 
View this message in context: 
http://www.nabble.com/ridiculous-behaviour-printing-to-eps%3A-labels-all-messed-up%21-tp23916638p23922203.html
Sent from the R help mailing list archive at Nabble.com.



[R] ridiculous behaviour printing to eps: labels all messed up!

2009-06-07 Thread maiya

OK, this is really weird!

here's an example code:

t1<-c(1,2,3,4)
t2<-c(4,2,4,2)
plot(t1~t2, xlab="exp1", ylab="exp2")
dev.copy2eps(file="test.eps")

that all seems fine...

until you look at the eps file created, where for some weird reason, if you
scroll down to the end, the code reads:

/Font1 findfont 12 s
0 setgray
214.02 18.72 (e) 0 ta
-0.360 (xp1) tb gr
12.96 206.44 (e) 90 ta
-0.360 (xp2) tb gr

Which means, that the labels "exp1" and "exp2" get split up!?!? 
Now visually that doesn't matter, but I use the labels to refer to them in
LaTeX using psfrag, so I have to know exactly what they are called in the
.eps file in order to reference them correctly. 

I've tried other labels and the splitting up seems completely random, i.e.
it doesn't have anything to do with the length of the label etc.

I am completely lost here, can someone help me figure out what is going on
here?

Maja

-- 
View this message in context: 
http://www.nabble.com/ridiculous-behaviour-printing-to-eps%3A-labels-all-messed-up%21-tp23916638p23916638.html
Sent from the R help mailing list archive at Nabble.com.



[R] stars (as fourfold plots) in plot (symbols don't work)

2009-06-06 Thread maiya

Hi!

I have a dataset with three columns -the first two refer to x and y
coordinates, the last one are odds ratios.
I'd like to plot the data with x and y coordinates and the odds ratio shown
as a fourfold plot, which I prefer to do using the stars function. 

Unfortunately the stars option in symbols is not as cool as the stars
function on its own, and now I can't figure out how to do it!

here's an example code:
#data
test<-cbind(c(1,2,3,4), c(1,2,3,4), c(2,4,8,16))
#this is what I want the star symbol to look like
stars(cbind(1,sqrt(test[1,3]), 1, sqrt(test[1,3])),
col.segments=c("gray90", "gray"),draw.segments=TRUE, scale=FALSE)
#this is what happens when using stars in symbols
symbols(test[,1], test[,2], stars=cbind(1,sqrt(test[,3]), 1,
sqrt(test[,3])))

Can anyone set me on the right path please?

Maja
-- 
View this message in context: 
http://www.nabble.com/stars-%28as-fourfold-plots%29-in-plot-%28symbols-don%27t-work%29-tp23905987p23905987.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] indicator or deviation contrasts in log-linear modelling

2009-02-18 Thread maiya

I realise that in the case of loglin the parameters are calculated post
festum from the cell frequencies; however, other programs that use
Newton-Raphson as opposed to IPF work the other way round, right?
In which case one would expect the parameter output to be limited to the
particular contrast used. But since loglin uses IPF, I would have thought the
choice of style of parameters to output could be made...
Anyway, this is the line that interests me:

>   lm( as.vector( loglin(...,fit=TRUE)$fit ) ~ < your favored contrasts > )

only I'm not proficient enough in R to figure out the last term :(
How would I go about this, then, if my preferred contrast is setting the first
categories as reference categories?

I literally just need the equivalent of

loglin(matrix(c(1,2,3,4), nrow=2), list(c(1,2)), param=TRUE)

which would give me parameters under indicator contrasts. glm... well, I'd
have to work on it.
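For what it's worth, here is a sketch of how I understand the glm() route would go, assuming R's default contr.treatment contrasts (which set the first level of each factor to zero, i.e. indicator/"dummy" coding). The loglin margin list(c(1,2)) is the saturated model for a 2 x 2 table, which in glm() terms is a Poisson model with the full interaction:

```r
# Sketch (assumes default treatment contrasts): fit the saturated
# log-linear model for a 2 x 2 table as a Poisson GLM.  The
# coefficients are then indicator-contrast parameters, with the
# first category of each factor as the reference.
tab <- matrix(c(1, 2, 3, 4), nrow = 2)
df  <- as.data.frame.table(tab)               # columns Var1, Var2, Freq
fit <- glm(Freq ~ Var1 * Var2, family = poisson, data = df)
coef(fit)
# A saturated model reproduces the table exactly, so here
# exp(coef(fit)) recovers 1, 2, 3 and 2/3:
#   mu_11 = 1,  mu_21/mu_11 = 2,  mu_12/mu_11 = 3,
#   (mu_11 * mu_22) / (mu_21 * mu_12) = 4/(2*3) = 2/3.
```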

Regarding the more general points:

ad 2) I would have thought that direct inspection of cell frequencies is
precisely the wrong/misleading thing to do - the highest order coefficients
can be inspected directly in order to see the interaction without the
(lower) marginal effects, or alternatively the table can be standardized to
uniform margins for the same sort of inspection.

ad 3) and yes, I figured as much! I can't see how lower-order terms can be
interpreted at all if higher-order interactions exist. I've seen it done,
e.g. I've seen it claimed that in a standardized table the lower-order terms
are all equal to zero, which is of course not true?

Thanks!
Maja





[R] indicator or deviation contrasts in log-linear modelling

2009-02-18 Thread maiya

I am fairly new to log-linear modelling, so as opposed to trying to fit
models, I am still trying to figure out how it actually works - hence I am
looking at the interpretation of parameters. Now it seems most people skip
this part and go directly to measuring model fit, so I am finding very few
references to actual parameters, and am of course clear on the fact that
their choice is irrelevant for the actual model fit.

But here is my question: loglin uses deviation contrasts, so the
coefficients in each term add up to zero.
Another option is indicator contrasts, where a reference category is chosen
in each term and set to zero, while the others are relative to it. My
question is whether there is a log-linear command equivalent to loglin that
uses this second "dummy coding" style of constraints (I know e.g. SPSS
GENLOG does this).

I hope this is not too basic a question!

And if anyone is up for answering the wider question of why log-linear
parameters are not something to be looked at - which might just be my
impression of the literature - feel free to comment!

Thanks for your help!

Maja


Re: [R] disaggregate frequency table into flat file

2008-05-22 Thread maiya

Marc, it's the second "expansion" type transformation I was after, although
your expand.dft looks quite complicated? Here's what I finally came up with -
the bold lines correspond to what expand.dft does?


> orig <- matrix(c(40, 5, 30, 25), c(2, 2))
> orig
     [,1] [,2]
[1,]   40   30
[2,]    5   25
> flat <- as.data.frame.table(orig)
> ind <- rep(1:nrow(flat), times = flat$Freq)
> flat <- flat[ind, -3]
> sample <- matrix(table(flat[sample(1:length(ind), 10), ]), c(2, 2))
> sample
     [,1] [,2]
[1,]    4    2
[2,]    1    3

So I get from the orig matrix to the sample matrix, expanding and
contracting it back in between!

It's just that I was hoping there was a more direct way of doing it!
Thanks! 
maja



Re: [R] disaggregate frequency table into flat file

2008-05-22 Thread maiya

Sorry, my mistake!
The data frame should read:
orig<-as.data.frame.table(orig)
orig
  Var1 Var2 Freq
1    A    A   40
2    B    A    5
3    A    B   30
4    B    B   25

But basically I would simply like a sample of the original matrix (which is
a frequency table / contingency table / cross-tabulation).

Hope this is clearer now!

maja

jholtman wrote:
> 
> Not exactly clear what you are asking for.  Your data.frame.table does not
> seem related to the original 'orig'.  What exactly are you expecting as
> output?
> 
> On Wed, May 21, 2008 at 10:16 PM, maiya <[EMAIL PROTECTED]> wrote:
> 
>>
>> I apologise for the triviality of this post - but I've been searching the
>> forum without luck - probably simply because it's late and my brain is
>> starting to go...
>>
>> i have a frequency table as a matrix:
>>
>> orig<-matrix(c(40,5,30,25), c(2,2))
>> orig
>>      [,1] [,2]
>> [1,]   40   30
>> [2,]    5   25
>>
>> i basically need a random sample say 10 from 100:
>>
>>      [,1] [,2]
>> [1,]    5    2
>> [2,]    0    3
>>
>> i got as far as
>>
>> orig<-as.data.frame.table(orig)
>> orig
>>   Var1 Var2 Freq
>> 1    A    A   10
>> 2    B    A    5
>> 3    A    B   30
>> 4    B    B   25
>>
>> and then perhaps
>>
>> individ<-rep(1:4, times=orig$Freq)
>>
>> which gives a vector of the 100 individuals in each of the 4 groups -
>> cells, but I'm
>> (a) stuck here and
>> (b) afraid this is a very roundabout way of getting to what I want, i.e.
>> I can now sample(individ, 10), but then I'll have a heck of a time
>> getting the result back into the original matrix form
>>
>> sorry again, just please tell me the simple solution that I've missed?
>>
>> thanks!
>>
>> maja
>>
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem you are trying to solve?
> 
> 
> 



[R] disaggregate frequency table into flat file

2008-05-22 Thread maiya

I apologise for the triviality of this post - but I've been searching the
forum without luck - probably simply because it's late and my brain is
starting to go...

i have a frequency table as a matrix:

orig<-matrix(c(40,5,30,25), c(2,2))
orig
     [,1] [,2]
[1,]   40   30
[2,]    5   25

I basically need a random sample, say 10 from 100:

     [,1] [,2]
[1,]    5    2
[2,]    0    3

I got as far as

orig<-as.data.frame.table(orig)
orig
  Var1 Var2 Freq
1    A    A   10
2    B    A    5
3    A    B   30
4    B    B   25

and then perhaps

individ<-rep(1:4, times=orig$Freq)

which gives a vector of the 100 individuals in each of the 4 groups (cells),
but I'm
(a) stuck here, and
(b) afraid this is a very roundabout way of getting to what I want, i.e. I
can now sample(individ, 10), but then I'll have a heck of a time getting the
result back into the original matrix form.

Sorry again - please just tell me the simple solution that I've missed!

thanks!

maja 



[R] axis and tick widths decoupled (especially in rugs!)

2008-05-05 Thread maiya

Hi!

(a complete newbie, but I will not give up easily!)

I was wondering if there is any way to decouple the axis and tick mark
widths? As I understand it, they are both controlled by the lwd setting and
cannot be controlled independently? For example, I might want to create major
and minor ticks, which I now know how to do by superimposing two axes with
different at settings, but what if I also wanted the major ticks to be
thicker, or a different colour?
You might find this nitpicking, but I am particularly concerned about rug(),
which passes on to axis(): I cannot get a decent thick-lined rug without the
horizontal line also becoming equally thick.
Is there any way to do this without having to resort to segments()?
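For later readers: in more recent versions of R (I believe since around 2.5.0 - an assumption worth checking against your version), axis() takes separate `lwd` (axis line) and `lwd.ticks` / `col.ticks` (tick) arguments, and rug() has its own `lwd` argument that affects only the tick lines. A sketch of how the two widths can be decoupled:

```r
# Sketch (assumes axis() supports 'lwd.ticks', i.e. a reasonably
# recent R): draw the axis line plus thin minor ticks once, then
# overlay thicker, coloured major ticks with no axis line (lwd = 0).
plot(1:10, xaxt = "n")
axis(1, at = 1:10, lwd = 1, lwd.ticks = 1)        # line + minor ticks
axis(1, at = seq(2, 10, 2), labels = FALSE,
     lwd = 0, lwd.ticks = 3, col.ticks = "red")   # thick major ticks only
rug(jitter(1:10), lwd = 2)                        # rug 'lwd' widens ticks only
```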

Tnx!