from:"Federico Calboli"

[R] data formatting: from rows to columns

2007-08-28 Thread Federico Calboli

Hi All,

I have some data I need to write as a file from R to use in a different 
program. 
My data comes as a numeric matrix of n rows and 2 colums, I need to transform 
each row as a two rows 1 col output, and separate the output of each row with a 
blanck line.

Foe instance I need to go from this:

  V2  V3
27  2032567  19
28  2035482  19
126 2472826  19
132 2473320  19
136 2035480 135
145 2062458 135
148 2074927 135
151 2102395 142
156 2027252 142
158 2473082 142

to

2032567
19

2035482
19

2472826
19

2473320
19

2035480
135

...

Any hint? I seem a bit stuck. cat(unlist(data), file ='data.txt', sep = '\n') 
(obviously) does not work...

Cheers,

Fede






-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] rational or float?

2007-08-08 Thread Federico Calboli

Hi All,

I am using the function fractions() and cognates from MASS. I would  
like to be able to tell if some calculations I am doing on some  
rationals are transformed into floats and then retransformed into  
rationals.

For instance I suspect that

as.fractions(1/8) + as.fractions(1/4)

does transform into floats and back, while I know

1/as.fractions(8) + 1/as.fractions(4)

does not. Since I am using sum() and doing a number of  
multiplications I would like to know so I can intervene.

For instance, what's happening here?

 > pun
  [,1] [,2]
[1,]11
[2,]44
 > library(MASS)
 > pun[1,]/as.fractions(pun[2,])
[1] 1/4 1/4
 > sum(pun[1,]/as.fractions(pun[2,]))
[1] 1/2

Best,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] EM unsupervised clustering

2007-07-18 Thread Federico Calboli

Dimitris Rizopoulos wrote:
> you could also have a look at function lca() from package `e1071' that 
> performs a latent class analysis, e.g.,
> 
> fit1 <- lca(data, 2)

I tried but I got:

 > lca(data, 2)
Error in matrix(0, 2^nvar, nvar) : matrix: invalid 'nrow' value (too large or 
NA)
In addition: Warning message:
NAs introduced by coercion in: matrix(0, 2^nvar, nvar)

and dim(mat) is 110 and 109.

I am puzzled.

Cheers,

Fede


-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] EM unsupervised clustering

2007-07-18 Thread Federico Calboli

Hi All,

I have a  n x m matrix. The n rows are individuals, the m columns are variables.

The matrix is in itself a collection of 1s (if a variable is observed for an 
individual), and 0s (is there is no observation).

Something like:

  [,1] [,2] [,3] [,4] [,5] [,6]
[1,]101100
[2,]101100
[3,]101100
[4,]010000
[5,]101100
[6,]010010


I use kmeans to find 2 or 3 clusters in this matrix

k2 = kmeans(data, 2, 1000)
k3 = kmeans(data, 3, 1000)

but I would like to use something a bit more refined, so I though about a EM 
based clustering. I am using the Mclust() function from the mclust package, but 
I get the following (to me incomprehensible) error message:

plot(Mclust(as.data.frame(data)), as.data.frame(data))
Hit  to see next plot:
Hit  to see next plot:
Hit  to see next plot:
Error in 1:L : NA/NaN argument
In addition: Warning messages:
1: best model occurs at the min or max # of components considered in: 
summary.mclustBIC(Bic, data, G = G, modelNames = modelNames)
2: optimal number of clusters occurs at min choice in: 
Mclust(as.data.frame(anc.st.mat))
3: insufficient input for specified plot in: coordProj(data = data, parameters 
= 
x$parameters, z = x$z, what = "classification",

That's puzzling because the example given by ?Mclust is something like

plot(Mclust(iris[,-5]), iris[,-5])

which is pretty simple and dumbproof and works flawlessly...

best,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] fisher information matrix

2007-06-26 Thread Federico Calboli

Hi All,

a colleague wants to calculate the Fisher information matrix for a model he 
wrote (not in R). He can easily get the neg-log-likelihood and the best fit 
parameters at the minimum. He can also get negLLs for other parameter values 
too.

Given these data, is there a way in R to calculate the Fisher information 
matrix?

Best,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fractional calculations

2007-06-25 Thread Federico Calboli

Henrique Dallazuanna wrote:
> require(MASS)
> ?as.fractions
> as.fractions(1/2+1/8)

I thought that as.fractions did transform the fractions *first* into floats and 
*then* found the rational approssimation (a passage I'd rather avoid):

fractionspackage:MASSR Documentation

Rational Approximation

Description:

  Find rational approximations to the components of a real numeric
  object using a standard continued fraction method.

/F
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] fractional calculations

2007-06-25 Thread Federico Calboli

Hi All,

is there a function in R that allows me to work with fractions without 
transforming them to floats (or whatever) in between?

Something that would calculate something like:

(1/2 + 1/8) * 1/2 = 5/16

without ever transforming to 0.5 and 0.125?

Best,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] working with fractions

2007-06-21 Thread Federico Calboli

Muhammad Subianto wrote:

>  > library(MASS)
>  > as.fractions(c(0, 0.15, 0.827, .06, 0.266))
> [1] 0  3/20 62/75  1/15  4/15

Seems to make things a bit too slow, even though I get a good increase in 
precision.

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] working with fractions

2007-06-21 Thread Federico Calboli

Hi All,

I am writing a fucntion where I would like to use fractions for all the 
(numerous) passages.

Is there a way of creating an environment *within a fucntion* so that all the 
numbers/calculations are fractions?

Best,

Fede


-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] non permanent change of vector values

2007-06-20 Thread Federico Calboli

Hi All,

I have the following problem: I have a vector

x = rep(0,15)
x[1:2] = 1
x
  [1] 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0

I need to be able to call that vector 'x' so that if condition 'A' is true, 
only 
the first value is kept 'as is' and all the others are put to 0

if(A == T)

function(x) with x returning 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0

and if 'A' is false the second value is kept 'as is' and all the others are put 
to 0

if(A == F)

function(x) with x returning 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0

BUT, and that's the rub, I need x to changed in a *non permanent* way, so that 
at the end x is still

x
  [1] 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0

(that is because condition 'A' might be called again and could be different in 
it's T/F state from previous calls).

Any ideas?

Cheers,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ESS function highlighting

2007-04-18 Thread Federico Calboli

Hi,

is there a way of telling Emacs + ESS to show words that are already  
a function in R (such as 'length') is a different colour/font?

Best,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] plotting symbol

2007-03-23 Thread Federico Calboli

Hi All,

can I have a plot where the symbol for the dots is smaller than pch  
=20 but bigger than pch = '.'?

Best,

Fede

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] memory management uestion [Broadcast]

2007-02-20 Thread Federico Calboli

Charles C. Berry wrote:

> 
> 
> This is a bit different than your original post (where it appeared that 
> you were manipulating one row of a matrix at a time), but the issue is 
> the same.
> 
> As suggested in my earlier email this looks like a caching issue, and 
> this is not peculiar to R.
> 
> Viz.
> 
> "Most modern CPUs are so fast that for most program workloads the 
> locality of reference of memory accesses, and the efficiency of the 
> caching and memory transfer between different levels of the hierarchy, 
> is the practical limitation on processing speed. As a result, the CPU 
> spends much of its time idling, waiting for memory I/O to complete."
> 
> (from http://en.wikipedia.org/wiki/Memory_hierarchy)
> 
> 
> The computation you have is challenging to your cache, and the effect of 
> dropping unused columns of your 'data' object by assiging the columns 
> used  to 'srow' and 'drow' has lightened the load.
> 
> If you do not know why SAXPY and friends are written as they are, a 
> little bit of study will be rewarded by a much better understanding of 
> these issues. I think Golub and Van Loan's 'Matrix Computations' touches 
> on this (but I do not have my copy close to hand to check).

Thanks for the clarifications. My bottom line is, I prefer dummy variables 
because they allow me to write cleaner code, with a shorter line for the same 
instruction i.e. less chances of creeping errors (+ turning into -, etc).

I've been challenged that that's memory inefficent, and I wanted to have the 
opinion of people with more experience than mine on the matter.

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] memory management uestion [Broadcast]

2007-02-20 Thread Federico Calboli

Liaw, Andy wrote:
> I don't see why making copies of the columns you need inside the loop is
> "better" memory management.  If the data are in a matrix, accessing
> elements is quite fast.  If you're worrying about speed of that, do what
> Charles suggest: work with the transpose so that you are accessing
> elements in the same column in each iteration of the loop.

As I said, this is pretty academic, I am not looking for how to do something 
differetly.

Having said that, let me present this code:

for(i in gp){
new[i,1] = ifelse(srow[i]>0, new[srow[i],zippo[i]], sav[i])
new[i,2] = ifelse(drow[i]>0, new[drow[i],zappo[i]], sav[i])
  }

where gp is large vector and srow and drow are the dummy variables for:

srow = data[,2]
drow = data[,4]

If instead of the dummy variable I access the array directly (and its' a 60 
x 6 array) the loop takes 2/3 days --not sure here, I killed it after 48 hours.

If I use dummy variables the code runs in 10 minutes-ish.

Comments?

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] memory management uestion

2007-02-19 Thread Federico Calboli

Charles C. Berry wrote:

> Whoa! You are accessing one ROW at a time.
> 
> Either way this will tangle up your cache if you have many rows and 
> columns in your orignal data.
> 
> You might do better to do
> 
> Y <- t( X ) ### use '<-' !
> 
> for (i in whatever ){
> do something using Y[ , i ]
> }

My question is NOT how to write the fastest code, it is whether dummy variables 
(for lack of better words) make the memory management better, i.e. faster, or 
not.

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] memory management uestion

2007-02-19 Thread Federico Calboli

Hi All,

I would like to ask the following.

I have an array of data in an objetct, let's say X.

I need to use a for loop on the elements of one or more columns of X and I am 
having a debate with a colleague about the best memory management.

I believe that if I do:

col1 = X[,1]
col2 = X[,2]
...
colx = X[,x]


and then

for(i in whatever){
do something using col1[i], col2[i] ... colx[i]
}

my memory management is better that doing:

for(i in whatever){
do something using X[i,1], X[i,2] ... X[,x]
}

BTW, here I *have to* use a for() loop an no nifty tapply, lapply and family.

Any comment is welcome.

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] spss file import

2007-02-12 Thread Federico Calboli

Ndoye Souleymane wrote:
> Hi,
> 
> Let me suggest you to save your spss file in txt and use the read.table 
> function to load your file in R.
> That is what I use to do.

The problem here is that the files are old data that were made with an ancient 
version of spss and cannot be changed to txt (or anything else) now.

It looks like read.spss() does not import them in whatever .sl3 is.

Best

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] spss file import

2007-02-07 Thread Federico Calboli

Hi All,

does anyone ever import old SPSS files in a sl3 format?

read.spss('file.sl3') does not seem to work... it's not recognised as  
a supported SPSS format at all.

Best,

Fede

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] memory management

2006-10-30 Thread Federico Calboli

Hi All,

just a quick (?) question while I wait my code runs...

I'm comparing the identity of the lines of a dataframe, doing all possible 
pairwise comparisons. In doing so I use identical(), but that's by the way. I'm 
doing a (not so) quick and dirty check, and subsetting the data as

data[row.numb,]

and

data[a different row,]

I suspect the problem there is that I load into memory the whole frame data[,] 
every time, making the biz quite slow and wasteful. As I'm idly waiting, I 
though, had I put every line of data[,] as the item of a list, then done my 
pairwise comparisons using the list, would I have had a better performance?

(do I win the prize for the most convoluted sentence sent to the R-help?)

For the pedants, yes, I know I could kill the process and try myself, but the 
spirit of the question is, is there a way of dealing with big data 
*efficiently*?

Best,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] pairs matchning

2006-10-26 Thread Federico Calboli

Hi All,

I have two numerical matrices of 2 columns and many rows.

The two coulumns of matrix (1) form a number of 'pairs' of numbers, e.g:

  [,1] [,2]
[1,]10
[2,]34
[3,]34
[4,]58
[5,]10
[6,]10
[7,]67

Matrix (2) contains the *unique* pairs:

  [,1] [,2]
[1,]10
[2,]34
[3,]58
[4,]67


I would like to create a vector matching the pairs in matrix (1) to the unique 
pairs in matrix (2), e.g:

[1] 1 2 2 3 1 1 4

(done by hand)

match() does not seem to be able to handle pairs, and I don't seem to be able 
to 
find an elegant solution...

Cheers,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] putting stuff into bins...

2006-09-26 Thread Federico Calboli

Hi All,

I have a vector of data, a vector of bin breakpoints and I want to put my data 
in the bins and then extract fanciful informations like the mean value of each 
bin.

I know I can write my own function, but I would have thought that R should have 
somewhere a function that took as arguments something like (data, breaks, what 
to do with the data in the bins). I surey could not find it trawling the R-help 
archives though.

If such a function exists I'd be grateful to anyone pointing it out to me.

Cheers,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] from character to numeric over multiple columns

2006-07-21 Thread Federico Calboli

Prof Brian Ripley wrote:
> Are the columns factors or character?  I'll try to write code that copes 
> with both:
> 
> nm <- unique(c(as.character(col1), as.character(col2), as.character(col3)))
> 
> DF[] <- lapply(DF, function(x) match(x, nm))

Cheers,

it worked.

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] from character to numeric over multiple columns

2006-07-21 Thread Federico Calboli

Hi All,

I have a data frame of three columns, all of which names. The names in the 
three 
  cols overlap up to a point, so I might have *Harry* in all three columns, 
*Tom* in cols 1 and 3 and *Bob* in col 3 only.

harry   paulbob
anita   harry   tom
frank   jackharry
tom peteben


I want to turn the names into numbers, BUT I want the numeric code for, say, 
*Harry*, to be the same on all columns.

Doing

cbind(as.numeric(col1), as.numeric(col2), as.numeric(col3))

does not work because the factors are different in each column, hence they get 
a 
different number even though the name is the same.

Ideas?

Cheers

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] rearranging data frame rows

2006-06-23 Thread Federico Calboli

Hi All,

I have two data frames. The first contains data about a number of individuals, 
coded in the first column with a name, in an order I find convenient.

The second contains different data about the same indivduals, in a different 
order. Both data frame have the individual names in the first column.

I need to reorder the second data frame so the rows are rearranged in the same 
manner as the fist. How?

I cannot turn the individual names in a numeric vairable with 
as.numeric(data1[,1]), because the two data frames are subset of different 
data, 
so the the factor levels are way off between the two. I think I need to 
actually 
use the names as a index.

Cheers,

Fede

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] MLE maximum number of parameters

2006-06-19 Thread Federico Calboli

Hi All,

I would like to know, is there a *ballpark* figure for how many  
parameters the minimisation routines can cope with?

I'm asking because I was asked if I knew.

Cheers,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] diagonal matrix: rows rearrangement

2006-06-16 Thread Federico Calboli

Hi All,

I have a non-square matrix of 0s/1s. The rows are individuals, and the colums 
factors. The code is 1 if a factor is present, 0 if not.

I would like to rearrange the order of the rows to find if there are any 
diagonal blocks. Is there a fucntion that would do that already in R?

Cheers,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] smoothing plot(x, type ='l')

2006-06-07 Thread Federico Calboli

Prof Brian Ripley wrote:
  > It I understand you aright, that is done by par(lend) but the default is
> "round".  So maybe your graphics device (unstated) on your OS (unstated) 
> does not support this.

graphics device = X11 (xserver-xorg)
OS = Debian GNU/Linux, Kernel 2.4.27-2-686-smp

> 
> We need more details, and preferably a simple reproducible example.

s.off.dist
   [1] 27 26 15  7 32 50 31 19  1 11  8  4  4  5 11 28  4 32  5 39 15 32  3  3  
4
  [26]  1  2  2 22  1 23  4  2  8 40 14 42  3  1  4  3  4  4  6  2 29  4  8  5  
9
  [51] 37  2  1 13 13 35  6  9  2  5 31 10  7  1  4 22  3 23  4 10  8 57  1  6  
1
  [76]  1  4 10 16  3 12  3 10  8 10 11 16 19  5  5  9  9  3  4  1  8  3 12  3 
65
[101]  1  7 21  7  2  4 35 15  6  2  6  4  1 14  4 10 24  3  4  3  2  4 11  4  7
[126] 13  7  1  6  2  4 16  3 13 66 10  4  7  2 17  6  4  3  5  6  8  3  3 10 19
[151]  7  3 17  6  6  6  2  9  5  4  4 18  2  3 17 43 22 12  1  3  1  9  3  5  1
[176]  2  5 36 12 23  1  8 10  6  7 19  5 13  2  5  3  9  3  1 12  4  5  3  6  3
[201]  5  1  1 16  6 12  1  5  4  2  2  4  9  9  3 11  7  4  8 14  5 17  3  3 15
[226]  2  4  2 11 13  1 19  7  4  3 20  2  8  5  2  3  4  2  5  5 10  1  9 10  8
[251]  4  4  2  1  3  5  3  1  4  5 13 12  6  5  4  3 10  5  4  1

plot(hist(s.off.dist, breaks = 'fd')$counts ~ hist(s.off.dist, breaks = 
'fd')$m, 
type = 'l')

I want the edges to look round, if at all possible.

Cheers,

Federico



-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] smoothing plot(x, type ='l')

2006-06-07 Thread Federico Calboli

Dimitris Rizopoulos wrote:
> probably you want to use the `lend' argument of ?par(); I hope it helps.

Does not seem to work in my case.

F

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] smoothing plot(x, type ='l')

2006-06-07 Thread Federico Calboli

Hi All,

I am using plot(x, type = 'l') for some plotting, but I would like rounded 
edges 
rather than jagged edges in the plot (purely for aestetic reasons).

How could I achieve that?

Cheers,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] writing 100 files

2006-05-22 Thread Federico Calboli

Hi All,

I need to write as text files 1000 ish variation of the same data frame, 
once I permute a row.

I would like to use the function write.table() to write the files, and 
use a loop to do it:

for (i in 1:1000){

bb8[2,] = sample(bb8[2,])
write.table(bb8, quote = F, sep = '\t', row.names = F, col.names = F, 
file = 'whatever?.txt')
}
so all the files are called whatever1: whatever1000

Any idea?

Cheers,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] data manipulation docs

2006-05-04 Thread Federico Calboli

Hi All,

Is there some document/manual about data manipulation within R that I 
could use as a reference (obviously, aside the R manuals)?

The reason I am asking is that I have a number of data frames/matrices 
containg genetic data. The data is in a character form, as in:

   V1 V2 V3 V4 V5
1 AA AG AA GG AG
2 AC AA AA GG AG
3 AA AG AA GG AG
4 AA AA AA GG AG
5 AA AA AA GG AA

I need, to chop, subset, and variously manipulate this kind of data, 
sometimes keeping the data in its character format, sometimes converting 
it to numeric form (i.e. substitute each data point with the equivalent 
factor value). Since the data is ofthe quite big, I have to keep things 
memory efficient.

This whole game is getting excedingly time consuming and frustrating, 
because I end up with random pieces of code that I save, patching a 
particular problem, but difficult to be 'abstracted' for a new task, so 
I get back close to square one annoyingly often.

Cheers,

Federico Calboli


-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] forcing apply() to return data frame

2006-04-21 Thread Federico Calboli

Hi All,

I am (almost) successfully using apply() to apply a function recursively 
on a data matrix. The function is question is as.genotype() from the 
library 'genetics'

apply(subset(chr1, names$breed == 'lab'),2,as.genotype,sep ="")

Unfortuantely apply puts it's results into a matrix object rather than a 
data frame, tranforming my factors into numerics and making the results 
useless.

Is there a way of forcing apply() to return a data frame rather than a 
matrix?

Cheers,

Federico


-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] speeding up a recursive function

2006-04-02 Thread Federico Calboli

Hi All,

is there any general advice about speeding up recursive functions  
(not mentioning 'don't use them')?

Regards,

Federico Calboli

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] as.matrix and one row

2006-03-28 Thread Federico Calboli

Hi All,

I have the following problem:

x = c(1,2)
x
[1] 1 2

as.matrix(x)
 [,1]
[1,]1
[2,]2

BUT, if I add:

y = c(3,4)

as.matrix(rbind(x,y))
  [,1] [,2]
x12
y34

It does not transpose. Since I will need as.matrix() for a list of data
that is in one or more lines, I need as.matrix to behave in a consisten
fashions, so I get 

as.matrix(x, whatever)
  [,1] [,2]
x12

and 

as.matrix(rbind(x,y), whatever)
  [,1] [,2]
x12
y34

I tried byrow =T, does not make a thing.

Regards,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] listing nodes in paths

2006-03-20 Thread Federico Calboli

Hi All,

I found a solution for my question:

> I have the following adjacency matrix for a directed graph:
>
>  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
> [1,]00000000
> [2,]00000000
> [3,]10000000
> [4,]00100000
> [5,]00100000
> [6,]11000000
> [7,]00011000
> [8,]00000110
>
> My interest is the numberof path between (8) and (1). Using a  
> standard matrix moltiplication I can see I have one pathe of length  
> 2 and two paths of length 4.
>
> Is there already a function in R (whatever the library) that will  
> list the nodes touched in all those three paths (i.e. 8 -> 6 -> 1;  
> 8 -> 7 -> 4 -> 3 -> 1; 8 -> 7 -> 5 -> 3 -> 1)?

The function is an elaboration of 'findPath' in library 'ggm'. The  
original code was written in Python by Guido van Rossum and  
translated into R by Giovanni Marchetti... I translated the Python  
code for the list_all_paths in the same page:

http://www.python.org/doc/essays/graphs.html

The function is:

"paths" <- function (amat, st, en, path = c()){
   indices = 1:nrow(amat)
   if(st == en) # st is 'node' in recursive calls
 return(c(path, st))
   if(sum(amat[st,]) == 0 )
 return(NULL)
   paths = c()
   ne = indices[amat[st,]==1] # Boundary of x. Assumes that amat is  
symmetric
   for(node in ne){
 if(!is.element(node, c(path, st))){
   newpaths = paths(amat, node, en, c(path, st))
   for(newpath in newpaths){
 paths = c(paths,newpath)
   }
 }
   }
   return(paths)
}

and if I do:

 >paths(ad, 8, 1)
[1] 8 6 1 8 7 4 3 1 8 7 5 3 1

Which I can then slice to my heart content.

Best,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] listing nodes in paths

2006-03-18 Thread Federico Calboli

Hi All,

I have the following adjacency matrix for a directed graph:

  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]00000000
[2,]00000000
[3,]10000000
[4,]00100000
[5,]00100000
[6,]11000000
[7,]00011000
[8,]00000110

My interest is the numberof path between (8) and (1). Using a  
standard matrix moltiplication I can see I have one pathe of length 2  
and two paths of length 4.

(paths of length 2)

  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]00000000
[2,]00000000
[3,]00000000
[4,]10000000
[5,]10000000
[6,]00000000
[7,]00200000
[8,]11011000

(paths of length 4)

  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]00000000
[2,]00000000
[3,]00000000
[4,]00000000
[5,]00000000
[6,]00000000
[7,]00000000
[8,]20000000

All in all I have 3 paths (not all of the same length) between (8)  
and (1).

Is there already a function in R (whatever the library) that will  
list the nodes touched in all those three paths (i.e. 8 -> 6 -> 1; 8 - 
 > 7 -> 4 -> 3 -> 1; 8 -> 7 -> 5 -> 3 -> 1)?

Regards,

Federico Calboli

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] finding warning point in function

2006-03-12 Thread Federico Calboli

Hi everyone,

I would like to find out when and where exactly I get the following
warning in a piece of code I've written:

Error in "[<-"(`*tmp*`, iseq, value = numeric(0)) :
nothing to replace with

The code is a for () loop performing a somewhat trivial calculation,
modulated by a number of logical if(){} else(){} conditions, involving
the creation of a number of vectors that contain the elements that will
be manipulated.

The warning itself makes me suspicious that somewhere along the way a
vector ends up having length 0 for some of the elements of my dataset...
is there a way of finding out where the warning is generated that is
less verbose than printing out every single step?

Cheers,

Federico 


-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] 'less' for R?

2006-03-08 Thread Federico Calboli

Hi All,

is there an equivalent of the Unix command 'less' (or 'more'), so I can
look at what's inside a data.frame or a matrix without having it printed
out on console?

I am using R on Debian Linux and Mac OS 10.4.5

Cheers,

F

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] subsetting a list of matrices

2006-02-28 Thread Federico Calboli

Hi All,

I have a list of matrices:

> x
 [,1] [,2]
[1,]14
[2,]25
[3,]36
> y
 [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   18   21   24   27   30   33
[2,]   19   22   25   28   31   34
[3,]   20   23   26   29   32   35
> z =list(x,y)

I want to create a second list that is has a subset each matrix in the
list subsetting so I get the 2nd and 3rd row of each (and all columns).

How could I do that (apart from looping)?

Regards,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] logical condition in vector operation

2006-02-09 Thread Federico Calboli

On Thu, 2006-02-09 at 09:22 +0100, Christoph Buser wrote:
> Dear Frederico
> 
> From your example it is not clear to me what you like to obtain: 
> Please have a look on the slightly changed example here (I
> changed two values to show a potentially undesired side effect of
> your coding.
> 
> 
> test <- data.frame(rbind(c(3,3,10,21,0), c(2,3,11,12,0), c(3,4,12,23,0),
>  c(3,5,13,24,0)))
> names(test) <- c("x","y","p","d","su")
> test
> 
> >>   x y  p  d su
> >> 1 3 3 10 21  0
> >> 2 2 3 11 12  0
> >> 3 3 4 12 23  0
> >> 4 3 5 13 24  0
> 
> j <- 3
> test[test[,1] == j, 5] <- test[test[,1] == j,2] + test[test[,2] == j,1]
> 
> >> > > Warning message:
> >> longer object length
> >>is not a multiple of shorter object length in: 
> >> test[test[, 1] == j, 2] + test[test[, 2] == j, 1] 
> 
> Your code example produces now a warning for the adapted
> data frame "test", since one tries to add two vectors of length 2
> and 3, respectively. The result is based on recycling of the
> smaller vector. In your example there was no warning since the
> second column had only one entry.
> The result with the adapted data frame is:
> 
> test
> >>   x y  p  d su
> >> 1 3 3 10 21  6
> >> 2 2 3 11 12  0
> >> 3 3 4 12 23  6
> >> 4 3 5 13 24  8
> 
> Is that kind of recycling desired in your application. Otherwise
> you should be careful with the coding example above.

Recycling was and is integral part of my plan.

Cheers,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] logical condition in vector operation

2006-02-08 Thread Federico Calboli

HI All,

I have a data frame such as:

> test
 x y  p  d
[1,] 1 0 10 21 0
[2,] 2 3 11 12 0
[3,] 3 4 12 23 0
[4,] 3 5 13 24 0


and I want to perfor some operations on the first two coulums,
conditional on the uneqaulity values on the 3rd and 4th columns.

For instance:

j = 3
test[test[,1] == j, 5] = test[test[,1] == j,2] + test[test[,2] == j,1]

gives me the result:

test:

 x y  p  d
[1,] 1 0 10 21 0
[2,] 2 3 11 12 0
[3,] 3 4 12 23 6
[4,] 3 5 13 24 7


My probblem is the following: I want to perform the operation
test[test[,1] == j,2] + test[test[,2] == j,1] only if the value of
column p and column d are different at the positions where x or y = j.
In practice, I don't want to perform the first operation because
test[2,4 is 12 and test[1,3] is 12 as well.

I tried an if statement with little success:

if(test[test[,1] == j,3] != test[test[,2] == j,4]){
test[test[,1] == j, 5] = test[test[,1] == j,2] + test[test[,2] == j,1]
}
Warning message:
the condition has length > 1 and only the first element will be used in:
if (test[test[, 1] == j, 3] != test[test[, 2] == j, 4]) {

Could anyone lend some advice?

Cheers,

Federico
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] natural selection coefficient S

2005-11-16 Thread Federico Calboli

Hi everyone,

before I embark in some coding, can I ask if there is any function in
any of the packages on CRAN dealing with the natural selection
coefficient S?

At the moment I am more curious to see it has been implemented already
in R, rather than what specific implementation.

Regards,

Federico Calboli
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] replacing a factor value in a data frame

2005-10-28 Thread Federico Calboli

Hi All,

I have the following problem, that's driving me mad.

I have a dataframe of factors, from a genetic scan of SNPs. I DO have
NAs in the dataframe, which would look like:

   V4 V5 V6 V7   V8   V9 V10
1  TT GG TT AC   AG   AG  TT
2  AT CC TT AA   AA   AA  TT
3  AT CC TT AC   AA   TT
4  TT CC TT AA   AA   AA  TT
5  AT CG TT CC   AA   AA  TT
6  TT CC TT AA   AA   AA  TT
7  AT CC TT CCTT
8  TT CC TT AC   AG   AG  TT
9  AT CC TT CC   AG   TT
10 TT CC TT CC   GG   GG  TT


In the dataframe I have 1 column where one factor has been erroneosly
given alternative readings: CG and GC. 

I want to change the instances of GC to CG and I use the code:

data[data[,30]=="GC", 30] = "CG"

but get the error:
Error in "[<-.data.frame"(`*tmp*`, all[, 30] == "GC", 30
missing values are not allowed in subscripted as

Any hints?

Cheers,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] partial rank correlation coefficient

2005-10-21 Thread Federico Calboli

Hi All,

a colleague asked me if R has a function producing a Partial Rank
Correlation Coefficient, sensu Blower and Dowlatabadi 1994 [1].

I personally would not have a clue, and I could not find something like
that on the search page... although I would not be surprised if it's
there under a different name.

Incidentally, unfortunately the function, assuming it exists, is to be
fed to some MSc students. They are supposed to learn how to use R, but
are not supposed to know/understand what maths and stats are under the
bonnet of the test (I don't run the course, no comment). So the 'plan'
is to give them a lecture about models in epidemiology and sensitivity,
and then the punch line should be: 'and the PRCC can be asily done in R
using the function ???'.

Cheers,

Federico 

[1] S.M. Blower and H. Dowlatabadi, 1994. SENSITIVITY AND UNCERTAINTY
ANALYSIS OF COMPLEX-MODELS OF DISEASE TRANSMISSION - AN HIV MODEL, AS AN
EXAMPLE. International Statistical Review 62(2) 229-243. 


-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] indexing a vector starting from 0

2005-07-24 Thread Federico Calboli

Hi All,

I would like to ask if it possible to start indexing a vector from 0:

x = 1:10

y = c(0,0,3,4,5,6,0,8,9,10)

I need to use y as an index to extract the values of x, BUT I cannot  
cull/transform the 0s. What I would like is to start counting the  
elements of x 0:9 rather than 1:10. Would this be at all possible?

Regards,

Federico Calboli
--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] vectorising ifelse()

2005-07-22 Thread Federico Calboli

On 22 Jul 2005, at 11:20, Adaikalavan Ramasamy wrote:

> Does either 'zippo' or 'zappo' contain the values 1 or 2 ?
>
>
> If so, then you cannot vectorize this code because you are changing  
> the
> values in 'new' at every iteration and potentially sampling a value  
> from
> new[ ,1] or new[ ,2] .
>

That's exactly my situation, and is exactly what I want to do.

After taking out the typo (and bug) "drow[i]>0" the code seems to  
work fast enough... I'll tinker a bit with it, but it could be good  
enough as it is.

Cheers,

Federico Calboli

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] vectorising ifelse()

2005-07-21 Thread Federico Calboli

Hi All,

is there any chance of vectorising the two ifelse() statements in the
following code:

for(i in gp){
   new[i,1] = ifelse(srow[i]>0, new[srow[i],zippo[i]], sample(1:100, 1,
prob =Y1, rep = T))
   new[i,2] = ifelse(drow[i]>0, new[drow[i]>0,zappo[i]], sample(1:100,
1, prob =Y1, rep = T))
 }

Where I am forced to check if the value of drow and srow are >0 for each
line... in practical terms, I am attributing haplotypes to a pedigree,
so I have to give the haplotypes to the parents before I give them to
the offspring. The vectors *zippo* and *zappo* are the chances of
getting one or the other hap from the sire and dam respectively. *gp* is
the vectors of non-ancestral animals. *new* is a two col matrix where
the haps are stored.

Cheers,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] loop over large dataset

2005-07-04 Thread Federico Calboli

On 4 Jul 2005, at 15:15, Peter Dalgaard wrote:
>
> Your original code got lost in the threading, but that order of
> magnitude suggests that you have N^2/2 behaviour somewhere. The  
> typical
> culprit is code like
>
> x <- numeric(0)
> for (i in 1:N){
>   newx <- <<>>
>   x <- c(x, newx)
> }
>
> in which the extension of x causes the whole thing to be reallocated
> and copied. Same thing with cbind and rbind constructs of course.

I changed my code a bit, and now the runtime is dow to less than a  
minute (from more than 24 hours). I was copying a large dataset many  
times over, when I extracted the columns I need as independet vectors  
runtime dropped like a stone.

Cheers,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] loop over large dataset

2005-07-04 Thread Federico Calboli

On 4 Jul 2005, at 12:41, Uwe Ligges wrote:

> Federico Calboli wrote:
>
>
>> In my absentmindedness I'd forgotten to CC this to the list...  
>> and  BTW, using gc() in the loop increases the runtime...
>>
>
> If the data size increases, you cannot expect linear run time  
> behaviour, e.g. because gc() is called more frequently. And of  
> course, gc() needs some time, hence you get the expected increase  
> in runtime. This answers you other question as well.

Is then internal gc() calls that increase the runtime from 5 minutes  
to more then 24 hours for a 27x increase in data (given that the code  
is exactely the same)?

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] loop over large dataset

2005-07-04 Thread Federico Calboli

In my absentmindedness I'd forgotten to CC this to the list... and  
BTW, using gc() in the loop increases the runtime...


>> My suggestion is that you try to vectorize the computation as much  
>> as you
>> can.
>>
>> From what you've shown, `new' and `ped' need to have the same  
>> number of
>> rows, right?
>>
>> Your `off' function seems to be randomly choosing between columns  
>> 1 and 2
>> from its two input matrices (one row each?).  You may want to do the
>> sampling all at once instead of looping over the rows.  E.g.,
>>
>>
>>
>>> (m <- matrix(1:10, ncol=2))
>>>
>>>
>>  [,1] [,2]
>> [1,]16
>> [2,]27
>> [3,]38
>> [4,]49
>> [5,]5   10
>>
>>
>>> (colSample <- sample(1:2, nrow(m), replace=TRUE))
>>>
>>>
>> [1] 1 1 2 1 1
>>
>>
>>> (x <- m[cbind(1:nrow(m), colSample)])
>>>
>>>
>> [1] 1 2 8 4 5
>>
>> So you might want to do something like (obviously untested):
>>
>> todo <- ped[,3] * ped[,5] != 0  ## indicator of which rows to work on
>> n.todo <- sum(todo)  ## how many are there?
>> sire <- new[ped[todo, 3], ]
>> dam <- new[ped[todo, 5], ]
>> s.gam <- sire[1:nrow(sire), sample(1:2, nrow(sire), replace=TRUE)]
>> d.gam <- dam[1:nrow(dam), sample(1:2, nrow(dam), replace=TRUE)]
>> new[todo, 1:2] <- cbind(s.gam, d.gam)
>>
>>
>
> Improving the efficiency of the code is abviously a plus, but the  
> real thing I am mesmerised by is the sheer increase in runtime...  
> how come not a linear increase with dataset size?
>
> Cheers,
>
> Federico
>
> --
> Federico C. F. Calboli
> Department of Epidemiology and Public Health
> Imperial College, St. Mary's Campus
> Norfolk Place, London W2 1PG
>
> Tel +44 (0)20 75941602   Fax +44 (0)20 75943193
>
> f.calboli [.a.t] imperial.ac.uk
> f.calboli [.a.t] gmail.com
>
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] loop over large dataset

2005-07-01 Thread Federico Calboli

Hi All,

I'd like to ask for a few clarifications. I am doing some calculations
over some biggish datasets. One has ~ 23000 rows, and 6 columns, the
other has ~62 rows and 6 columns.

I am using these datasets to perform a simulation of of haplotype
coalescence over a pedigree (the datestes themselves are pedigree
information). I created a new dataset (same number of rows as the
pedigree dataset, 2 colums) and I use a looping functions to assign
haplotypes according to a standrd biological reprodictive process (i.e.
meiosis, sexual reproduction).

My code is someting like:

  off = function(sire, dam){ # simulation of reproduction, two inds
  sch.toll = round(runif(1, min = 1, max = 2))
  dch.toll = round(runif(1, min = 1, max = 2))
  s.gam = sire[,sch.toll]
  d.gam = dam[,dch.toll]
  offspring = cbind(s.gam,d.gam)
# offspring
}

for (i in 1:dim(new)[1]){
if(ped[i,3] != 0 & ped[i,5] != 0){
zz = off(as.matrix(t(new[ped[i,3],])),as.matrix(t(new[ped[i,5],])))
new[i,1] = zz[1]
new[i,2] = zz[2]
}
}

I am also attribution a generation index to each row with a trivial
calulation:

for(i in atres){
  genz[i] = (genz[ped[i,3]] + genz[ped[i,5]])/2 + 1
  #print(genz[i])
}

My question then. On the 23000 rows dataset the calculations take about
5 minutes. On the 62 rows one I kill the process after ~24 hours,
and the the job is not finished. Why such immense discrepancy in
execution times (the code is the same, the datasets are stored in two
separate .RData files)?

Any light would be appreciated.

Federico

PS I am running R 2.1.0 on Debian Sarge, on a Dual 3 GHz Xeon machine
with 2 gig RAM. The R process uses 99% of the CPU, but hardly any RAM
for what I gather from top.



-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] R demos

2005-06-24 Thread Federico Calboli

Hi All,

I am currently preparing some form of slideshow introducing R and its
capabilities for some colleagues. The thing will be about 30 mins, and
I'd like to have some "pretty pictures" and some "amazing facts" (I'm
trying to sell, obviously :)).

Can I ask if it's possible to easily retrieve a gross figure of the
number of functions in R considering the "base" install and all the
libraries available?

Apart from graphics and lattice, are there any more packages producing
eye catching graphics (possibly with a survival analysis/epidemiological
bend)?

Cheers,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] vectorisation suggestion

2005-06-20 Thread Federico Calboli


On 20 Jun 2005, at 21:24, Erin Hodgess wrote:

> Hello, Federico!
>
> I'm a bit confused about your question, please:
>
> What sorts of things are in Vector1, please?

numbers (as in "numeric") that code individuals

>
> Why are you counting NAs in Vector3, please?

I am counting how many times the code for each individual (as listed  
in vector1) is present in vector2, and save the number of counts in  
vector3

Cheers,

Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] vectorisation suggestion

2005-06-20 Thread Federico Calboli

Hi All,

I am counting the number of occurrences of the terms listed in one  
vector in another vector.

My code runs:

for( i in 1:length(vector3)){
   vector3[i]  = sum(1*is.element(vector2,  vector1[i]))
}

where

vector1 = vector containing the terms whose occurrences I want to count
vector2 = made up of a number of repetitions of all the elements of  
vector1
vector3 = a vector of NAs that is meant to get the result of the  
counting

My problem is that vector1 is about 6 terms, and vector2 is  
62... can anyone suggest a faster code than the one I wrote?

Cheers,

Federico Calboli


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] R unable to run on Mac OS 10.4 Tiger

2005-05-24 Thread Federico Calboli

On Tue, 2005-05-24 at 18:05 +0200, Guillaume Chapron wrote:
> Hello,
> I'm running a PB G4 with Mac OS 10.4.1. I have downloaded the latest 
> version R-2.1.0a.dmg. It appears that R does not work. It launches 
> itself, but the window never gets ready, there is written "Loading R..." 
> and a small progress wheel keeps turning indefinitely.

For the record, in exactly the same condition R works for me. Sorry I
cannot be of any help.

F.

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] parsing speed

2005-05-16 Thread Federico Calboli

Hi everyone,

I have a question on parsing speed.

I have two functions:

F1
F2

As things are now, F2 calls F1 internally:

F2 =  function(x){
if (something == 1){
y = F1(x)
}
if (something ==2){
do whatever
}
}

*Assuming there could be some difference*, is is faster to use the code
as written above or should I actually write the statements of F1 to make
the parsing faster? 

Regards,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] [R-pkgs] Biodem 0.1/orphaning of MAlmig

2005-04-24 Thread Federico Calboli

On Sun, 2005-04-24 at 12:55 -0500, Dirk Eddelbuettel wrote:
> Actually, most of the time the dependency structure between Debian unstable
> and testing is such that the packages from unstable can be installed
> "straight through" into testing. Look at the apt-get HOWTO for the details on
> pinning which allows you to selectively pull in some package via the comfort
> of apt-get.

I am aware I could pin R, but I am quite conservative on that respect
and I have always preferred waiting a bit more than mixing my
distributions. I doubt it will be more than a few days (and besides, I
would be surprised of any difference in the tarball built by R 2.0.1 and
2.1.0).

Regards,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] [R-pkgs] Biodem 0.1/orphaning of MAlmig

2005-04-24 Thread Federico Calboli

Together with Alessio Boattini of the University of Bologna we have
created a package called Biodem. Biodem provides a number of functions
for Biodemographycal analysis, and we hope it will be useful to the
anthropological community.

Because Biodem contains all the functions found in Malmig (a package I
maintain), I would like to orphan it, or, even better, have it removed
from CRAN.

Finally, Biodem has been build on a Debian Linux box and is therefore
not yet available for Windows/OS X; Biodem was built with R 2.0.1
because I am using Debian 'testing' and the new R 2.1.0 is not yet
available for 'testing'. I will upload Biodem 0.2 as soon as I can build
it with R 2.1.

Regards,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Rd.sty problems.

2005-04-18 Thread Federico Calboli

Hi All,

I am trying to build a new R package to submit, but it's failing to
create a tex manual:

R CMD check Biodem
* checking for working latex ... OK
* using log directory
'/home/greatsage/Fede/R-packages/temp/Biodem.Rcheck'
* checking for file 'Biodem/DESCRIPTION' ... OK
* checking if this is a source package ... OK

* Installing *source* package 'Biodem' ...
** R
** data
** help
 >>> Building/Updating help pages for package 'Biodem'
 Formats: text html latex example
  Fst   texthtmllatex   example
  Laskertexthtmllatex   example
  Mal.eqtexthtmllatex   example
  N texthtmllatex   example
  P texthtmllatex   example
  Phi   texthtmllatex   example
  S texthtmllatex   example
  col.sto   texthtmllatex   example
  hedrick   texthtmllatex   example
  mal.cond  texthtmllatex   example
  mar.iso   texthtmllatex   example
  mtx.exp   texthtmllatex   example
  r.pairs   texthtmllatex   example
  raw.mig   texthtmllatex   example
  regional.rnd.iso  texthtmllatex   example
  rel.cond  texthtmllatex   example
  rel.phi   texthtmllatex   example
  rri   texthtmllatex   example
  sur.freq  texthtmllatex   example
  sur.inbr  texthtmllatex   example
  surnames  texthtmllatex   example
  sym.P texthtmllatex   example
  uri   texthtmllatex   example
  valleytexthtmllatex   example
* DONE (Biodem)

* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking package dependencies ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for library.dynam ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking Rd files ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* creating Biodem-Ex.R ... OK
* checking examples ... OK
* creating Biodem-manual.tex ... OK
* checking Biodem-manual.tex ... ERROR
LaTeX errors when creating DVI version.
This typically indicates Rd problems.


I cheked the log file in the .Rcheck dir, but it's incomprehensible to
me, so I simply copied and run the tex file itself. The manual won't
compile lamenting it cannot find the Rd.sty file.

locate Rd.sty gives:
/usr/lib/R/share/texmf/Rd.sty

I am completely lost... any advice? do I have to change some path? I am
using R for Debian from  the latest .deb files for Sarge on x86 arch.

Regards,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] How to create a vector with "one", "two", "three", ...?

2005-04-15 Thread Federico Calboli

On Fri, 2005-04-15 at 14:30 -0400, Frank Duan wrote:
> Hi R people,
> 
> I met a naive prolem. Could anyone give me a hint how to create such a
> vector with entries: "one", "two", "three", ...?

rvect <- c("one", "two", "three")
rvect
[1] "one"   "two"   "three"


Is it what you want?

F

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] logistic regression weights problem

2005-04-13 Thread Federico Calboli

On Wed, 2005-04-13 at 17:42 +0100, Prof Brian Ripley wrote:
> Use the cbind(yes, no) form of specification.  Note though that the 
> `weights' in a GLM are case weights and not arbitrary downweighting 
> factors and aspects of the output (e.g. AIC, anova) depend on this.  A 
> different implementation of (differently) weighted GLM is svyglm() in 
> package 'survey'.

I tried to use cbind() on a slightly modified dummy set to get get rid
of all warnings and that's what I got:


status <- c(1,1,1,0,0)
SNPs <- matrix( c(1,0,1,0,0,1,0,1,0,1,0,1,0,1,1), ncol =3)
weight <- c(0.2, 0.1, 1, 0.8, 0.7)

 using cbind()

glm(cbind(status, 1-status) ~ SNPs, weights = weight, family = binomial)

Call:  glm(formula = cbind(status, 1 - status) ~ SNPs, family =
binomial,  weights = weight)

Coefficients:
(Intercept)SNPs1SNPs2SNPs3
 -2.079   43.132  -19.487   NA

Degrees of Freedom: 4 Total (i.e. Null);  2 Residual
Null Deviance:  3.867
Residual Deviance: 0.6279   AIC: 6.236

### NOT using cbind()

glm(status~ SNPs, weights = weight, family = binomial)

Call:  glm(formula = status ~ SNPs, family = binomial, weights = weight)

Coefficients:
(Intercept)SNPs1SNPs2SNPs3
 -2.079   42.944  -19.366   NA

Degrees of Freedom: 4 Total (i.e. Null);  2 Residual
Null Deviance:  3.867
Residual Deviance: 0.6279   AIC: 6.236
Warning message:
non-integer #successes in a binomial glm! in: eval(expr, envir, enclos)
##

The anova() call of the cbind() model seems happy:

###
mod <- glm(cbind(status, 1-status) ~ SNPs, weights = weight, family =
binomial)

anova(mod, test = "Chi")
Analysis of Deviance Table

Model: binomial, link: logit

Response: cbind(status, 1 - status)

Terms added sequentially (first to last)


 Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL 4 3.8673
SNPs  2   3.2394 2 0.62790.1980
#

The real data modeldoes not show warnings when I use cbind() but it
still shows warning when I call anova() on the model:

###
anova(glm(cbind(sta, 1-sta) ~ (X1 + X2 + X3 + X4 + X5) * breed, family=
binomial, weights =igf$we, igf),test="Chi")
Analysis of Deviance Table

Model: binomial, link: logit

Response: cbind(sta, 1 - sta)

Terms added sequentially (first to last)


  Df Deviance Resid. Df Resid. Dev P(>|Chi|)
NULL330 372.89
X1 1 0.14   329 372.75  0.71
X2 1 0.26   328 372.49  0.61
X3 1 0.001121   327 372.49  0.97
X4 1 2.63   326 369.86  0.10
X5 1 1.87   325 367.99  0.17
breed  3 5.41   322 362.58  0.14
X1:breed   3 2.98   319 359.60  0.39
X2:breed   3 1.21   316 358.39  0.75
X3:breed   2 1.32   314 357.08  0.52
X4:breed   3 1.75   311 355.33  0.63
X5:breed   2 2.38   309 352.95  0.30
Warning messages:
1: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
2: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
3: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
4: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
5: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
6: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
7: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
8: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
9: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
10: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
######

Why such inconsistency?

Regards,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] logistic regression weights problem

2005-04-13 Thread Federico Calboli

Hi All,

I have a problem with weighted logistic regression. I have a number of
SNPs  and a case/control scenario, but not all genotypes are as
"guaranteed" as others, so I am using weights to downsample the
importance of individuals whose genotype has been heavily "inferred".

My data is quite big, but with a dummy example:

> status <- c(1,1,1,0,0)
> SNPs <- matrix( c(1,0,1,0,0,0,0,1,0,1,0,1,0,1,1), ncol =3)
> weight <- c(0.2, 0.1, 1, 0.8, 0.7)
> glm(status ~ SNPs, weights = weight, family = binomial)

Call:  glm(formula = status ~ SNPs, family = binomial, weights = weight)

Coefficients:
(Intercept)SNPs1SNPs2SNPs3
 -2.079   42.282  -18.964   NA

Degrees of Freedom: 4 Total (i.e. Null);  2 Residual
Null Deviance:  3.867
Residual Deviance: 0.6279   AIC: 6.236
Warning messages:
1: non-integer #successes in a binomial glm! in: eval(expr, envir,
enclos)
2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y
= Y, weights = weights, start = start, etastart = etastart,

NB I do not get warning (2) for my data so I'll completely disregard it.

Warning (1) looks suspiciously like a multiplication of my C/C status by
the weights... what exacly is glm doing with the weight vector? 

In any case, how would I go about weighting my individuals in a logistic
regression?

Regards,

Federico Calboli


-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] R as programming language: references?

2005-04-12 Thread Federico Calboli

On Tue, 2005-04-12 at 17:45 +0800, Feng Chen wrote:
> Maybe you can try this:
> http://cran.r-project.org/doc/manuals/fullrefman.pdf

I was thinking of something that would not limit itself to defining all
possible functions and that I do not have already on my HD...

F
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] R as programming language: references?

2005-04-12 Thread Federico Calboli

Hi All,

I am looking for references on R as a programming language (apart form
the standard R-lang.pdf and the other manuals), reference that would
cover _in_depth_ things like loops, code optimisation, debugging tools
etc... and is as up-to-date as possible. 

Can anyone suggest any book or other reference apart from the "green
book" and the V&R "S-programming"?

Cheers,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] R: function code

2005-04-11 Thread Federico Calboli

On Mon, 2005-04-11 at 12:52 +0200, Clark Allan wrote:
> HI
> 
> sorry to be a nuisance to all!!!
> 
> how can i see the code of a particular function?
> 
> e.g. nnet just as an example

for some function just type the function name  and you get the code:

> ls
function (name, pos = -1, envir = as.environment(pos), all.names =
FALSE,
pattern)
{
if (!missing(name)) {
nameValue <- try(name)
if (identical(class(nameValue), "try-error")) {
name <- substitute(name)
if (!is.character(name))
name <- deparse(name)
pos <- name
}
else pos <- nameValue
}
all.names <- .Internal(ls(envir, all.names))
if (!missing(pattern)) {
if ((ll <- length(grep("[", pattern, fixed = TRUE))) >
0 && ll != length(grep("]", pattern, fixed = TRUE))) {
if (pattern == "[") {
pattern <- "\\["
warning(paste("replaced regular expression pattern",
  sQuote("["), "by", sQuote("[")))
}
else if (length(grep("[^]\\[<-", pattern) > 0)) {
pattern <- sub("\\[<-", "\\[<-", pattern)
warning(paste("replaced", sQuote("[<-"), "by",
  sQuote("[<-"), "in regular expression pattern"))
}
}
grep(pattern, all.names, value = TRUE)
}
else all.names
}



or check out the source code itself.


HTH

F
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] how to estimate Type I, Type III SS

2005-04-06 Thread Federico Calboli

On Wed, 2005-04-06 at 21:40 +0900, Kum-Hoe Hwang wrote:
> Howdy, R gurus
>  I 'd like to know hwo to calculate or estimate SS of Type I and Type III in 
> ANOVA or other anaysis in R.
>  Thanks,

If memory seves me well, try Anova in the car package

F
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

RE: [R] Help with three-way anova

2005-04-06 Thread Federico Calboli

On Wed, 2005-04-06 at 09:11 +0100, michael watson (IAH-C) wrote:
> OK, so I tried using lm() instead of aov() and they give similar
> results:
> 
> My.aov <-  aov(IL.4 ~ Infected + Vaccinated + Lesions, data)
> My.lm  <-   lm(IL.4 ~ Infected + Vaccinated + Lesions, data)

Incidentally, if you want interaction terms you need 

lm(IL.4 ~ Infected * Vaccinated * Lesions, data)

for all the possible interactions in the model (BUT you need enough
degrees of freedom from the start to be able to do this).
> 
> If I do summary(My.lm) and summary(My.aov), I get similar results, but
> not identical.
> If I do anova(My.aov) and anova(My.lm) I get identical results.  I guess
> that's to be expected though.
> 
> Regarding the results of summary(My.lm), basically Intercept, Infected
> and Vaccinated are all significant at p<=0.05.  I presume the
> signifcance of the Intercept is that it is significantly different to
> zero?  How do I interpret that?

I guess it's all due to the contrast matrix you used. Check with
contrasts() the term(s) in the datafile you use as independent
variables, and change the contrast matrix as you see fit.

HTH,

F
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Help with three-way anova

2005-04-05 Thread Federico Calboli

On Tue, 2005-04-05 at 15:51 +0100, michael watson (IAH-C) wrote:

> So, what I want to know is:
> 
> 1) Given my unbalanced experimental design, is it valid to use aov?

I'd say no. Use lm() instead, save your analysis in an object and then
possibly use drop1() to check the analysis

> 2) Have I used aov() correctly?  If so, how do I get access results for
> interactions?

The use of aov() per se seems fine, but you did not put any interaction
in the model... for that use factor * factor.

HTH,

F

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] cat bailing out in a for loop

2005-04-05 Thread Federico Calboli

Dear All,

I am trying to calculate the Hardy-Weinberg Equilibrium p-value for 42
SNPs. I am using the function HWE.exact from the package "genetics".

In order not to do a lot of coding "by hand", I have a for loop that
goes through each column (each column is one SNP) and gives me the
p.value for HWE.exact. Unfortunately some SNP have reached fixation and
HWE.exact requires a 2 alleles scenario.

So my problem is that my for loop:

##

for (i in 1:42){
xxx<-HWE.exact(genotype(laba.con[,i+3], sep=""))
cat(colnames(laba)[i+3],xxx$p.value,"\n")}

##

bails out as soon as it hits a SNP at fixation for one allele, because
HWE.exact fails. 

I have a lot of this game to play and checking things by hand is not
idea, and I do not care about failed SNP, all I want is for the loop to
carry on regardless of what's in xxx$p.value, even if HWE.exact failed
to calculte it. Dump anything in there! 

Is there any way of forcing the loop to carry on?

Cheers,

Federico Calboli

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] emacs + R?

2005-04-04 Thread Federico Calboli

On Mon, 2005-04-04 at 10:11 +0200, Giorgio Corani wrote:
> Dear All,
> 
> 
> As far I as I have understood reading both your past posting and the
> documentation, in order to have the  command-line completion facility, I
> have to run R within emacs.
> 
> However, as I try to start R within emacs as recommended:
> C-u M-x R
> emacs answers [no match]
> 
> the same if I provide the whole path to  the executable:
> C-u M-x /usr/bin/R[no match]
> 

You need to install the package (library?) ESS for R to work with emacs.
It's really easy if you are using Debian, gust apt-get ess.


F.


-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Error in colMeans ... what's wrong with my data?

2005-04-01 Thread Federico Calboli

On Fri, 2005-04-01 at 20:27 +0200, Hartmut Weinrebe wrote:
> Having searched and searched I still haven't found what's the problem with my 
> data (I've attached the relevant file).
> Every time I tried to use the CANCOR-Function I got error messages.
> So I turned to check my two sets of variables separately by using the 
> "colMeans"-Function. With one set no problem .. but with the other.
> 
> My input :
> 
> > struktur <- read.delim("struktur.csv", header=TRUE, sep = ";")
> >colMeans(struktur)
> 
> And the resulting error:
> 
> Error in colMeans(x, n, prod(dn), na.rm) : 
> `x' must be numeric
> 
Check what str(stuktur) gives you... looks like it seeing your numbers
as something else..

BTW, I do not to have the attached file, seems to have been dropped
before reaching my inbox, so that's all I can suggest.

F
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Multi-plot figures with different numbers of plots in different rows

2005-03-26 Thread Federico Calboli

On Sat, 2005-03-26 at 17:44 +, Hess, Stephane wrote:
> Dear all, 
> 
> I have 5 plots that I would like to include in a single figure, spread over 
> two rows. If I use mfrow=c(2,3), and produce my plots one after the other, I 
> will end up with three plots in the first row, and 2 in the second row, which 
> is what I want. However, I would like the two plots in the second row to be 
> moved to the centre of that row, rather than occupying the two left-most 
> cells.
> 
> I have also considered using split.screen, but this would mean that the plots 
> in the lower half would be wider than in the upper half, whereas I want them 
> all to be of the same size.
> 
> Thanks in advance for any suggestions on how this can be done.

There is a book called âGrÃficos EstadÃsticos con Râ under "contributed
documentation" in the main R website. It should prove useful. The book
is in spanish, but as it is a "graphical" manual, it should not matter
much. If in troubles email me privately.

Cheers,

Federico 

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

RE: [R] nested random effects

2005-03-23 Thread Federico Calboli

On Wed, 2005-03-23 at 10:04 -0800, Berton Gunter wrote:
> > 
> > An interaction random effect/fixed effect is noted as 
> > 
> > random ~1|random/fixed
> > 
> > in your case random =~1|ID/FAM (but I don't uderstand why indiviuals
> > withing families are fixed and and families are random, but there you
> > go).
> > 
> 
> 1. Fixed effects cannot be nested within random effects. 
> 
> 2. The "random" specification is backwards: nesting, |g1/g2/g3... , is outer
> to inner and so FAM/ID

The original question had 

Y~ID

so I assumed ID was/is fixed. I have my reservations on that, but who am
I to decide? it' not my data and anyway I have not seen it.

F
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] nested random effects

2005-03-23 Thread Federico Calboli

On Wed, 2005-03-23 at 11:58 -0500, Shaw, Philip (NIH/NIMH) wrote:
> Hi
>  
> I am struggling with nested random effects and hope someone can help.
>  
> 
> 
> I have individuals (ID) who are nested within families (FAM).  I want to
> model an outcome variable, and take account of the intercorrelation of
> individuals within each family. 
>  
> I think this amounts to two random effects, one nested within the other.
>  
> How can I model this in R?
>  
> So far I have tried using the library(nlme), and then
>  
> Y~ID, random=~1|ID*FAM, 
>  

An interaction random effect/fixed effect is noted as 

random ~1|random/fixed

in your case random =~1|ID/FAM (but I don't uderstand why indiviuals
withing families are fixed and and families are random, but there you
go).

Check out Pinheiro and Bates Ch1, especially pg 23 onwards.

Cheers,

F 
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

RE: [R] removing message: [Previously saved workspace restored]

2005-03-08 Thread Federico Calboli

On Tue, 2005-03-08 at 15:57 -0500, Wiener, Matthew wrote:
> Remove the (now empty, because you deleted all objects) file ".RData" from
> the directory.
> 
> Hope this helps,

Thanks, it did fix the problem.

Cheers,

Federico Calboli
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] removing message: [Previously saved workspace restored]

2005-03-08 Thread Federico Calboli

Dear All, 

I saved by mistake the environment I was working in after typing q(),
and now I get the annoying message:

[Previously saved workspace restored]

I have already deleted all the objects in the environment, saving it as
an empty environment, so it's just a matter of nitpicking I suppose. The
message does not appear if I start R from any other place in the
directory tree.

I am reading ?Startup and related docs, but I am utterly failing to
understand how to remove [Previously saved workspace restored] when I
call R from the offending dir...

I am using R on Debian Sarge x86

Cheers,

Federico Calboli
-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] The hidden costs of GPL software?

2004-11-17 Thread Federico Calboli

Philippe Grosjean wrote:
> I would be interested by your impressions and ideas on this topic.

I have found that "user friendly" packages make a lot of assumptions and
take a lot of decisions for the user. This makes things easy, but you do
not really know what is going on, and I'd say this is a hidden cost of
"commercial" software. 

I wrote to the list in February asking how to reproduce some results
previously obtained with Statistica. It turned out that Statistica does
some data manipulation without telling the user, with poor documentation
and no options or choice. Do you trust results obtained this way? I
don't. 

So I'd argue that the lack of a GUI is a good thing, because it forces
the users to think a bit more about what they want to do, and gives more
control on what is going on.

Best,

Federico Calboli
-- 
Federico C. F. Calboli

Dipartimento di Biologia Evoluzionistica Sperimentale
Università di Bologna
Via Selmi, 3
40126 Bologna - ITALY

Tel - +39 051 2094187
Fax - +39 051 2094286
f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] phylogenetic trees calculation

2004-08-01 Thread Federico Calboli

Dear all,

I would like to ask you the following:

I have data about different manuscripts, together with data about the
presence/absence of copying errors, in the days when manuscript were
really manuscripts. I would ideally like to use the data to draw a
phylogenetic tree, so that I can infer which ms was copied from which.
The error presence/absence is coded in binary format. The plan is to use
a maximun parsimony tree approach. The data looks like this toy example:

"ms1"   "ms2"   "ms3"   "ms4"
"err1"  0   1   0   0
"err2"  0   0   1   1
"err3"  1   1   0   0
"err4"  0   0   0   1
"err5"  1   1   1   0
"err6"  1   1   0   1
"err7"  0   1   0   0
"err8"  0   1   0   1

Additionally I have a vector of weights for the errors (and more info to
be possibly used). 

Does R have a set of fuctions of a library that will calculate the a
phylogenetic tree from such data? I installed "ape", but, despite
reading the docs, I cannot find a function that would calculate a tree
from data like mine (my sight may be getting worse though). Any
suggestion is welcome.

Regards,

Federico Calboli


-- 
Federico C. F. Calboli

Dipartimento di Biologia Evoluzionistica Sperimentale
Università di Bologna
Via Selmi, 3
40126 Bologna - ITALY

Tel - +39 051 2094187
Fax - +39 051 2094286
f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] how to upload [forwarded]

2004-07-15 Thread Federico Calboli

[ This was posted to the R-packages list (which I moderate), 
  but definitely doesn't belong there.  Martin Maechler 
]

--- start of forwarded message ---
From: Federico Calboli <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: how to upload
Date: 15 Jul 2004 18:30:26 +0100

Dear All,

I just finished a add on lib called Malmig, to calculate Malecot
migration model and associated functions. I am meant, according to
R-exts, to upload the tar.gz at:

ftp://ftp.ci.tuwien.ac.at/incoming

How? never uploaded anything in my life.

Cheers,
Federico Calboli

=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it
--- end of forwarded message ---

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] plotting a table together with graphs

2004-07-13 Thread Federico Calboli

Dear All,

I would like to ask how to add a table to a "matrix" of graphs.

I have three non linear regression graphs plotted together after:

par(mfrow=c(2,2))

which leaves an empty bottom right corner. I would like to use the space
to add a table (at the moment that's problem number one, adding a "nice"
table will come later). I know it is possible to print tables through
LaTeX and the Design/Hmisc libraries, although I would not have a clue
about printing it together with graphs, but I'd like something "quicker"
if at all possible.

Regards,

Federico Calboli
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Sum of Squares in a lme model

2004-05-22 Thread Federico Calboli

On Sat, 2004-05-22 at 13:08, Renaud Lancelot wrote:

> 
> So you might want to use the anova method for lme objects. You probably 
> need to set the argument type to "marginal", which is not the default value.
> 
> Best,
> 
> Renaud

Using anova(my.lme.model), of any type, does not produce the desired MS
and SS. That's despite the help for anova.lme stating:

" When only one fitted model object is present, a data frame with
the sums of squares, numerator degrees of freedom, denominator
degrees of freedom, F-values, and P-values for Wald tests for the
terms in the model (when 'Terms' and 'L' are 'NULL') [they are by
default]..."

I realise this Sum of Squares thing is silly, and fitting a parallel
"lm" model would give me the required SS and MS, but I simply fail to
see why I have to do this considering that the help for anova.lme states
I can get them somehow.  

I do hope I am not ranting too much.

Regards,

Federico Calboli

-- 

=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Sum of Squares in a lme model

2004-05-22 Thread Federico Calboli

On Sat, 2004-05-22 at 03:58, Deepayan Sarkar wrote:

> 
> Exactly what do you wish to square and sum ? If it's the 'errors' (which 
> in this context is ambiguous), extract them (see ?residuals.lme), 
> square them and sum them. But what are you planning to do with this 
> after you get it ?

Make a "oh so nice" anova table for people who have never seen anything
but an anova table with SS and MS and all the standard trappings (sensu
Sokal and Rohlf or Zar), and would get perturbed if they did not get
what they expected, together with the "oh so nice" stars next to the p
value of the F-test...

I know it is silly, I do not make the rules though. Also, I am probably
just ranting.

Fede
-- 

=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Sum of Squares in a lme model

2004-05-21 Thread Federico Calboli

Dear All,

I would like to ask how to get the Sum of Squares from fitted lme model.
I appreciate that lme maximises the likelihood (or REML) and uses
likelihood ratio tests, but I just fail why I could not get the SS if I
want them. I could use lm to calculate them, but it looks quite
pointless to fit a second (and wrong in a way) model for something so
trivial.

I know that the issue has been dealt with before, but I could not find a
clear cut answer searching through the archives.

Regards 

Federico Calboli

-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Re: Debian & R

2004-05-19 Thread Federico Calboli

I have R 1.9.0 on sid, apt-getting stuff from cran... I too noticed that
cran has R 1.8.1 for Debian, but as I got R 1.9.0 I just ignored the
differences in labelling.

BTW, it is R that tells me it is R 1.9.0 when I fire it up. And has the
"stats" library that I did not remember in R 1.8.*

Federico
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] RE: 1-way anova, nested model

2004-05-19 Thread Federico Calboli

>From what I gather you have 

2 treatments
6 samples nested in treatment
12 replicates in sample.

I do not know what your dependent veriable is, but I would:

code the two treatments as T1 and T2
code the 6 samples as a, b, c, d, e, f so the first three samples
belonging to T1 are not confused with the other three (coding them as
1,2,3 for T1 and 1,2,3 for T2 is a bad idea)
code the replicates as a1, a2, b1, b2

here I am being extra paranoid in the coding, btw, but it helps and
saves grief later on, in my experience.

The model would be:

library(nlme) # needed!
mod<-lme(response ~ treatment, random = ~ 1|sample/replicate, mydata)

An anova test would give only one F test for the fixed factor, with 1
and 4 df (if I am not mistaken, double check!)

Federico
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Mixed-effects model for nested design data

2004-04-30 Thread Federico Calboli

quote:

> I am using nlme for data from nested design.  That is, "tows" are
nested
> within "trip",  "trips" nested within "vessel", and "vessels" nested
> within "season".  I also have several covariates, say "tow_time",
> "latitude" and "depth"
> My model is
>y = season + tow_time + latitude + depth + vessel(season) +
> trip(season, vessel) + e
> In SAS, the program would be
> proc mixed NOCLPRINT NOITPRINT data=obtwl.x;
>   class vessel trip tow season depth;
>   model y = season depth latitude /solution;  <--fixed effects
>   random vessel(season) trip(season vessel);
> run;
> My question is:  How this nested mixed-effects model can be 
> fitted in R-> "nlme"?


> I do not know about SAS but I would guess that your model should be
> fitted
> as something like:
> 
> lme (fixed= y ~ season + tow_time + latitude + depth,
>  random= ~ 1 | season/vessel/trip)
> 
> Maybe you should do some reading in the book by Pinheiro & Bates?
> They explain well how to set up models.



I would create a grouped data variable, to avoid having season a both a
random and fixed effect:

your.data$SV<-getGroups(your.data, form=~1|season/vessel, level=2)

the effect is to create a variable that groups vessels %in% season. BTW,
according to your coding of the data, this stem is not always necessary.

HTH

Federico Calboli
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk
fcalboli at alma.unibo.it

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] anova(my.lme.model, type="marginal") in Pinheiro and Bates

2004-04-24 Thread Federico Calboli

Dear All,

I was recently discussing the use of 

anova(my.lme.model, type="marginal")

with a colleague. I would like to find where the subject is discussed in
Pinheiro and Bates 2000. I like the book because it is clear and
accessible even for people with limited formal training in Math and
Stats like me, but I failed to find the subject at hand in the index. 

I realize that reading the book cover to cover would do me a lot of
good, but I would find it a bit impractical at the moment... can anyone
please point out where in the book the use of marginal anova is
discussed (if at all)? 

Regards,

Federico Calboli


-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] tutorial for lme

2004-04-24 Thread Federico Calboli

> In particular, I have some difficulties to understand how
specifications
> for random effects and variance structures
> could be specified.
>

I had similar problems myself... until a reply from Douglas Bates to one
of my posts, wich clarified me how to specify the random effects. You
should be able to find it at:

http://finzi.psych.upenn.edu/R/Rhelp02/archive/14199.html

(or search the R site using the keywords Calboli and Crawley, should be
the first hit). I am not an expert, but you are welcome to contact me
off list for any further clarifications I can offer (not many, do not
hold your breath).

regards,

Federico Calboli


-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 209 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] problems loading package "nlme" in Debian Sid

2004-04-07 Thread Federico Calboli

Dear All,

I apologize if the following has been already asked, but I could not
find anything in the archives.

I am running Debian Sid (on an Athlon XP), with R 1.8.1.cvs.20040307-1
(it describes itself as "R 1.9.0 experimental" when I fire R up) and I
am unable to load the library "nlme" (I apt-getted r-recommended, so the
library is on the system)

>library(nlme)
Error in loadNamespace((i[[1]],c(lib.loc, .libPath()),keep.source):
package "mva" does not have a namespace

In fact, in my /etc/lib/R/library/mva/ does not have a NAMESPACE file.
How can I fix this? 

I hope the fix is not as simple as upgrading to the latest cvs... but
the machine in question cannot be up online until later today, so I
could not test this out. If that's the case, I apologise for the wasted
bandwidth.

Regards,

Federico Calboli
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] italian keyboard and ~ symbol

2004-03-30 Thread Federico Calboli

On Tue, 2004-03-30 at 12:38, Erich Neuwirth wrote:
> Before using the sequence
>  + Num1 + Num2 + Num6 + 
> one has to make sure that the numeric keyboard is in NumLock mode,
> not in cursor mode.
> If it is in cursor mode, this trick will not work
> (at least it does not on my machine)
> 

It worked! Many thanks!

Regards,

Federico


-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

RE: [R] italian keyboard and ~ symbol

2004-03-30 Thread Federico Calboli

On Tue, 2004-03-30 at 11:48, Henrik Bengtsson wrote:
>  + Num1 + Num2 + Num6 +  where Num1, Num2 &
> Num6 are 1, 2 & 6 on the *numeric keyboard* (not the ones above the
> letter keys) will produce ASCII character 126 ("~", tilde) on my WinXP
> Pro in both Rterm and Rgui for R v1.8.1. Do you get anything at all?
> Does it work in any other Windows applications or dialogs (you
> mentioned Word)? What about  + Num6 + Num5 + 
> (should give "A")?

Rather annoyingly it does not work.
> 
> You could also consider remapping your keyboard, i.e. for some
> physical key replace a "useless/never used" character with "~"
> (tilde). Some initial links can be found at
> http://www.maths.lth.se/help/windows/keyboard/.
> 

It appears it is the only option.

Regards,

Federico
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] italian keyboard and ~ symbol

2004-03-30 Thread Federico Calboli

Dear All,

I would like to ask the following:

a colleague is using R 1.8.1 for Windows XP on his laptop. The keboard
is the standard Italian layout, which is missing the ~ (tilde) key. For
reasons unknown, typing   126, which I understand is the
standard ASCII code, does not produce the desired symbol (any
combination of keys seems to fail...)

Can anyone advice how to produce the ~ symbol, short of a copy/paste
from MS Word?

Regards,

Federico Calboli 


-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] colour scheme in: plot(survfit.model)

2004-03-24 Thread Federico Calboli

Dear All,

I would like to ask a question on the colour scheme I can specify for a 

plot(survfit.model).

I have four lines of Drosophila, kept at 3 different temperatures. About
100 individuals per line and temperature were scored, no censoring.

the data looks like:

linetempday status
line1   18  23  1
.



My model (quite trivial) is like:

surv.model<-survfit(Surv(day, status)~line+temp, mydata)

My problem comes when I:

plot(surv.model,col=1:4)

The plot shows three main groupings, because temperature affects
longevity in drosophila. The problem is that the colour used for "line1"
at temperature "18C" is not the same for "line1" at temperature "28C",
which I find extremely confusing. 

How can I specify to keep the same colour for the same line across the
plot?

regards,

Federico Calboli
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] R-business case

2004-03-18 Thread Federico Calboli


> 
> Just to be provocative, it would be best to state the ultimate goals, then
> R users could be of more help.  We have submitted and published articles
> using R and are using R in production work on contracts from
> pharmaceutical companies.  It's difficult to know from the original note
> why we should spend time compiling such data.  Is anyone finding that R
> has some deficiencies with respect to their own work?
> 

All the analysis I do is done with R, so anything published where I am
in the author list is likely to have been done/revised with R.

The fact than R requires me to know what I am doing is more of an
advantage, IMHO, rather than a shortcoming... from a usability
standpoint I never felt I had any problem I could not deal with after
some thinking, or asking the list.

Regards,

Federico Calboli 

-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 4286

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Multidimensional scaling and distance matrices

2004-02-26 Thread Federico Calboli

Dear All,

I am in the somewhat unfortunate position of having to reproduce the
results previously obtained from (non-metric?) MDS on a "kinship" matrix
using Statistica. A kinship matrix measures affinity between groups, and
has its maximum values on the diagonal. 

Apparently, starting with a nxn kinship matrix, all it was needed to do
was to feed it to Statistica flagging that the matrix was NOT a distance
matrix but a kinship one. If Statistica transformed the kinship matrix
into a distance one (how?) is anybody's guess. 

A quick search immediately showed that a multidimensional scaling is
done on a distance matrix. See for instance:
MASS4, pg 304
"Elements of computational statistics", Jentle, pg 122
Edwards and Oman's article, page 2-7 R-News 3/3 

The fact that Statistica happily perform MDS on a "kinship" matrix is
puzzling. Indeed, I would expect errors, as in the following toy
example, without transforming the kinship matrix to distances:

> test
   V1  V2  V3  V4  V5
1 0.198716340 0.003612042 0.011926851 0.019737349 0.015021053
2 0.003612042 0.066742885 0.013809924 0.005121996 0.011175845
3 0.011926851 0.013809924 0.197337389 0.013893087 0.006405424
4 0.019737349 0.005121996 0.013893087 0.216047450 0.006218477
5 0.015021053 0.011175845 0.006405424 0.006218477 0.118812936

cmdscale(test)
   [,1] [,2]
V1  NaN  NaN
V2  NaN  NaN
V3  NaN  NaN
V4  NaN  NaN
V5  NaN  NaN
Warning messages:
1: some of the first 2 eigenvalues are < 0 in: cmdscale(test)
2: NaNs produced in: sqrt(ev)
> isoMDS(test)
Error in isoMDS(test) : NAs/Infs not allowed in d
> sammon(test)
Error in sammon(test) : initial configuration must be complete
In addition: Warning messages:
1: some of the first 2 eigenvalues are < 0 in: cmdscale(d, k)
2: NaNs produced in: sqrt(ev)


The colleagues who used the above routine are unable to tell me with
certainty whether Statistica used metric/non metric scaling, and if non
metric whether a Kruskall or a Sammon scaling. 

In any case, I would simply like to ask the memebers of the list if I am
correct in thinking that MDS can ONLY be performed on a distance matrix,
and I can therefore reasonably expect that some form of transformation
to a distance matrix has been performed by Statistica prior to the MDS.
It would at least be a first step to understand what exactly Statistica
did with the data.

Regards,

Federico Calboli
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 208

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] Multidimensional scaling and distance matrices

2004-02-26 Thread Federico Calboli

On Thu, 2004-02-26 at 12:35, Christian Hennig wrote:
> Hi,
> 
> usually the term MDS is used for methods which operate only on
> dissimilarity matrices. A similarity matrix s can be easily transformed
> into a dissimilarity matrix d by taking d <- max(s)-s, which could be
> considered as kind of a canonical standard to do this.
> 
> It seems like the R-MDS methods give errors because your diagonals are
> larger and should be smaller than anything else for dissimilarities.
> 
> I am not familiar with kinship matrices. You may try MDS on
> max(test)-test, but because the diagonals in your matrix are not equal I
> presume that there is another a bit more subtle standard routine to
> convert kinship matrices into dissimilarities, maybe something like  
> (raw, not R) d(i,j)=1-s(i,j)^2/(s(i,i)s(j,j)).
> 
I am happy with the function "dist" in {mva}, and I know there are other
functions in {cluster}, but it's besides the point. The question that is
nagging me is: is it justified to do a form of MDS on a matrix other
than a distance matrix? the reference I pointed out to do say to use a
distance matrix, but do not explicitely say "all else is wrong", so I
could call it a day.

> Did you consider the Statistica manual? It should tell you...

If I had it... in any case I hoped that the people that used Statistica
in the first place did read the manual before going forth in their
analysis. Now we are stuck in a situation where we do not know what
Statistica is actually doing, and I have to convince people that doing
things with R is going to be a better (= more sensible) option. 
 
Regards,

Federico Calboli

-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 208

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] plot(my.procrustes.model) from library {vegan}

2004-02-23 Thread Federico Calboli


> The `plot.procrustes' function really should be more user-friendly and
> flexible. You should contact its author and ask for amendments.

I honestly though my problem was too trival to bother the author in
person, and I thought that getting an answer would leave it in a public
database, as future reference. I hope I did not cause undue
inconvenience.

>  However,
> you have access to the internal results of the Procrustes analysis, and
> you can produce any kind of plots with them. The needed matrices are
> stored as items mod.pro$X and mod.pro$Yrot in your result object
> mod.pro. The  following should do what you asked for:
> 
> plot(mod.pro$X, asp=1, pch=1)
> points(mod.pro$Yrot, pch=2)
> segments(mod.pro$X[,1], mod.pro$X[,2], mod.pro$Yrot[,1],
> mod.pro$Yrot[,2])

The above solves my problem. Thanks for your help.

Best regards,

Federico Calboli


-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 208

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] plot(my.procrustes.model) from library {vegan}

2004-02-23 Thread Federico Calboli

Dear All,

I would like to ask how to customize the graph corresponding to a
procrustes analysis.

I have to distance matrices, that I transform to two set of coordinates
by means of muti dimensional scaling:

library(mva)

c1<-cmdscale(mat.dist1)
c2<-cmdscale(mat.dist2) 

I vant to rotate c2 on c1, and I use the "procrustes" analysis from the
{vegan} library.

library(vegan)

mod.pro<-procrustes(c1,c2)

plot(mod.pro)

My problem is that I need to change the graphical output of
plot(mod.pro). The standard output gives empty circles for the rotated
c2 matrix and arrows pointing to the target c1 matrix. 

I would like to have two set of symbols, one for c1 and the other for
c2, with a line connecting the corresponding points or at worst I would
like to have circles for the point of the target matix and the arrows
pointing to the rotated matrix.

I tried some manipulations based on the standard "plot" function, but I
was quite unsucessful. Can anyone give advice? I am happy to give a toy
example if needed.

Regards,

Federico Calboli 
-- 



=

Federico C. F. Calboli

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 208

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] left eigenvector

2004-02-12 Thread Federico Calboli

Dear All,

how do I compute the left eigenvector of a matrix? I gather that "eigen"
computes the right eigenvectors...

Regards,

Federico Calboli
-- 



=

Federico C. F. Calboli

PLEASE NOTE NEW ADDRESS

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 208

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] logical comparison of two matrices

2004-01-24 Thread Federico Calboli

Dear All,

how can I get a logical comparison between matrices (or vectors) in a if
statement?

Whenever I try I get the following:


> S<-rbind(c(.25,0,0),c(0,.2,0),c(0,0,.1))
> P<-rbind(c(.75,.15,.01),c(.2,.8,.09),c(.05,.05,.9))
>
>
> aa<-function(S,P){
+ if (S == P){
+ return("OK")
+ }
+ else {
+ return("No match")
+ }
+ }
>
>
> aa(S,P)
[1] "No match"
Warning message:
the condition has length > 1 and only the first element will be used in:
if (S == P) {

The warning clearly states that only the first element was used, and
this would not be good enough.

If comparing the whole matrices is not possible I could be happy just
comparing the two diagonals.

regards,

Federico Calboli



-- 



=

Federico C. F. Calboli

PLEASE NOTE NEW ADDRESS

Dipartimento di Biologia
Via Selmi 3
40126 Bologna
Italy

tel (+39) 051 209 4187
fax (+39) 051 251 208

f.calboli at ucl.ac.uk

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

1 2 >

1 - 100 of 129 matches

Mail list logo