[R] Estimating bivariate normal density with constraints

2011-10-19 Thread Serguei Kaniovski

Dear R-Users

I would like to estimate a constrained bivariate normal density, the
constraint being that the means are of equal magnitude but of opposite
sign. So I need to estimate four parameters:

mu (the mean vector is (mu, -mu))
sigma_1 and sigma_2 (the two standard deviations)
rho (the correlation coefficient)

I have looked at several packages, including Gaussian mixture models in
Mclust, but I am not sure what is the best way, or the best package to use
for this task.
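
One generic route, sketched here under assumptions rather than taken from any particular package, is direct maximum likelihood with optim(). The data below are simulated, and the names negloglik, x and y, as well as the log and atanh reparameterisations, are illustrative choices only.

```r
# Negative log-likelihood of a bivariate normal with mean vector (mu, -mu).
# par = (mu, log(sigma_1), log(sigma_2), atanh(rho)); the transformations
# keep sigma_1, sigma_2 > 0 and -1 < rho < 1 without explicit bounds.
negloglik <- function(par, x, y) {
  mu  <- par[1]
  s1  <- exp(par[2])
  s2  <- exp(par[3])
  rho <- tanh(par[4])
  z1 <- (x - mu) / s1
  z2 <- (y + mu) / s2          # the second mean is -mu
  quad <- (z1^2 - 2 * rho * z1 * z2 + z2^2) / (1 - rho^2)
  sum(log(2 * pi) + log(s1) + log(s2) + 0.5 * log(1 - rho^2) + 0.5 * quad)
}

# Simulated data (assumed truth: mu = 1, sigma_1 = 1, sigma_2 = 2, rho = 0.5)
set.seed(123)
n  <- 5000
e1 <- rnorm(n)
e2 <- 0.5 * e1 + sqrt(1 - 0.5^2) * rnorm(n)
x  <-  1 + 1 * e1
y  <- -1 + 2 * e2

# Moment-based starting values, then a quasi-Newton search
start <- c(mean(x - y) / 2, log(sd(x)), log(sd(y)), atanh(cor(x, y)))
fit <- optim(start, negloglik, x = x, y = y, method = "BFGS")

# Back-transform to the four parameters of interest
est <- c(mu = fit$par[1], sigma_1 = exp(fit$par[2]),
         sigma_2 = exp(fit$par[3]), rho = tanh(fit$par[4]))
print(round(est, 3))
```

Passing hessian = TRUE to optim() would additionally give a basis for approximate standard errors on the transformed scale.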

Greatly appreciate any suggestions!

Serguei Kaniovski

Austrian Institute of Economic Research (WIFO)

P.O. Box 91, 1103 Vienna, Austria
Tel.: +43-1-7982601-231
Fax: +43-1-7989386

Mail: serguei.kaniov...@wifo.ac.at
http://www.wifo.ac.at/Serguei.Kaniovski

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Accessing variables in a data frame

2011-06-27 Thread Serguei Kaniovski
Thanks! I did not realize you can access variables by name like this.

Serguei



[R] Accessing variables in a data frame

2011-06-26 Thread Serguei Kaniovski

Hello

My data.frame (dat) contains many variables named var.name and others
named var.name_var.id.

For example:

var.name <- c("gdp","inf","unp")
var.id <- c("w","i")

x <- paste(var.name, rep(var.id, each=length(var.name)), sep="_")

How can I access variables in the data.frame by the names listed in x, for
example to compute

gdp_w - gdp_i
inf_w - inf_i
unp_w - unp_i

or

gdp - gdp_w
inf - inf_w
unp - unp_w

without needing to code each difference separately?
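
A minimal sketch of one way to do this: character vectors built with paste() can be used directly as column indices. The data frame below is hypothetical and only follows the naming scheme described above; the result names are made up for illustration.

```r
# Hypothetical data frame following the naming scheme in the post
dat <- data.frame(gdp = c(3, 4), inf = c(2, 1), unp = c(5, 6),
                  gdp_w = c(1.0, 2.0), inf_w = c(0.5, 0.7), unp_w = c(4, 5),
                  gdp_i = c(0.5, 1.5), inf_i = c(0.2, 0.1), unp_i = c(3, 2))

var.name <- c("gdp", "inf", "unp")

# Column names built with paste() index the data frame directly
w_cols <- paste(var.name, "w", sep = "_")
i_cols <- paste(var.name, "i", sep = "_")

# Data frames of equal shape are subtracted column by column
diff_wi   <- dat[w_cols] - dat[i_cols]     # gdp_w - gdp_i, inf_w - inf_i, ...
diff_base <- dat[var.name] - dat[w_cols]   # gdp - gdp_w, inf - inf_w, ...

names(diff_wi)   <- paste(var.name, "w_minus_i", sep = "_")
names(diff_base) <- paste(var.name, "minus_w", sep = "_")
print(diff_wi)
```

For a single column, dat[["gdp_w"]] or dat[[x[1]]] returns the vector itself.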

Thanks for your help!
Serguei Kaniovski


[R] Canonical link for the GLM gamma model

2009-08-19 Thread Serguei Kaniovski
Hello!

When I estimate a glm gamma model using the canonical link (inverse), I 
get the opposite signs on almost all coefficients compared to the same 
model (i.e. with the same linear predictor) estimated using other suitable 
links (log, logit).

What confuses me is that most sources quote the canonical link for the 
gamma model as 1/mu, but some quote -1/mu!

For example: 
http://support.sas.com/rnd/app/da/new/802ce/stat/chap5/sect19.htm

Which is the correct link, and can this explain the sign-flips I get in my 
models?
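
A small simulation may help here. In the exponential-family form of the gamma density the canonical parameter is -1/mu, while glm() in R uses 1/mu for the "inverse" link; the two differ only in the sign of the coefficients. Independently of that convention, 1/mu decreases as mu increases, so coefficients under the inverse link are expected to have the opposite sign to those under the log link. The data and variable names below are invented for illustration.

```r
set.seed(1)
n  <- 2000
x  <- runif(n, 0, 2)
mu <- exp(0.5 + 0.5 * x)                    # the mean increases with x
y  <- rgamma(n, shape = 5, rate = 5 / mu)   # gamma response with mean mu

fit_inv <- glm(y ~ x, family = Gamma(link = "inverse"))
fit_log <- glm(y ~ x, family = Gamma(link = "log"))

print(coef(fit_inv)["x"])   # negative: 1/mu falls as x rises
print(coef(fit_log)["x"])   # positive: log(mu) rises with x
```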

Thank you very much for your help / advice!
Serguei



[R] Help on a combinatorial task (lists?)

2009-08-11 Thread Serguei Kaniovski
Hello!
I have the following combinatorial problem.
Consider the cumulative sums of all permutations of a given weight vector
'w'. I need to know how often the weight in a certain position brings the
cumulative sum to or above the given threshold 'q'. In other words,
how often is each weight decisive in raising the cumulative sum to 'q' or above?

Here is what I do:

library(combinat)  # provides permn()

w <- c(3,2,1)  # vector of weights
q <- 4  # threshold

# computes which coordinate of w is decisive in each permutation
res <- sapply( permn(w), function(x) which(w == x[min(which(cumsum(x) >= q))]) )

# compiles the frequencies
prop.table( tabulate( res ))


The problem I have is that when the weights are not unique, which()
returns more than one index, so sapply() returns a list as opposed to a
vector. I don’t know how to proceed when this happens, as tabulate does
not work on lists.

The answer, of course, should be that equal weights are “decisive” equally 
often.
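
One way around the ambiguity, sketched below, is to permute the indices of w rather than the weights themselves, so that tied weights stay distinguishable. The helper perms() is a small base-R stand-in written for this sketch (it is not permn() from combinat), and the sketch assumes that sum(w) is at least q.

```r
# All permutations of a vector, returned as a list (base R, no packages)
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (k in seq_along(v)) {
    for (p in perms(v[-k])) out[[length(out) + 1]] <- c(v[k], p)
  }
  out
}

# Share of orderings in which each position of w is decisive for threshold q
pivot_freq <- function(w, q) {
  pivot <- sapply(perms(seq_along(w)), function(p) {
    p[min(which(cumsum(w[p]) >= q))]   # index of the weight that reaches q
  })
  prop.table(tabulate(pivot, nbins = length(w)))
}

print(pivot_freq(c(3, 2, 1), 4))   # 4/6, 1/6, 1/6
print(pivot_freq(c(3, 2, 2), 4))   # the two equal weights get equal shares
```

Because the function tabulates indices, the result always has one entry per element of w, including entries that are never decisive.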


Can you help?
Thanks a lot!

Serguei Kaniovski



Re: [R] Help on a combinatorial task (lists?)

2009-08-11 Thread Serguei Kaniovski
Simple unlist() will not do. In the case of repeated weights, unlike
permutations of indices permn(1:length(w)), some permutations of weights
are identical.


E.g. for w <- c(3,2,2), the permutations of indices c(1,2,3) and c(1,3,2)
are indistinguishable.


I think I have corrected the algorithm, but now I am stuck with a rather
trivial list manipulation at the very end.


library(combinat)

w <- c(5,3,2,1)
i <- 1:length(w)
q <- 7

res <- sapply( permn(i), function(x) min(which(cumsum(w[x]) >= q)) )

Now I have the vector 'res' (of size length(permn(i))), and I need to
extract from each entry of the list produced by permn(i) the element
with the index stored in 'res'.


E.g. the first three entries:

> permn(i)[1:3]
[[1]]
[1] 1 2 3 4

[[2]]
[1] 1 2 4 3

[[3]]
[1] 1 4 2 3

...

> res[1:3]
[1] 2 2 3

...

The answer should be 3, 4, 3 ...

Thanks again for your help!

Serguei K


jim holtman wrote:
> Does 'unlist' do it for you:
>
> w <- c(3,3,2,1)  # vector of weights
> q <- 4  # threshold
>
> # computes which coordinate of w is decisive in each permutation
> res <- unlist(sapply( permn(w), function(x) which(w ==
>     x[min(which(cumsum(x) >= q))]) ))
>
> # compiles the frequencies
> prop.table( tabulate( res ))
> [1] 0.4 0.4 0.1 0.1




[R] Problem with contrast

2009-07-29 Thread Serguei Kaniovski


Hello All,

I am trying to estimate a generalized linear model using a single dummy
variable (bilat). I want to use contr.sum, in which (please correct me if I
am wrong) the implicit coefficient on the contrast equals the negative of
the sum of all estimated coefficients.

I cannot get the contrast to work correctly. The implicit coefficient is
way too large (I checked this using the desmat - option dev - package in
Stata). I do:

contrasts(bilat) <- contr.sum(levels(bilat))
mod_1 <- gamlss(d ~ bilat, family=BEINF, data=df)

but this problem is not specific to gamlss. The same problem also occurs,
e.g., in

contrasts(bilat) <- contr.sum(levels(bilat))
mod_2 <- lm(d ~ bilat, data=df)

What am I doing wrong? Thank you for help!
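
One thing worth checking, offered as a guess rather than a diagnosis: in the code as posted the contrasts are set on a free-standing bilat, while the models are fitted with data=df. If bilat is a column of df, that column keeps its default treatment contrasts. The sketch below uses invented data and sets the contrasts on df$bilat itself; the implied coefficient of the last level is then minus the sum of the estimated ones.

```r
set.seed(1)
# Invented data: three levels with group means near 1, 2 and 4
df <- data.frame(bilat = factor(rep(c("a", "b", "c"), each = 20)))
df$d <- c(1, 2, 4)[df$bilat] + rnorm(60, sd = 0.1)

# Set sum-to-zero contrasts on the column that the model actually uses
contrasts(df$bilat) <- contr.sum(nlevels(df$bilat))
mod <- lm(d ~ bilat, data = df)

b <- coef(mod)[-1]            # effects of levels "a" and "b"
implied_last <- -sum(b)       # implied effect of level "c"

# Intercept plus implied effect reproduces the mean of the last group
print(unname(coef(mod)[1] + implied_last))
print(mean(df$d[df$bilat == "c"]))
```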

Serguei




Re: [R] Compute correlation matrix for panel data with specific ordering

2009-06-29 Thread Serguei Kaniovski
I apologize for not being specific enough in my previous posting. Assume 
you have panel data in the form:


df <- data.frame( cbind( rep( c( "AUT" , "BEL" , "DEN" , "GER" ) , 4) ,
cbind( rep( c( 1999 , 2000 , 2001 , 2002 ) , 4 ) ), sample( 10 , 16 ,
replace=T) ) )

names(df) <- c( "country" , "year" , "x" )

1. I would like to compute the correlation matrix between countries 
based on the annual observations of the variable x. I tried the following:

library( combinat )

temp <- split( df$x, df$year )
apply( combn(4,2) , 2 , function(x) cor( temp[[1]] , temp[[2]] ) )

This gives the wrong answer. Why?

2. The pairwise correlations computed as above should be in the order:

GER with BEL, GER with DEN, GER with AUT, BEL with DEN, BEL with AUT, 
DEN with AUT.


That is, the correctly sorted vector of factors is:

SORT <- c( "GER" , "BEL" , "DEN" , "AUT" ) not c( "AUT" , "BEL" , "DEN"
, "GER" )


Maybe there is an altogether better way of achieving what I want?
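
A possible reading of the problem, with a sketch of one alternative. In the posted apply() call the function ignores its argument x and always correlates temp[[1]] with temp[[2]], and the split is by year rather than by country. In addition, rep(..., 4) applied to both vectors pairs each country with a single year, and cbind() turns every column into character. The sketch below builds a comparable panel under different assumptions (each = 4 for countries, rnorm() instead of sample() to avoid constant series), reshapes it to one column per country, and reorders the correlation matrix by SORT.

```r
set.seed(1)
countries <- c("AUT", "BEL", "DEN", "GER")
df <- data.frame(country = rep(countries, each = 4),
                 year    = rep(1999:2002, times = 4),
                 x       = rnorm(16),
                 stringsAsFactors = FALSE)
SORT <- c("GER", "BEL", "DEN", "AUT")

# One column per country, rows aligned by year
df   <- df[order(df$country, df$year), ]
wide <- sapply(split(df$x, df$country), identity)

# Correlation matrix with rows and columns in the order given by SORT
cmat <- cor(wide)[SORT, SORT]
print(round(cmat, 3))

# Pairwise correlations in the order GER-BEL, GER-DEN, GER-AUT,
# BEL-DEN, BEL-AUT, DEN-AUT (column-major lower triangle)
pairs_vec <- cmat[lower.tri(cmat)]
```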

Serge



Re: [R] Compute correlation matrix for panel data with specific ordering

2009-06-28 Thread Serguei Kaniovski
OK, I see how to sort the factors, but how do I compute the correlation
matrix in a repeated-observations dataset (see the first part of my
question)?


Thanks again for your help!
Serge



[R] Compute correlation matrix for panel data with specific ordering

2009-06-27 Thread Serguei Kaniovski

Hello All,

I have panel data - here is a small-scale example:

df <-
data.frame(cbind(rep(c("AUT","BEL","DEN","GER"),4),cbind(rep(c(1999,2000,2001,2002),4)),sample(10,16,replace=T)))

names(df) <- c("country","year","x")

SORT <- c("GER","BEL","DEN","AUT")

I need to compute the correlation between countries in the variable x
in such a way that the rows and columns of the resulting correlation
matrix are not in alphabetical order but in the order of a given
factor vector - here SORT.


How can I do this? Greatly appreciate any help!

Serguei Kaniovski



[R] Constrained corr matrix closest to a given corr matrix

2009-06-26 Thread Serguei Kaniovski

Dear All!

Is there any code to find a constrained correlation matrix closest in
some sense (preferably in the sense of the geometric approximation) to a
given correlation matrix? By constrained I mean with some elements
constrained, or, simply, set to zero. I have read a paper that says
this can be accomplished with Lagrange multipliers and an optimization
routine.


Thanks a lot!
Serguei Kaniovski



[R] Linear constraints for contrasts

2009-05-20 Thread Serguei Kaniovski

Dear List!

How can I define contrasts (a design matrix) that can all be included,
i.e. which do not require a control category to be dropped? My application
(see below) does not suggest a sensible control category. I am thinking
of constraining the (treatment) contrasts to sum to zero and dropping
the constant term in the regression. Is this a good idea? If yes, how can
I achieve this in R?


I am estimating a GLM for bilateral country data. Each observation is on
a pair of countries, e.g. GER_USA, GER_JAP, USA_JAP. I constructed the
following contrasts: d_GER, d_USA, d_JAP, which take the value of 1 when
the country is in the pair and 0 otherwise, i.e.

“Bilat”, “d_GER”, “d_USA”, “d_JAP”
GER_USA, 1, 1, 0
GER_JAP, 1, 0, 1
USA_JAP, 0, 1, 1
These contrasts highlight the effect of having a given country in the pair.

Thank you for your help!
Serguei Kaniovski



[R] Dummy (factor) based on a pair of variables

2009-04-18 Thread Serguei Kaniovski


Dear All!

My data are on pairs of countries, i and j, e.g.:

y,i,j
1,AUT,BEL
2,AUT,GER
3,BEL,GER

I would like to create a dummy (indicator) variable for use in regression
(using factor?), such that it takes the value of 1 if the country is in the
pair (i.e. EITHER an i-country OR a j-country).

Thank you for your help,
Serguei



Re: [R] Dummy (factor) based on a pair of variables

2009-04-18 Thread Serguei Kaniovski

Bernardo: this is not quite what I am looking for,

Let the data be:
y,i,j
1,AUT,BEL
2,AUT,GER
3,BEL,GER

then the dummies should look like:

y,i,j,d_AUT,d_BEL,d_GER
1,AUT,BEL,1,1,0
2,AUT,GER,1,0,1
3,BEL,GER,0,1,1

I can generate the above dummies, but can this design be input into a
regression model directly?
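
For reference, a minimal sketch of how such a dummy matrix can be built from the two country columns. The object names dat, countries and D are chosen for this illustration.

```r
dat <- data.frame(y = c(1, 2, 3),
                  i = c("AUT", "AUT", "BEL"),
                  j = c("BEL", "GER", "GER"),
                  stringsAsFactors = FALSE)

# All countries that occur on either side of a pair
countries <- sort(unique(c(dat$i, dat$j)))

# One 0/1 column per country: 1 if the country is the i- or the j-member
D <- sapply(countries, function(cc) as.integer(dat$i == cc | dat$j == cc))
colnames(D) <- paste0("d_", countries)

print(cbind(dat, D))
```

A matrix such as D can appear on the right-hand side of a model formula, e.g. lm(y ~ D) on a larger data set. Since every row of D sums to 2, its columns are collinear with an intercept, so one coefficient would be aliased unless the intercept is dropped or a constraint is imposed.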


Serguei



[R] Implementing a linear restriction in lm()

2008-12-24 Thread Serguei Kaniovski


Dear All!

I want to test the coefficient restriction beta=1 in a univariate model
lm(y~x). Entering lm((y-x)~1) does not help, since the anova test requires
the same dependent variable. What is the right way to proceed?
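
One common device, sketched here on simulated data, is to keep y as the response and impose the restriction through offset(x), which fixes the coefficient on x at 1. The restricted and unrestricted fits then share the same dependent variable and can be compared with anova(). The object names are illustrative.

```r
set.seed(1)
x <- rnorm(100)
y <- 2 + 1 * x + rnorm(100)        # simulated with a true slope of 1

fit_u <- lm(y ~ x)                 # unrestricted model
fit_r <- lm(y ~ 1 + offset(x))     # restricted model: slope fixed at 1

a <- anova(fit_r, fit_u)           # F test of the restriction beta = 1
print(a)

# Equivalent t statistic computed from the unrestricted fit
t_stat <- (coef(fit_u)["x"] - 1) / sqrt(vcov(fit_u)["x", "x"])
print(unname(t_stat)^2)            # equals the F statistic above
```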

Thank you for your help and merry Xmas,
Serguei Kaniovski



Re: [R] Vertex enumeration and center of mass for convex polytopes

2008-10-08 Thread Serguei Kaniovski

Hi Moshe,

In my problem the polytope is defined not by its vertices, but by a set of 
linear equations and inequalities.

Serguei



Moshe Olshansky wrote:

Hi,

If you know that all your points represent vertices of a convex polygon you do 
not need any special package.
The center of mass is just the mean of the coordinates.
To enumerate the vertices, compute the vectors from the center of mass to all 
the vertices. Using atan2 function compute the arguments of all these vectors 
(between 0 and 2*pi) and number the points according to their argument.
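
The ordering step described above can be sketched as follows for a planar point set. The points are invented for the example. Note that the mean of the vertices is the vertex centroid; it coincides with the centre of mass of the filled polygon only in special cases such as regular or centrally symmetric polygons, but any interior point works as a reference for the angular ordering.

```r
# Vertices of a unit square, given in arbitrary order
pts <- rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1))

ctr <- colMeans(pts)                       # mean of the vertex coordinates

# Angle of each vertex seen from the centre, mapped to [0, 2*pi)
ang <- atan2(pts[, 2] - ctr[2], pts[, 1] - ctr[1]) %% (2 * pi)

ord <- order(ang)                          # counter-clockwise numbering
print(pts[ord, ])
```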




[R] Vertex enumeration and center of mass for convex polytopes

2008-10-01 Thread Serguei Kaniovski
Dear All!

I am looking for a package that contains routines for vertex enumeration
and center of mass computation for convex polytopes.

Thanks in advance,
Serguei Kaniovski



[R] Fitdistr of a 3 parameter Gamma distribution

2008-06-24 Thread Serguei Kaniovski
Hello,

How can I fit a three-parameter, as opposed to a two-parameter, Gamma
distribution, i.e. one in which shape and rate are not tied?

Thanks,
Serguei



[R] Kappa distribution

2008-05-29 Thread Serguei Kaniovski
Hello all,

I am looking for an R implementation of the four-parameter kappa
distribution (cdf, pdf, quantile function, and random numbers), or at a
minimum, the generalized logistic distribution.

Any suggestions?

Thank you very much,
Serguei



[R] Help with factors

2008-05-14 Thread Serguei Kaniovski
Hello All,

I have difficulties understanding how factors work in R. Suppose I have
data in the panel form below. I would like to compute a correlation
coefficient (actually apply a different function of two time series) in the
V variable between members of the two sexes in each city over time. How can
this be done?

Thank you in advance,
Serguei

city, year, sex, V
1, 1975, 1, 25.3044
1, 1975, 0, 16.5711
1, 1976, 0, 16.6072
1, 1976, 1, 24.2841
1, 1977, 0, 14.8838
1, 1977, 1, 24.8124
1, 1978, 1, 23.0570
1, 1978, 0, 14.5627
1, 1979, 1, 21.2071
1, 1979, 0, 13.5277
2, 1975, 1, 62.4457
2, 1975, 0, 26.9745
2, 1976, 1, 67.3025
2, 1976, 0, 31.4600
2, 1977, 1, 53.0577
2, 1977, 0, 25.1941
2, 1978, 0, 23.3694
2, 1978, 1, 40.1452
2, 1979, 1, 44.5686
2, 1979, 0, 23.4042
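
A minimal sketch of one approach, using the data above: split the data frame by city, sort each piece by year so the two series are aligned, and apply the two-series function (cor() here, as a stand-in for whatever function is actually needed).

```r
d <- read.csv(text = "city,year,sex,V
1,1975,1,25.3044
1,1975,0,16.5711
1,1976,0,16.6072
1,1976,1,24.2841
1,1977,0,14.8838
1,1977,1,24.8124
1,1978,1,23.0570
1,1978,0,14.5627
1,1979,1,21.2071
1,1979,0,13.5277
2,1975,1,62.4457
2,1975,0,26.9745
2,1976,1,67.3025
2,1976,0,31.4600
2,1977,1,53.0577
2,1977,0,25.1941
2,1978,0,23.3694
2,1978,1,40.1452
2,1979,1,44.5686
2,1979,0,23.4042")

# One value per city: correlation over time between the two sexes
cor_by_city <- sapply(split(d, d$city), function(g) {
  g <- g[order(g$year), ]                  # align the two series by year
  cor(g$V[g$sex == 1], g$V[g$sex == 0])
})
print(cor_by_city)
```

This assumes every city has exactly one observation per year and sex; with gaps, the two series would need to be merged by year first.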



[R] Correlation matrix for data in long format

2008-01-29 Thread Serguei Kaniovski
Hello,

I cannot figure out how to use tapply to compute the correlation matrix
in the variable x between the states. The data is in long format, e.g.:

state,year,x
Alabama,2001,0.45
Alabama,2002,0.47
Alabama,2003,0.48
Alabama,2004,0.44
Arizona,2001,0.34
Arizona,2002,0.32
Arizona,2003,0.38
Arizona,2004,0.36
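
A sketch of how tapply() can be used here: cross-tabulating x by year and state gives a year-by-state matrix, and cor() on that matrix returns the correlations between states. Using mean as the summary function is an assumption that is harmless when each state-year cell holds a single value.

```r
d <- read.csv(text = "state,year,x
Alabama,2001,0.45
Alabama,2002,0.47
Alabama,2003,0.48
Alabama,2004,0.44
Arizona,2001,0.34
Arizona,2002,0.32
Arizona,2003,0.38
Arizona,2004,0.36", stringsAsFactors = FALSE)

# Rows are years, columns are states
wide <- tapply(d$x, list(d$year, d$state), mean)

cmat <- cor(wide)     # correlation matrix between states
print(cmat)
```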

Thank you in advance for your help,
Serguei Kaniovski



[R] between sum of squares for clusters

2008-01-16 Thread Serguei Kaniovski


Hello,

How can I compute the between sum of squares for the clusters obtained by
kmeans for the example below? I would like to compute the SSB-SSW based
information measures for the clusters, as implemented in cclust, but I am
getting an error message.

Serguei

x1,x2,x3,x4,x5,cl
1.3,0.2,-1.2,3.2,-2.5,1
1.3,-0.4,-1.2,2.8,-1.8,1
-1.4,-0.3,0.7,-0.9,0.5,2
-1,-0.6,1.3,-0.6,0.9,2
-1.4,-0.1,0.8,-0.5,1.1,2
-1,-0.6,0.9,-0.4,1.4,2
0.8,0.9,-1,0.3,-1.1,3
-1.4,-0.8,2.8,-1.1,1.1,4
-0.5,-0.3,-0.1,0.5,-0.3,5
-0.1,1,0,0.1,0.2,5
-0.1,-0.3,-0.4,-0.2,0.4,5
-1.4,2.8,0.7,-0.9,0.6,6
-1,1.8,0.6,-1,1.7,6
0.4,-0.4,0.9,-0.7,0.4,7
0.4,-0.4,0.8,-0.6,0.8,7
0.4,-0.6,0.7,-0.2,1,7
-1.4,1.9,0.6,-0.4,-0.2,8
-1.4,1.7,0.3,-0.4,-0.1,8
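
For a given assignment such as the cl column above, the between and within sums of squares can be computed by hand. This is a sketch using the usual definitions (SSB as the size-weighted squared distance of cluster centres from the grand mean, SSW as the squared distance of points from their own centre), not a reproduction of what cclust does internally.

```r
d <- read.csv(text = "x1,x2,x3,x4,x5,cl
1.3,0.2,-1.2,3.2,-2.5,1
1.3,-0.4,-1.2,2.8,-1.8,1
-1.4,-0.3,0.7,-0.9,0.5,2
-1,-0.6,1.3,-0.6,0.9,2
-1.4,-0.1,0.8,-0.5,1.1,2
-1,-0.6,0.9,-0.4,1.4,2
0.8,0.9,-1,0.3,-1.1,3
-1.4,-0.8,2.8,-1.1,1.1,4
-0.5,-0.3,-0.1,0.5,-0.3,5
-0.1,1,0,0.1,0.2,5
-0.1,-0.3,-0.4,-0.2,0.4,5
-1.4,2.8,0.7,-0.9,0.6,6
-1,1.8,0.6,-1,1.7,6
0.4,-0.4,0.9,-0.7,0.4,7
0.4,-0.4,0.8,-0.6,0.8,7
0.4,-0.6,0.7,-0.2,1,7
-1.4,1.9,0.6,-0.4,-0.2,8
-1.4,1.7,0.3,-0.4,-0.1,8")

X  <- as.matrix(d[, 1:5])
cl <- d$cl

grand   <- colMeans(X)                     # overall centre
n_k     <- as.vector(table(cl))            # cluster sizes
centers <- rowsum(X, cl) / n_k             # cluster means, one row per cluster

ssb <- sum(n_k * rowSums(sweep(centers, 2, grand)^2))   # between clusters
ssw <- sum((X - centers[as.character(cl), ])^2)         # within clusters
tss <- sum(sweep(X, 2, grand)^2)                        # total

print(c(SSB = ssb, SSW = ssw, TSS = tss))
```

The decomposition TSS = SSB + SSW serves as a check on the computation.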


[R] Quick question on kmeans

2008-01-15 Thread Serguei Kaniovski
Hello,

when kmeans draws random vectors for the initial centroids, does it then 
select the clustering that has emerged most frequently?
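
As far as the behaviour of kmeans() in the stats package goes, a call with a single random start simply returns the result of that run. With nstart greater than 1, the run with the smallest total within-cluster sum of squares is kept, not the clustering that occurred most often. A small sketch on invented data:

```r
set.seed(1)
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))

# 25 random starts; the returned fit is the best of them by tot.withinss
fit <- kmeans(x, centers = 2, nstart = 25)

print(fit$tot.withinss)
print(table(fit$cluster))
```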

Serguei



[R] Permutations of variables in a dataframe

2008-01-14 Thread Serguei Kaniovski
Hello All,

I would like to apply a function to all permutations of variables in a
dataframe (except the first). What is the best way to achieve this?

I produce the permutations using:

nvar <- ncol(dat) - 1
perms <- as.matrix( expand.grid(rep( list(1:0) , nvar ))[ , nvar:1] )

Thanks in advance
Serguei

Test-dataframe, comma-delimited:

code,wav,w,area,gdp,def,pop,coast,milspend,agr
aut,5,10,83.87,26.39,-1.29,8.07,0,0.72,1.81
bel,1,12,30.53,24.87,-0.28,10.29,0.07,1.29,1.09
bul,7,10,110.91,2.14,1.22,8.03,0.35,1.46,10.88
cyp,6,4,9.25,14.65,-3.26,0.7,0.65,2.11,3.2
cze,6,12,78.87,6.88,-4.44,10.26,0,2,3.19
dnk,2,7,43.09,32.75,2.05,5.34,7.31,1.53,1.98
est,6,4,45.23,5.15,0.82,1.38,3.79,1.58,3.91
fin,5,7,338.15,25.5,3.52,5.18,1.25,1.33,2.87
fra,1,29,547.03,23.99,-2.63,61.05,3.43,2.61,2.39
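
Each row of perms is a 0/1 inclusion pattern over the variables, so the rows enumerate subsets of columns. A sketch of applying a function to every subset follows; the three-column excerpt of the test data and the summary function (sum of column means) are placeholders for whatever function is actually intended.

```r
# Small excerpt of the test data (first three rows, three variables)
dat <- data.frame(code = c("aut", "bel", "bul"),
                  wav  = c(5, 1, 7),
                  w    = c(10, 12, 10),
                  area = c(83.87, 30.53, 110.91),
                  stringsAsFactors = FALSE)

nvar  <- ncol(dat) - 1
perms <- as.matrix(expand.grid(rep(list(1:0), nvar))[, nvar:1])

vars <- names(dat)[-1]                     # candidate variables

# Apply a placeholder function to the columns selected by each row
res <- apply(perms, 1, function(sel) {
  cols <- vars[sel == 1]
  if (length(cols) == 0) return(NA_real_)  # empty subset: nothing to compute
  sum(colMeans(dat[cols]))
})
print(cbind(perms, res))
```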



[R] Information criteria for kmeans

2007-12-05 Thread Serguei Kaniovski

Hello,

How is, for example, the Schwarz criterion defined for kmeans? It should
be something like:

k <- 2
vars <- 4
nobs <- 100

dat <- rbind(matrix(rnorm(nobs, sd = 0.3), ncol = vars),
             matrix(rnorm(nobs, mean = 1, sd = 0.3), ncol = vars))

colnames(dat) <- paste("var",1:4)

(cl <- kmeans(dat, k))

schwarz <- sum(cl$withinss)+ vars*k*log(nobs)

Thanks for your help,
Serguei



[R] Inserting a subsequence between values of a vector

2007-12-04 Thread Serguei Kaniovski

Hello,

suppose I have a vector:

x <- c(1,1,1,2,2,3,3,3,3,3,4)

How can I generate a vector/sequence in which a fixed number of zeroes (say
3) is inserted between the consecutive values, so I get

1,1,1,0,0,0,2,2,0,0,0,3,3,3,3,3,0,0,0,4
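
One way to do this, sketched with rle(): split x into its runs of equal values, append the zeros to each run, and drop the zeros after the last run. The names nz, r, runs and out are illustrative.

```r
x  <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 4)
nz <- 3                                    # number of zeroes to insert

r    <- rle(x)                             # runs of consecutive equal values
runs <- split(x, rep(seq_along(r$lengths), r$lengths))

# Append nz zeroes to every run, then remove the trailing ones
out <- unlist(lapply(runs, function(v) c(v, rep(0, nz))), use.names = FALSE)
out <- head(out, -nz)
print(out)
```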

thanks a lot,
Serguei


[R] Order observations in a dataframe

2007-11-28 Thread Serguei Kaniovski

Dear All,

Suppose I have the following dataframe:

country;weight;group
bul;10;1
cze;12;1
grc;12;1
hun;12;1
prt;12;1
rom;14;1
fra;29;2
ita;29;2
gbr;29;2
aut;10;3
bel;12;3

The group variable denotes the id-number of a group of countries. How can
I re-label the groups in descending order of their cumulative weight,
which would be:

country;weight;group
fra;29;1
ita;29;1
gbr;29;1
bul;10;2
cze;12;2
grc;12;2
hun;12;2
prt;12;2
rom;14;2
aut;10;3
bel;12;3

The first group has the largest sum of weights, the second group the second
largest, and so on.
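
A minimal sketch of the relabelling: sum the weights per group, rank the groups by descending total, and map the old labels to the ranks. The object names are illustrative, and ties between groups are broken by their original order.

```r
d <- read.table(text = "country;weight;group
bul;10;1
cze;12;1
grc;12;1
hun;12;1
prt;12;1
rom;14;1
fra;29;2
ita;29;2
gbr;29;2
aut;10;3
bel;12;3", sep = ";", header = TRUE, stringsAsFactors = FALSE)

tot    <- sapply(split(d$weight, d$group), sum)   # total weight per group
new_id <- rank(-tot, ties.method = "first")       # 1 = heaviest group

# Look up the new label by the old one, then sort by the new label
d$group <- unname(new_id[as.character(d$group)])
d <- d[order(d$group), ]
print(d)
```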

Thanks for your help,
Serguei Kaniovski




[R] Replacing values job

2007-11-28 Thread Serguei Kaniovski

Hello,

I have two vectors of different lengths which contain the same set of
values:

X <- c(2,6,1,7,4,3,5)
Y <- c(1,1,6,4,6,1,4,1,2,3,6,6,1,2,4,4,5,4,1,7,6,6,4,4,7,1,2)

How can I replace the values in Y with the index (!) of the corresponding
value in X? For example, 2 appears in X in the first coordinate, so all 2’s
in Y should be replaced by 1, etc.
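
The base function match() does this lookup: it returns, for each element of its first argument, the position of its first occurrence in the second. A short sketch with the vectors above:

```r
X <- c(2, 6, 1, 7, 4, 3, 5)
Y <- c(1, 1, 6, 4, 6, 1, 4, 1, 2, 3, 6, 6, 1, 2, 4, 4, 5, 4, 1, 7, 6, 6, 4,
       4, 7, 1, 2)

idx <- match(Y, X)    # position in X of every value of Y
print(idx)
```

Indexing X by the result, X[idx], gives Y back, which is a quick way to check the mapping.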

Thank you for your help,
Serguei




[R] Vectorize a correlation matrix

2007-11-22 Thread Serguei Kaniovski

Hello

I can construct a correlation matrix from an (ordered) vector of
correlation coefficients as follows:

x <- c(0.1,0.2,0.3,0.4,0.5)
n <- length(x)
cmat <- diag(rep(0.5,n))
cmat[lower.tri(cmat,diag=0)] <- x
cmat <- cmat+t(cmat)

But how do I do the reverse operation, i.e. produce x from cmat?
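
The same lower.tri() index that fills the matrix also extracts the coefficients again. The sketch below uses six coefficients and a 4 x 4 matrix, an assumption made so that the vector length equals n(n-1)/2; with five coefficients and n = 5 as above, the ten lower-triangle cells are filled by recycling x.

```r
x <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6)   # 6 = 4 * 3 / 2 coefficients
n <- 4

cmat <- diag(0.5, n)
cmat[lower.tri(cmat)] <- x             # fill the lower triangle column by column
cmat <- cmat + t(cmat)                 # symmetric, with ones on the diagonal

x_back <- cmat[lower.tri(cmat)]        # reverse operation
print(x_back)
```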

Thanks for help,
Serguei Kaniovski


[R] Matrix of dummies from a vector

2007-11-22 Thread Serguei Kaniovski

Hello

From a variable x that defines, say, four classes, I would like to define
the matrix mat of dummy variables indicating the classes, i.e.

x <- c(1,1,1,1,2,2,2,3,3,3,4,4)
mat <- matrix(c(1,0,0,0,
                1,0,0,0,
                1,0,0,0,
                1,0,0,0,
                0,1,0,0,
                0,1,0,0,
                0,1,0,0,
                0,0,1,0,
                0,0,1,0,
                0,0,1,0,
                0,0,0,1,
                0,0,0,1), ncol=4, byrow=T)
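
Two short ways to build such a matrix without typing it out, shown as a sketch: comparing x with its distinct values through outer(), or letting model.matrix() expand a factor without an intercept. The names mat_outer and mat_mm are illustrative.

```r
x <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4)

# Compare every element of x with every class label; TRUE/FALSE becomes 1/0
mat_outer <- 1 * outer(x, sort(unique(x)), "==")

# The same design via a factor; "- 1" removes the intercept column
mat_mm <- model.matrix(~ factor(x) - 1)

print(mat_outer)
```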

Thank you for your help,
Serguei


[R] How to map clusters to a correlation matrix

2007-11-20 Thread Serguei Kaniovski

Dear All,

I have several socio-economic and geographic variables for the 27 EU
countries. I would like to use these data to derive a correlation matrix
between groups of countries (for a different application).

I thought of using kmeans to cluster the groups, and then calibrating
between-group correlations using distances between the centroids, and
within-group correlations using distances in a cluster to its own centroid.
To calibrate is to transform a distance to a (positive) correlation
coefficient using some suitable function. Positive correlations reflect the
strength of common tendencies among the countries.

All the above seems crude to me, especially as you have to choose a
transformation function from a distance to a correlation coefficient. Are
there any better methods to do this?

Thanks in advance,
Serguei


Re: [R] How to map clusters to a correlation matrix

2007-11-20 Thread Serguei Kaniovski


To Walter,

I am building a model of voting in the EU Council of Ministers. A voting
scenario assumes the probabilities of yes votes, possibly different for
each country, and correlation coefficients between them.

The country variables (data) reflect the background characteristics that
are relevant for the likelihood of bloc formation in the Council. I would
like to cluster the countries in what I call probabilistic voting blocs,
and construct a correlation matrix between votes in different blocs and
between votes in each bloc.

I then use the probabilities and correlation coefficients to construct a
joint probability distribution on the set of all conceivable voting
outcomes and thus compute the probabilities of voting outcomes that are of
interest, such as probability of a country casting a decisive vote, etc.

Serguei
