Re: [R] Aggregation across two variables in data.table

2017-12-14 Thread PIKAL Petr
Hi

Are you aware of the function aggregate()?

result <- with(data_tmp, aggregate(Theta, list(Marital, Education), mean))

should do the trick.
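
If the goal is every pair of demographics rather than a single pair, the same aggregate() call could probably be wrapped in a loop over combn(). What follows is only a minimal sketch: data_tmp here is a small hypothetical stand-in built from the rows quoted in the thread, only three of the six demographics are used, and the names demo_names, pairs and means2 are illustrative.

```r
# Hypothetical stand-in for data_tmp (rows taken from the example in the thread)
data_tmp <- data.frame(
  Marital   = c("Divorced", "Married", "Married", "Never married",
                "Never married", "Separated"),
  Education = c("Associate degree", "Bachelor degree", "Bachelor degree",
                "Bachelor degree",
                "Some college, less than college graduate",
                "Associate degree"),
  Income    = c("70K+", "10-15K", "30-40K", "70K+", "30-40K", "10-15K"),
  Theta     = c(9.14, 7.345036, 7.974937, 7.733053, 7.648642, 7.496411)
)

# Subset of the demographics, for brevity
demo_names <- c("Marital", "Education", "Income")

# combn() enumerates every unordered pair of column names
pairs <- combn(demo_names, 2, simplify = FALSE)

# One aggregate() call per pair; reformulate() builds Theta ~ A + B
means2 <- lapply(pairs, function(pair) {
  aggregate(reformulate(pair, response = "Theta"), data = data_tmp, FUN = mean)
})
names(means2) <- vapply(pairs, paste, character(1), collapse = " x ")
```

Each element of means2 is a data frame with the two grouping columns and the group mean in Theta, e.g. means2[["Marital x Education"]]. The formula interface is used so that the grouping columns keep their names instead of Group.1 and Group.2.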

Cheers
Petr




[R] Aggregation across two variables in data.table

2017-12-13 Thread Michael Haenlein
Dear all,

I have a data.table that includes a series of demographic variables for a
set of respondents plus a dependent variable (Theta). For example:

    Age  Education                                 Marital        Familysize  Income  Housing                 Theta
1:  50   Associate degree                          Divorced       4           70K+    Owned with mortgage     9.14
2:  65   Bachelor degree                           Married        1           10-15K  Owned without mortgage  7.345036
3:  33   Bachelor degree                           Married        2           30-40K  Owned with mortgage     7.974937
4:  69   Bachelor degree                           Never married  1           70K+    Owned with mortgage     7.733053
5:  54   Some college, less than college graduate  Never married  3           30-40K  Rented                  7.648642
6:  35   Associate degree                          Separated      2           10-15K  Rented                  7.496411

My objective is to calculate the average of Theta for every pair of
demographics.

For a single demographic this is straightforward:

Demo_names <- c("Age", "Education", "Marital", "Familysize", "Income",
                "Housing")
# Pre-allocate one list slot per demographic
means1 <- vector("list", length(Demo_names))
for (i in seq_along(Demo_names)) {
  Demo_tmp <- Demo_names[i]
  # by= takes the column name as a character string
  means1[[i]] <- data_tmp[, list(mean(Theta)), by = Demo_tmp]
}

Is there an easy way to extend this logic to more than one variable? I know
how to do this manually, e.g.,
data_tmp[, list(mean(Theta)), by = list(Marital, Education)]

But I don't know how to integrate this into a loop.
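
One possible way to put the two-variable call into a loop is to enumerate the pairs with combn() and pass each pair to by= as a character vector, just as the single-variable loop passes one name. This is only a minimal sketch: data_tmp below is a small hypothetical stand-in built from the example rows, only three demographics are used, and the names pairs, means2 and mean_Theta are illustrative.

```r
library(data.table)

# Hypothetical stand-in for data_tmp (rows taken from the example above)
data_tmp <- data.table(
  Marital   = c("Divorced", "Married", "Married", "Never married",
                "Never married", "Separated"),
  Education = c("Associate degree", "Bachelor degree", "Bachelor degree",
                "Bachelor degree",
                "Some college, less than college graduate",
                "Associate degree"),
  Income    = c("70K+", "10-15K", "30-40K", "70K+", "30-40K", "10-15K"),
  Theta     = c(9.14, 7.345036, 7.974937, 7.733053, 7.648642, 7.496411)
)

# Subset of the demographics, for brevity
Demo_names <- c("Marital", "Education", "Income")

# Every unordered pair of column names, as a list of character vectors
pairs <- combn(Demo_names, 2, simplify = FALSE)

# One grouped mean per pair; by= receives the pair as a character vector
means2 <- lapply(pairs, function(pair) {
  data_tmp[, .(mean_Theta = mean(Theta)), by = pair]
})
names(means2) <- vapply(pairs, paste, character(1), collapse = ".")
```

With all six demographics, combn() would produce 15 pairs, and combn(Demo_names, 3, simplify = FALSE) would extend the same idea to triples.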

Thanks,

Michael

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.