Re: [R] weighted average grouped by variables

2017-11-12 Thread PIKAL Petr
Hi Berend

Yes you are correct. My fault, I did not test it before sending.

Cheers
Petr


Re: [R] weighted average grouped by variables

2017-11-11 Thread Berend Hasselman

> On 9 Nov 2017, at 14:58, PIKAL Petr <petr.pi...@precheza.cz> wrote:
> 
> Hi
> 
> Thanks for working example.
> 
> you could use split/ lapply approach, however it is probably not much better 
> than dplyr method.
> 
> sapply(split(mydf, mydf$type), function(speed, n_vehicles) 
> sum(mydf$speed*mydf$n_vehicles)/sum(mydf$n_vehicles))
> gives you averages
> 

The result of this calculation is:

       car light_duty heavy_duty motorcycle 
  36.54109   36.54109   36.54109   36.54109 

But this doesn't give the same result as the dplyr method, which is:

            date_time       type      vel
1 2017-10-17 13:00:00        car 36.39029
2 2017-10-17 13:00:00 light_duty 38.56522
3 2017-10-17 13:00:00 heavy_duty 37.53333
4 2017-10-17 13:00:00 motorcycle 36.08696

The base R way of getting the result should be modified slightly into

sapply(split(mydf, mydf$type), function(Z)
  sum(Z$speed*Z$n_vehicles)/sum(Z$n_vehicles))

Calculations are done on the elements of the list provided by split.
The result now is:

       car light_duty heavy_duty motorcycle 
  36.39029   38.56522   37.53333   36.08696 

Obviously now the same as the dplyr method.
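
The difference between the two versions comes down to what the anonymous function actually reads. In the first one the arguments are never used and the body refers to the full mydf, so every group gets the overall weighted mean; in the second the body works on the piece Z handed over by split(). A minimal sketch of that contrast, assuming a cut-down copy of the example data as it appears in this archive (direction column dropped, a fixed UTC time zone; the names pieces, overall and per_type are only for illustration), with weighted.mean() doing the arithmetic:

```r
# Cut-down copy of the example data (assumption: values as shown in the archive)
mydf <- data.frame(
  date_time  = as.POSIXct(rep(1508238000, 7), origin = "1970-01-01", tz = "UTC"),
  type       = factor(c("car", "light_duty", "heavy_duty", "motorcycle",
                        "car", "light_duty", "heavy_duty"),
                      levels = c("car", "light_duty", "heavy_duty", "motorcycle")),
  speed      = c(41.1029082774049, 40.3, 40.3157894736842, 36.0869565217391,
                 33.4065155807365, 37.6, 35.5),
  n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)
)

pieces <- split(mydf, mydf$type)

# Body ignores its argument and reads the whole data frame:
# every element is the same overall weighted mean.
overall <- sapply(pieces, function(Z) weighted.mean(mydf$speed, mydf$n_vehicles))

# Body uses the piece it is given: one weighted mean per type.
per_type <- sapply(pieces, function(Z) weighted.mean(Z$speed, Z$n_vehicles))

print(overall)
print(per_type)
```

Some speeds in the archived copy of the data look truncated (for example 40.3 and 37.6), so the light_duty figure from this sketch may differ slightly from the 38.56522 quoted above; car and motorcycle are not affected.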

Berend Hasselman


Re: [R] weighted average grouped by variables

2017-11-09 Thread Thierry Onkelinx
Dear Massimo,

It seems straightforward to use weighted.mean() in a dplyr context

library(dplyr)
mydf %>%
  group_by(date_time, type) %>%
  summarise(vel = weighted.mean(speed, n_vehicles))
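
To also carry the total vehicle count into the result, as in the mydf_final layout of the original question, the same summarise() call can hold both columns. A sketch under a few assumptions: a cut-down copy of the example data as it appears in this archive, the column name weighted_avg_speed taken from the question, and .groups = "drop" (available in dplyr 1.0.0 and later) to return an ungrouped result. The weighted mean is computed before n_vehicles is overwritten, because summarise() lets later expressions see columns created by earlier ones.

```r
library(dplyr)

# Cut-down copy of the example data (assumption: values as shown in the archive)
mydf <- data.frame(
  date_time  = as.POSIXct(rep(1508238000, 7), origin = "1970-01-01", tz = "UTC"),
  type       = factor(c("car", "light_duty", "heavy_duty", "motorcycle",
                        "car", "light_duty", "heavy_duty"),
                      levels = c("car", "light_duty", "heavy_duty", "motorcycle")),
  speed      = c(41.1029082774049, 40.3, 40.3157894736842, 36.0869565217391,
                 33.4065155807365, 37.6, 35.5),
  n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)
)

res <- mydf %>%
  group_by(date_time, type) %>%
  summarise(
    weighted_avg_speed = weighted.mean(speed, n_vehicles),  # uses the per-row counts
    n_vehicles         = sum(n_vehicles),                   # then replaces them by the group total
    .groups = "drop"
  )

print(res)
```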

Best regards,



ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkel...@inbo.be
Kliniekstraat 25, B-1070 Brussel
www.inbo.be


Re: [R] weighted average grouped by variables

2017-11-09 Thread Massimo Bressan
hi thierry 

thanks for your reply 

yes, you are right, your solution is more straightforward 

best 



-- 

 
Massimo Bressan 

ARPAV 
Agenzia Regionale per la Prevenzione e 
Protezione Ambientale del Veneto 

Dipartimento Provinciale di Treviso 
Via Santa Barbara, 5/a 
31100 Treviso, Italy 

tel: +39 0422 558545 
fax: +39 0422 558516 
e-mail: massimo.bres...@arpa.veneto.it 
 


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] weighted average grouped by variables

2017-11-09 Thread PIKAL Petr
Hi

Thanks for working example.

You could use a split/lapply approach; however, it is probably not much better
than the dplyr method.

sapply(split(mydf, mydf$type), function(speed, n_vehicles) 
sum(mydf$speed*mydf$n_vehicles)/sum(mydf$n_vehicles))
gives you averages

aggregate(mydf$n_vehicles, list(mydf$type), sum)$x
gives you sums

Cheers
Petr
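
The averages and the sums can also be produced together, grouped by both date_time and type as the question asks. A base R sketch, assuming a cut-down copy of the example data as it appears in this archive; the names groups, rows and mydf_out are illustrative, and the function body works on the piece g it receives from split() rather than on the full mydf:

```r
# Cut-down copy of the example data (assumption: values as shown in the archive)
mydf <- data.frame(
  date_time  = as.POSIXct(rep(1508238000, 7), origin = "1970-01-01", tz = "UTC"),
  type       = factor(c("car", "light_duty", "heavy_duty", "motorcycle",
                        "car", "light_duty", "heavy_duty"),
                      levels = c("car", "light_duty", "heavy_duty", "motorcycle")),
  speed      = c(41.1029082774049, 40.3, 40.3157894736842, 36.0869565217391,
                 33.4065155807365, 37.6, 35.5),
  n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)
)

# One piece per observed (date_time, type) combination
groups <- split(mydf, list(mydf$date_time, mydf$type), drop = TRUE)

# One summary row per piece
rows <- lapply(groups, function(g) {
  data.frame(
    date_time          = g$date_time[1],
    type               = g$type[1],
    weighted_avg_speed = weighted.mean(g$speed, g$n_vehicles),
    n_vehicles         = sum(g$n_vehicles)
  )
})

mydf_out <- do.call(rbind, rows)
rownames(mydf_out) <- NULL
print(mydf_out)
```

A type that occurs in only one direction, such as motorcycle, simply forms a one-row piece, so its weighted mean is its own speed.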


Re: [R] weighted average grouped by variables

2017-11-09 Thread Rui Barradas
Sorry, I messed up. Only checked the final result after sending the 
previous mail. The solution is wrong.


Rui Barradas
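
One likely reason the ave() call goes wrong: its signature is ave(x, ..., FUN = mean), and everything passed through ... is treated as a grouping variable, so w = n_vehicles ends up as an extra grouping factor instead of reaching weighted.mean(). Since every n_vehicles value in the example is distinct, each row becomes its own group and the speeds come back unchanged. A sketch of a workaround, assuming a cut-down copy of the example data as it appears in this archive: let ave() group the row indices and look up speeds and weights inside FUN (the names as_posted and per_row are illustrative).

```r
# Cut-down copy of the example data (assumption: values as shown in the archive)
mydf <- data.frame(
  date_time  = as.POSIXct(rep(1508238000, 7), origin = "1970-01-01", tz = "UTC"),
  type       = factor(c("car", "light_duty", "heavy_duty", "motorcycle",
                        "car", "light_duty", "heavy_duty"),
                      levels = c("car", "light_duty", "heavy_duty", "motorcycle")),
  speed      = c(41.1029082774049, 40.3, 40.3157894736842, 36.0869565217391,
                 33.4065155807365, 37.6, 35.5),
  n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)
)

# As posted: 'w' is consumed as a grouping variable, so nothing is averaged.
as_posted <- with(mydf, ave(speed, date_time, type, FUN = weighted.mean, w = n_vehicles))

# Index-based variant: ave() groups the row numbers, FUN does the lookup.
per_row <- with(mydf, ave(seq_along(speed), date_time, type,
                          FUN = function(i) weighted.mean(speed[i], n_vehicles[i])))

print(cbind(mydf["type"], as_posted, per_row))
```

Even in the working variant, ave() returns one value per input row (seven here), not one row per group, so it suits adding a column to mydf more than building a summary table.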


Re: [R] weighted average grouped by variables

2017-11-09 Thread Rui Barradas

Hello,

Using base R only, the following seems to do what you want.

with(mydf, ave(speed, date_time, type, FUN = weighted.mean, w = n_vehicles))


Hope this helps,

Rui Barradas


Re: [R] weighted average grouped by variables

2017-11-09 Thread Massimo Bressan
Hello 

an update about my question: I worked out the following solution (with the 
package "dplyr") 

library(dplyr) 

mydf %>%
  mutate(speed_vehicles = n_vehicles * speed) %>%
  group_by(date_time, type) %>%
  summarise(
    sum_n_times_speed = sum(speed_vehicles),
    n_vehicles = sum(n_vehicles),
    vel = sum(speed_vehicles) / sum(n_vehicles)
  )


In fact I was hoping to manage everything in a "one-go": i.e. without the need 
to create the "intermediate" variable called "speed_vehicles" and with the use 
of the function weighted.mean() 

any hints for a different approach much appreciated 

thanks 




[R] weighted average grouped by variables

2017-11-09 Thread Massimo Bressan
hi all 

I have this dataframe (created as a reproducible example) 

mydf<-structure(list(date_time = structure(c(1508238000, 1508238000, 
1508238000, 1508238000, 1508238000, 1508238000, 1508238000), class = 
c("POSIXct", "POSIXt"), tzone = ""), 
direction = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A", "B"), 
class = "factor"), 
type = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L), .Label = c("car", "light_duty", 
"heavy_duty", "motorcycle"), class = "factor"), 
avg_speed = c(41.1029082774049, 40.3, 40.3157894736842, 
36.0869565217391, 33.4065155807365, 37.6, 35.5), 
n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)), 
.Names = c("date_time", "direction", "type", "speed", "n_vehicles"), 
row.names = c(NA, -7L), 
class = "data.frame") 

mydf 

and I need to get to this final result 

mydf_final<-structure(list(date_time = structure(c(1508238000, 1508238000, 
1508238000, 1508238000), class = c("POSIXct", "POSIXt"), tzone = ""), 
type = structure(c(1L, 2L, 3L, 4L), .Label = c("car", "light_duty", 
"heavy_duty", "motorcycle"), class = "factor"), 
weighted_avg_speed = c(36.39029, 38.56521, 37.5, 36.08696), 
n_vehicles = c(1153L,69L,45L,23L)), 
.Names = c("date_time", "type", "weighted_avg_speed", "n_vehicles"), 
row.names = c(NA, -4L), 
class = "data.frame") 

mydf_final 


my question:
how can I compute a weighted mean, i.e. "weighted_avg_speed",
from "speed" (the values whose weighted mean is to be computed) and
"n_vehicles" (the weights),
grouped by "date_time" and "type"?

note the complication of the case "motorcycle" (not present in both
directions)

any help for that? 

thank you 

max 





[R] weighted average

2013-07-22 Thread Robert Lynch
I am trying to compute GPA from class grades (which have been normalized).
I have, for example, the following matrix

Master =
SID    B2A    B2B    B2C    C2A    C2B    C2C   C118A  C118B  C118C
001   0.01   0.5   -0.4    1.2   -1.8    0.3   -0.3    0.4    0.5
002   0.01   0.5   -0.4    0.5   -0.4    1.2   -1.8    0.3   -0.3
003   0.04   0.05   0.5   -0.4   -0.5    0.4   -1.2    1.8    0.3
etc

Where each column has a zero mean and a standard deviation of 1.  I want to
calculate a weighted average for each row (student ID) that takes into
account that
B2A, C118A, C118B, and C118C are all 4 unit classes, and the rest, B2B,
B2C, C2A, C2B, C2C are 5 unit classes

I have tried
Units <- c(4,5,5,5,5,5,4,4,4)
Master$zGPA <- weighted.means(Master[,2:10],Units)

But that gets me one number and not a vector.



Re: [R] weighted average

2013-07-22 Thread David Winsemius


Perhaps something along the lines of

 Master$zGPA <- apply(Master[,2:10], 1, weighted.means, weights=Units)

(Untested in absence of data or name of package from which function is loaded.)

 ?weighted.means
No documentation for ‘weighted.means’ in specified packages and libraries:
you could try ‘??weighted.means’

--- 
David Winsemius
Alameda, CA, USA


Re: [R] weighted average

2013-07-22 Thread arun
Hi,
Maybe this helps:
Master <- read.table(text="
SID    B2A    B2B    B2C    C2A    C2B    C2C    C118A    C118B    C118C
001    0.01   0.5    -0.4   1.2    -1.8   0.3    -0.3     0.4      0.5
002    0.01   0.5    -0.4   0.5    -0.4   1.2    -1.8     0.3      -0.3
003    0.04   0.05   0.5    -0.4   -0.5   0.4    -1.2     1.8      0.3
",sep="",header=TRUE)
Units <- c(4,5,5,5,5,5,4,4,4)   # weights, as given in the original post
library(matrixStats)

Master$zGPA <- rowWeightedMeans(as.matrix(Master[,-1]),Units)
Master
#  SID  B2A  B2B  B2C  C2A  C2B C2C C118A C118B C118C         zGPA
#1   1 0.01 0.50 -0.4  1.2 -1.8 0.3  -0.3   0.4   0.5  0.035121951
#2   2 0.01 0.50 -0.4  0.5 -0.4 1.2  -1.8   0.3  -0.3 -0.003902439
#3   3 0.04 0.05  0.5 -0.4 -0.5 0.4  -1.2   1.8   0.3  0.097804878
A.K.
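
For readers who would rather not load matrixStats, the same row-wise
calculation can be sketched in base R. This is not part of the reply above:
the object name `scores` is made up for illustration, and the data are the
three example rows from the original post. The idea is that each row's
weighted mean is the matrix product of the grades with the weights, divided
by the total weight.

```r
# Rebuild the three example rows from the thread (SID kept as row names).
scores <- matrix(
  c(0.01, 0.50, -0.4,  1.2, -1.8, 0.3, -0.3, 0.4,  0.5,
    0.01, 0.50, -0.4,  0.5, -0.4, 1.2, -1.8, 0.3, -0.3,
    0.04, 0.05,  0.5, -0.4, -0.5, 0.4, -1.2, 1.8,  0.3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("001", "002", "003"),
                  c("B2A", "B2B", "B2C", "C2A", "C2B", "C2C",
                    "C118A", "C118B", "C118C")))

Units <- c(4, 5, 5, 5, 5, 5, 4, 4, 4)  # course units, in column order

# Weighted sum of each row, divided by the total weight.
zGPA <- drop(scores %*% Units) / sum(Units)
print(zGPA)
```

Note that this shortcut does not skip missing grades: an NA anywhere in a
row makes that row's result NA.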


Re: [R] weighted average

2013-07-22 Thread arun


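In that case:
Master1 <- Master

# Use only the nine grade columns (2:10) so the length matches Units,
# even if Master already contains a zGPA column.
Master1$zGPA <- sapply(seq_len(nrow(Master1)), function(i)
  weighted.mean(unlist(Master1[i, 2:10]), Units))

Master1$zGPA
#[1]  0.035121951 -0.003902439  0.097804878
all.equal(Master,Master1)
#[1] TRUE

A.K.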



From: Robert Lynch robert.b.ly...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Monday, July 22, 2013 6:35 PM
Subject: Re: [R] weighted average



weighted.mean is the function.  My apologies for appending an "s".

On Mon, Jul 22, 2013 at 3:31 PM, arun smartpink...@yahoo.com wrote:

> Hi,
>
> I couldn't find the function `weighted.means` using ?weighted.means or
> ??weighted.means.  It would be useful to provide the library information.


Re: [R] weighted average

2013-07-22 Thread David Winsemius

On Jul 22, 2013, at 3:23 PM, David Winsemius wrote:

 
> Perhaps something along the lines of
>
>   Master$zGPA <- sapply(Master[,2:10], weighted.means, weights=Units)
>
> (Untested in absence of data or name of package from which function is
> loaded.)
>
> > ?weighted.means
> No documentation for ‘weighted.means’ in specified packages and libraries:
> you could try ‘??weighted.means’

If you are using weighted.mean and want this applied by row (one row per
student, I guess), then probably this would be better:

Master$zGPA <- apply(Master[,2:10], 1, weighted.mean, w=Units)

-- 
David.
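
The row-wise apply() suggestion above can be tried on the three example rows
from the thread. What follows is only a sketch under stated assumptions: the
data frame is retyped by hand from the original post, and the second half
(a student with a missing grade, stored in the made-up object `with_gap`) is
a hypothetical case that nobody in the thread asked about.

```r
# Example rows from the thread as a data frame (SID plus nine grade columns).
Master <- data.frame(
  SID   = c("001", "002", "003"),
  B2A   = c(0.01, 0.01, 0.04),
  B2B   = c(0.5, 0.5, 0.05),
  B2C   = c(-0.4, -0.4, 0.5),
  C2A   = c(1.2, 0.5, -0.4),
  C2B   = c(-1.8, -0.4, -0.5),
  C2C   = c(0.3, 1.2, 0.4),
  C118A = c(-0.3, -1.8, -1.2),
  C118B = c(0.4, 0.3, 1.8),
  C118C = c(0.5, -0.3, 0.3),
  stringsAsFactors = FALSE
)
Units <- c(4, 5, 5, 5, 5, 5, 4, 4, 4)

# MARGIN = 1 hands each row to weighted.mean(); w = Units is passed through.
Master$zGPA <- apply(Master[, 2:10], 1, weighted.mean, w = Units)
print(Master$zGPA)

# Hypothetical case: student 001 has no grade for B2A.
# na.rm = TRUE drops that grade together with its weight.
with_gap <- Master[, 2:10]
with_gap[1, "B2A"] <- NA
gap_gpa <- apply(with_gap, 1, weighted.mean, w = Units, na.rm = TRUE)
print(gap_gpa[1])
```

Selecting columns 2:10 explicitly keeps SID (and any zGPA column added
earlier) out of the calculation, so the row length always matches Units.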

 

David Winsemius
Alameda, CA, USA


[R] Weighted Average on More than One Variable in Data Frame

2011-09-21 Thread StellathePug
Dear R Users,
I have looked for a solution to the following problem and I have not been
able to find it on the archive, through Google or in the R documentation.

I have a data frame, say df, which has 4 variables: one that I would like
to use as a grouping variable (g), another that I would like to use for my
weights (w), and two more (x1 and x2) for which I would like to compute the
weighted average by group.

df <- data.frame(x1 = c(15, 12,  3, 10, 10, 14, 12),
                 x2 = c(10, 11, 16,  9,  7, 17, 18),
                 g  = c( 1,  1,  1,  2,  2,  3,  3),
                 w  = c( 2,  3,  1,  5,  5,  2,  5))

wx1 <- sapply(split(df, df$g), function(x){weighted.mean(x$x1, x$w)})
wx2 <- sapply(split(df, df$g), function(x){weighted.mean(x$x2, x$w)})

The above code works, the result is:
> wx1
       1        2        3
    11.5     10.0 12.57143
> wx2
       1        2        3
    11.5      8.0 17.71429

But is there not a more elegant way of acting on x1 and x2 simultaneously?
Something along the lines of

wdf <- sapply(split(df, df$g), function(x){weighted.mean(df, x$w)})

which is wrong since df has two columns, while w only has one. I suppose
one could write a loop, but that strikes me as being highly inefficient.

Thank you very much for your help!
Rita

--
View this message in context: 
http://r.789695.n4.nabble.com/Weighted-Average-on-More-than-One-Variable-in-Data-Frame-tp3830922p3830922.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Weighted Average on More than One Variable in Data Frame

2011-09-21 Thread Jean V Adams
Try this

sapply(split(df, df$g), function(x) apply(x[, 1:2], 2, weighted.mean, x$w))

Jean
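
As a rough illustration of the suggestion above, here is a self-contained
variant run on the data from the question. It is a sketch, not Jean's exact
code: it selects the value columns by name through a made-up vector
`value_cols` and uses an inner sapply() over columns in place of apply(),
then transposes the result so that groups are rows.

```r
df <- data.frame(x1 = c(15, 12,  3, 10, 10, 14, 12),
                 x2 = c(10, 11, 16,  9,  7, 17, 18),
                 g  = c( 1,  1,  1,  2,  2,  3,  3),
                 w  = c( 2,  3,  1,  5,  5,  2,  5))

value_cols <- c("x1", "x2")  # columns to average (illustrative name)

# One weighted mean per value column, within each group of g.
by_group <- sapply(split(df, df$g), function(d)
  sapply(d[value_cols], weighted.mean, w = d$w))

# Transpose so that groups are rows and variables are columns.
wdf <- t(by_group)
print(wdf)
```

Selecting by name rather than by position (1:2) keeps the code working if
the column order of the data frame changes.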


Re: [R] Weighted Average on More than One Variable in Data Frame

2011-09-21 Thread StellathePug
Thanks Jean, that worked perfectly!

--
View this message in context: 
http://r.789695.n4.nabble.com/Weighted-Average-on-More-than-One-Variable-in-Data-Frame-tp3830922p3831611.html
Sent from the R help mailing list archive at Nabble.com.


[R] Weighted Average application on Summary Dataset

2010-06-13 Thread RaoulD

Hi,

I have 2 huge datasets, May and Jun; a minuscule sample of one is given
below. I am trying to do 2 things with these datasets. First, I need to verify
whether the weighted average of variable A for a Reason in Jun is the same as
or different from that for May. To do this, I am first computing the weighted
average for each SubReason using a function I wrote.

Where I need help is in applying the function to both datasets to arrive at
weighted averages for each SubReason. Second, I would like to know the best
way to compare the weighted average for a SubReason across the 2 datasets, so
that I can state whether there is a difference: t-test, ANOVA?
Would greatly appreciate any help!! The function I wrote for weighted
average computation is given below the dataset.

One of the datasets:

Reason  SubReason  A     N
A       SR1        1115  29
B       SR2        734   24
B       SR2        1054  31
A       Sr1        600   43
A       SR3        1033  60
A       Sr1        1163  30
B       SR4        732   43
B       SR4        988   70
A       SR3        569   25
B       SR4        1073  65

Output I require:
R   SR    WA_A          N (Sum of N)
A   SR1   912.0098      102
    SR3   896.5294118   85
B   SR2   914.3636364   55
    SR4   957.1966292   178
(WA_A = weighted average of A, using N as weights)

# FUNCTION TO CALCULATE THE WEIGHTED AVERAGE FOR A WEIGHTED BY N
WA <- function(A, N) {
  sp_A  <- c(A %*% N)   # sum of products A[i] * N[i]
  sum_N <- sum(N)       # total weight
  sp_A / sum_N          # weighted average
}

Thanks in advance!
Raoul
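
One possible way to apply the WA function per SubReason is to split the data
frame into groups and combine the per-group results. This is a sketch only,
under stated assumptions: the data frame name `may` and the helper
`summarise_wa` are invented for illustration, the rows are the ten sample
rows from the post, and "Sr1" is assumed to be the same SubReason as "SR1"
(the required output in the post only adds up if it is).

```r
# Weighted average of A using N as weights (same calculation as in the post).
WA <- function(A, N) {
  c(A %*% N) / sum(N)
}

# The sample rows from the post; "may" is an illustrative name.
may <- data.frame(
  Reason    = c("A", "B", "B", "A", "A", "A", "B", "B", "A", "B"),
  SubReason = c("SR1", "SR2", "SR2", "Sr1", "SR3",
                "Sr1", "SR4", "SR4", "SR3", "SR4"),
  A         = c(1115, 734, 1054, 600, 1033, 1163, 732, 988, 569, 1073),
  N         = c(29, 24, 31, 43, 60, 30, 43, 70, 25, 65),
  stringsAsFactors = FALSE
)

# Assumption: "Sr1" and "SR1" are the same SubReason, so normalise the case.
may$SubReason <- toupper(may$SubReason)

# Hypothetical helper: one row per Reason/SubReason with WA_A and the sum of N.
summarise_wa <- function(d) {
  groups <- split(d, list(d$Reason, d$SubReason), drop = TRUE)
  out <- do.call(rbind, lapply(groups, function(g) {
    data.frame(Reason = g$Reason[1], SubReason = g$SubReason[1],
               WA_A = WA(g$A, g$N), N = sum(g$N),
               stringsAsFactors = FALSE)
  }))
  rownames(out) <- NULL
  out[order(out$Reason, out$SubReason), ]
}

may_summary <- summarise_wa(may)
print(may_summary)
```

The same helper could be called on the Jun data frame and the two summaries
merged on Reason and SubReason for a side-by-side comparison. Which
statistical test is appropriate for that comparison is left open here, since
it depends on data not shown in the post.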




-- 
View this message in context: 
http://r.789695.n4.nabble.com/Weighted-Average-application-on-Summary-Dataset-tp2253239p2253239.html
Sent from the R help mailing list archive at Nabble.com.
