Re: [R] About data manipulation

2016-11-26 Thread jim holtman
Just assign the result to an object:

x <- DT ...
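
As a minimal sketch of what that assignment could look like, assuming the dplyr pipeline suggested earlier in this thread: `DT` below is a small stand-in rebuilt from the first sample rows of the original post, and the names `x` and `out_file` are only illustrative. The `.groups = "drop"` argument is an addition that assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Stand-in data rebuilt by hand from the first rows of the original post
DT <- data.frame(
  year  = c(2000, 2001, 2002, 2002),
  month = c(1, 1, 2, 2),
  total = c(98, 100, 99, 80),
  id    = c("GA", "GA", "GA", "GB"),
  note  = c(1, 1, 1, 1)
)

# Assigning the pipeline to `x` stores the result instead of printing it
x <- DT %>%
  group_by(month, id, note) %>%
  summarise(avg = mean(total), .groups = "drop")

# Optionally write the stored result to disk; a temporary file is used here
out_file <- tempfile(fileext = ".csv")
write.csv(x, out_file, row.names = FALSE)
```

Typing `x` at the console afterwards prints the stored result again, and `read.csv(out_file)` reads the saved copy back in.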


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Sun, Nov 27, 2016 at 2:03 AM, lily li  wrote:

> Thanks Jim, this method is very convenient and is just what I want. How can
> I save the resulting data frame? It was printed directly to the console.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About data manipulation

2016-11-26 Thread lily li
Thanks Jim, this method is very convenient and is just what I want. How can
I save the resulting data frame? It was printed directly to the console.



[R] Partial Fraction Decomposition

2016-11-26 Thread Tom Mosca
Hello Folks,

As a neophyte R user I frequently have questions, and I'm sorry to bother
experienced users with what may appear to be trivial questions to which I
should be able to find answers without help.  I've searched everywhere I know
to look, and can't find any reference to this question.  Perhaps I just haven't
guessed a correct keyword for my searches.  I apologize in advance.

Given a rational expression P/Q, with P and Q being relatively prime
polynomials and with Q factored, is there an R function or package that will
return the partial fraction decomposition of P/Q?

For example:
Given (3x^3+x^2-8x+6)/((x^2)(x-1)^2)
Return 4/x + 6/x^2 - 1/(x-1) + 2/(x-1)^2
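
The thread does not name a package for this, so the following is only a hedged base-R sketch of one way to check such a decomposition numerically. Because Q is already factored, the form A/x + B/x^2 + C/(x-1) + D/(x-1)^2 is known, and the four unknowns can be found by evaluating both sides at four sample points and solving a linear system. The names `P`, `Q`, `pts`, `basis`, and `coefs` are illustrative; this is not a symbolic routine.

```r
# Numerator P(x) and the factored denominator Q(x) = x^2 (x - 1)^2
P <- function(x) 3 * x^3 + x^2 - 8 * x + 6
Q <- function(x) x^2 * (x - 1)^2

# Four sample points that avoid the poles at x = 0 and x = 1
pts <- c(-1, 2, 3, 4)

# Each column is one partial-fraction term evaluated at the sample points
basis <- cbind(
  1 / pts,
  1 / pts^2,
  1 / (pts - 1),
  1 / (pts - 1)^2
)

# Solve basis %*% coefs = P/Q for the unknown numerators A, B, C, D
coefs <- solve(basis, P(pts) / Q(pts))
round(coefs, 6)
```

Up to floating-point error, the solved values should match the numerators 4, 6, -1, and 2 in the example above.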

Thank you, Tom


[R] Merra2 files

2016-11-26 Thread Alemu Tadesse
Dear R users,

I am wondering whether anyone has a script to download and read MERRA-2 files.

I would really appreciate your help.

Best,

Alemu



Re: [R] About data manipulation

2016-11-26 Thread jim holtman
You did not provide any data, but I will take a stab at it using the
"dplyr" package:

library(dplyr)
DT %>%
  group_by(month, id, note) %>%
  summarise(avg = mean(total))
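
To make the suggestion concrete, here is a sketch that runs the same pipeline on the five sample rows from the original post. The data frame is rebuilt by hand and is called `DT` to match the code above (the post calls it `DF`); the name `res` and the `.groups = "drop"` argument are additions, and the latter assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# The five sample rows shown in the original post, typed in by hand
DT <- data.frame(
  year  = c(2000, 2001, 2002, 2002, 2012),
  month = c(1, 1, 2, 2, 1),
  total = c(98, 100, 99, 80, 78),
  id    = c("GA", "GA", "GA", "GB", "GA"),
  note  = c(1, 1, 1, 1, 2)
)

# One mean per (month, id, note); `year` is not a grouping column,
# so it does not appear in the result
res <- DT %>%
  group_by(month, id, note) %>%
  summarise(avg = mean(total), .groups = "drop")

print(res)
```

Grouping by `note` rather than `year` is what collapses the individual years into the time slices described in the question.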



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] About data manipulation

2016-11-26 Thread Bert Gunter
A reproducible example was not provided, but I think what is wanted is
either ?tapply or ?ave; e.g.

within(DF, means <- ave(total, note, month, FUN = mean))
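
A small sketch of how the two functions differ, using the five sample rows from the original post rebuilt by hand. Note one deliberate change from the line above: `id` is added as a grouping variable here, on the assumption that separate means for GA and GB are wanted. The names `DF2` and `tab` are illustrative.

```r
# The five sample rows shown in the original post, typed in by hand
DF <- data.frame(
  year  = c(2000, 2001, 2002, 2002, 2012),
  month = c(1, 1, 2, 2, 1),
  total = c(98, 100, 99, 80, 78),
  id    = c("GA", "GA", "GA", "GB", "GA"),
  note  = c(1, 1, 1, 1, 2)
)

# ave() returns one value per input row, so each group mean is repeated
# on every row of its group and the data frame keeps all its rows
DF2 <- within(DF, means <- ave(total, note, month, id, FUN = mean))

# tapply() returns one cell per combination of the grouping factors;
# combinations with no data are NA
tab <- tapply(DF$total,
              list(note = DF$note, month = DF$month, id = DF$id),
              mean)

DF2
tab["1", "1", "GA"]
```

`ave()` suits the case where the mean should sit alongside the original rows; `tapply()` or `aggregate()` suits the case where one row or cell per group is wanted.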


Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )




Re: [R] About data manipulation

2016-11-26 Thread P Tennant

Hi,

It may help that:

aggregate(DF$total, list(DF$note, DF$id, DF$month), mean)

should give you means broken down by time slice (note), id and month. 
You could then subset means for GA or GB from the aggregated dataframe.
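
As a sketch of those two steps on the five sample rows from the original post (rebuilt by hand): the grouping list is given names here so that the result has readable column names, and the names `agg`, `avg`, and `agg_GA` are illustrative additions.

```r
# The five sample rows shown in the original post, typed in by hand
DF <- data.frame(
  year  = c(2000, 2001, 2002, 2002, 2012),
  month = c(1, 1, 2, 2, 1),
  total = c(98, 100, 99, 80, 78),
  id    = c("GA", "GA", "GA", "GB", "GA"),
  note  = c(1, 1, 1, 1, 2)
)

# Named list elements become the grouping column names in the result
agg <- aggregate(DF$total,
                 list(note = DF$note, id = DF$id, month = DF$month),
                 mean)

# The aggregated value column is called "x" by default; rename it
names(agg)[names(agg) == "x"] <- "avg"

# Keep only the rows for group GA
agg_GA <- subset(agg, id == "GA")
agg_GA
```

The formula interface, `aggregate(total ~ note + id + month, data = DF, FUN = mean)`, is an equivalent way to write the first step.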


Philip



Re: [R-es] Two small, almost identical pieces of code and only the first one works

2016-11-26 Thread Carlos J. Gil Bellosta
Hi, how are you?

I reply below:

On 25 November 2016 at 10:21, Olivier Nuñez  wrote:

> I think the "by" is unnecessary, or did I miss something?


Well, the original code had an

all(envio == T)

inside a "by" group. As I understand it, the logic that followed applied only
if all the shipments for a given caso/empresa were T. In the example shown it
made no difference (an accident of the data?); but in the general case, as I
understand it, it would.
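
A hedged sketch of that point, using hypothetical data in which one group mixes TRUE and FALSE shipments (the example in the thread never does). The column names follow the thread; the values and the names `row_wise`, `grp`, `all_true`, and `group_wise` are made up for illustration.

```r
library(data.table)

# Hypothetical data: group (1, E1) mixes TRUE and FALSE shipments
DT <- data.table(
  caso    = c(1, 1, 1, 2, 2),
  empresa = c("E1", "E1", "E1", "E2", "E2"),
  coche   = c("A", "B", "U", "W", "B"),
  envio   = c("TRUE", "TRUE", "FALSE", "FALSE", "FALSE")
)

# Row-wise rule (no `by`): drop every row with envio TRUE and coche "B"
row_wise <- DT[!(envio == "TRUE" & coche == "B")]

# Group-wise rule: drop "B" rows only in groups where every envio is TRUE
grp <- copy(DT)
grp[, all_true := all(envio == "TRUE"), by = .(caso, empresa)]
group_wise <- grp[!(all_true & coche == "B")][, all_true := NULL]

nrow(row_wise)    # the "B" row of group (1, E1) has been dropped
nrow(group_wise)  # that row is kept, because the group is not all TRUE
```

With data like these the two rules give different results, which is why dropping the `by` is only safe when every group is uniformly TRUE or uniformly FALSE.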

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com


>
> DT.new=DT[!(envio=="TRUE" & coche=="B"),]
> DT.new[(envio=="FALSE" & coche=="B"),coche:="A"]
> DT.new
>    caso empresa coche envio
> 1:    1      E1     A  TRUE
> 2:    1      E1     U  TRUE
> 3:    2      E2     W FALSE
> 4:    2      E2     A FALSE
>
>
> Regards. Olivier
>
> - Original message -
> From: "Carlos J. Gil Bellosta" 
> To: "Francisco Javier" 
> CC: r-help-es@r-project.org
> Sent: Thursday, 24 November 2016 16:44:16
> Subject: Re: [R-es]  Two small, almost identical pieces of code and only
> the first one works
>
> Hi, how are you?
>
> Have you considered the possibility that your code (the one that works)
> works only "by chance", because your data happen to be the way they are and
> not otherwise? I have the feeling that this is the case.
>
> The logic is fiendishly convoluted, and I think it is easier to follow (and
> you get the same result) if you do:
>
> DT[, all.true := all(envio == "TRUE"), by = list(caso, empresa)]
> DT <- DT[!(all.true & coche == "B"),]
> DT[, all.true := NULL]
> DT$coche[DT$coche == "B"] <- "A"
> DT
>
> Regards,
>
> Carlos J. Gil Bellosta
> http://www.datanalytics.com
>
> On 24 November 2016 at 16:21, Francisco Javier <iterado...@hotmail.com>
> wrote:
>
> > Good afternoon, everyone,
> >
> > I have adapted a question asked in another forum to a case that I am
> > interested in solving. Take the data.table:
> >
> > library(data.table)
> > DT <- data.table(caso = rep(1:2, c(3, 2)),
> >                  empresa = factor(rep(c("E1", "E2"), c(3, 2))),
> >                  coche = factor(c('A', 'B', 'U', 'W', 'B')),
> >                  envio = factor(rep(c(T, F), c(3, 2))))
> >
> >
> > In the following code, for each (caso, empresa) pair, the rows with
> > coche = "B" are removed if envio = T, and "B" is changed to "A" if
> > envio = F.
> >
> > DTnew <- DT[,    ##  CODE THAT DOES WORK
> >    if (all(envio == T))  list(coche = coche[which(coche != "B")])
> >    else  list(coche),
> >    by = list(caso, empresa)][, coche := as.factor(ifelse(coche == "B", "A",
> >                                                   as.character(coche)))]
> >
> >    caso empresa coche
> > 1:    1      E1     A
> > 2:    1      E1     U
> > 3:    2      E2     W
> > 4:    2      E2     A
> >
> >
> > However, the following (almost identical) code does NOT work:
> >
> > DTnew <- DT[,
> >    if (all(envio == T))  list(coche = coche[which(coche != "B")])
> >    else  list(coche = as.factor(ifelse(coche == "B", "A",
> >                                        as.character(coche)))),
> >    by = list(caso, empresa)]
> >
> >    caso empresa coche
> > 1:    1      E1     A
> > 2:    1      E1     U
> > 3:    2      E2     B
> > 4:    2      E2     A
> >
> >
> > Could someone tell me how to modify it so that it does work? Many
> > thanks for any help.

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


[R] About data manipulation

2016-11-26 Thread lily li
Hi R users,

I'm trying to manipulate a dataframe and have some difficulties.

The original dataset is like this:

DF
year   month   total   id   note
2000   1       98      GA   1
2001   1       100     GA   1
2002   2       99      GA   1
2002   2       80      GB   1
...
2012   1       78      GA   2
...

The structure is as follows: when year is between 2000 and 2005, note is 1;
when year is between 2006 and 2010, note is 2; GA, GB, etc. represent
different groups, and they all cover the years 2000-2005, 2006-2010, and
2011-2015.
I want to calculate one average value for each month in each time slice.
For example, for 2000-2005, when note is 1, GA should have one value for
month 1, one value for month 2, etc.; GB should likewise have one value for
month 1, one value for month 2, etc., for this time period. The resulting
data frame would have no 'year' column, but would keep the other columns.
I tried the script: DF_GA = aggregate(total~year+month,data=subset(DF,
id==GA==1)), but it did not give me the data frame I wanted. How should I
do this?
Thanks for your help.
