[julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-03 Thread Cedric St-Jean
Something like
    using DataFrames, RDatasets
    df = dataset("datasets", "iris")
    df[:cumulative_PetalLength] = 0.0
    by(df, :Species) do sub_df
        sub_df[:cumulative_PetalLength] = cumsum(sub_df[:PetalLength])
        sub_df
    end
though I hope someone can provide a more elegant solution. `sub_df` is a SubDataFrame, and

[julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-04 Thread Ben Southwood
Thanks Cedric, that worked very well. I'm having a little trouble following the documentation as to how the "by ... do ..." structure actually works. Would you mind explaining what the code is doing? On Tuesday, May 3, 2016 at 10:07:10 PM UTC-4, Cedric St-Jean wrote: > > Something like > > us

[julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-04 Thread Cedric St-Jean
"Do blocks" are one of my favourite things about Julia, they're explained in the docs . Basically it's just a convenient way of defining and passing a function (the code that comes after `do`) to

Re: [julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-04 Thread Tom Short
Here's another way with DataFramesMeta [1]:
    using DataFrames, DataFramesMeta, RDatasets
    df = dataset("datasets", "iris")
    @transform(groupby(df, :Species), cs = cumsum(:PetalLength))
[1] https://github.com/JuliaStats/DataFramesMeta.jl/
On Wed, May 4, 2016 at 8:09 AM, Cedric St-Jean wrote: >

Re: [julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-04 Thread Cedric St-Jean
That's way better, thank you! I never thought I'd say this, but I miss pandas. I could write
    df['cs'] = df.groupby('PetalLength').transform(cumsum)
That's not possible in Julia because DataFrames don't have a row index. On Wednesday, May 4, 2016 at 9:04:21 AM UTC-4, tshort wrote: > > Here's ano
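For readers following along, a grouped `transform` in pandas hands back a column as long as the table, with the running total restarted for each group and kept in the original row order. Below is a minimal Base-Julia sketch of that idea with no DataFrames dependency. The name `grouped_cumsum` and the toy data are assumptions for illustration, not part of any package discussed in the thread:

```julia
# Hypothetical helper (not part of DataFrames.jl or DataFramesMeta.jl):
# cumulative sum of `values` within each group named by `keys`,
# returned in the original row order.
function grouped_cumsum(keys::AbstractVector, values::AbstractVector{T}) where {T<:Number}
    length(keys) == length(values) || throw(ArgumentError("lengths differ"))
    running = Dict{eltype(keys),T}()   # running total seen so far, per group
    out = similar(values)
    for i in eachindex(keys, values)
        total = get(running, keys[i], zero(T)) + values[i]
        running[keys[i]] = total
        out[i] = total
    end
    return out
end

# Toy data (made up): rows from two groups are interleaved on purpose.
species = ["setosa", "virginica", "setosa", "virginica", "setosa"]
petal   = [1.4, 6.0, 1.3, 5.1, 1.5]

cs = grouped_cumsum(species, petal)
println(cs)
```

Because the result lines up row for row with the input, it can be assigned back as a new column without any row index.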

Re: [julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-04 Thread Tom Short
The closest I could come to your pandas approach is with the following. DataFrames don't have an index, but GroupedDataFrames do. I created an `ave` function that does something like the R function with the same name: using DataFrames, DataFramesMeta, RDatasets df = dataset("datasets", "iris") df

Re: [julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-04 Thread Sam Urmy
@linq df |> groupby(:Species) |> transform(cs = cumsum(:PetalLength)) On Wednesday, May 4, 2016 at 9:23:26 AM UTC-4, Cedric St-Jean wrote: > > That's way better, thank you! > > I never thought I'd say this, but I miss pandas. I could write > > df['cs'] = df.groupby('PetalLength').transform(cumsu

Re: [julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-04 Thread Sam Urmy
    df = @linq df |>
        groupby(:PetalLength) |>
        transform(cs = cumsum(:PetalLength))
You can also use the @linq macro to pipe the output from one operation to the next, which often reads more clearly than nesting the function calls and is a little closer to the Pandas syntax. On Wednesday, Ma

Re: [julia-users] Re: Newbie question. Need help with grouping dataframes, cumulative sums and plotting.

2016-05-04 Thread Ben Southwood
Thanks for all the help! On Wednesday, May 4, 2016 at 1:32:19 PM UTC-4, Sam Urmy wrote: > > df = @linq df |> > groupby(:PetalLength) |> > transform(cs = cumsum(:PetalLength)) > > You can also use the @linq macro to pipe the output from one operation to > the next, which often reads more c