On Wednesday, 19 October 2016 at 13:51 -0700, Dean Schulze wrote:
> I have a DataFrame
> 
> julia> df
> 252931×2 DataFrames.DataFrame
> │ Row    │ x    │ y       │
> ├────────┼──────┼─────────┤
> │ 1      │ 0    │ 30000   │
> │ 2      │ 0    │ 60000   │
> │ 3      │ 0    │ 124800  │
> │ 4      │ 0    │ 190000  │
> │ 5      │ 0    │ 200000  │
> │ 6      │ 0    │ 204800  │
> │ 7      │ 0    │ 224800  │
> │ 8      │ 0    │ 234800  │
> ⋮
> │ 252923 │ 4999 │ 3364800 │
> │ 252924 │ 4999 │ 3374800 │
> │ 252925 │ 4999 │ 3390000 │
> │ 252926 │ 4999 │ 3434800 │
> │ 252927 │ 4999 │ 3464800 │
> │ 252928 │ 4999 │ 3490000 │
> │ 252929 │ 4999 │ 3510000 │
> │ 252930 │ 4999 │ 3534800 │
> │ 252931 │ 4999 │ 3540000 │
> 
> 
> I need to work with it in sub-DataFrames due to its size.  I tried
> filtering using this syntax
> 
> df_missing_timestamps_normalized[1:10000, :(y < 100000)]
> 
> df_missing_timestamps_normalized[1:10000, :(y -> y < 100000)]
> 
> but they both throw errors.
> 
> How do I select sub-DataFrames based on a predicate?
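Try df[df[:y] .< 100000, :].

The second index selects columns, so a quoted expression such as
:(y < 100000) is not evaluated as a row predicate there. Instead,
df[:y] .< 100000 builds a Boolean vector with one entry per row, and
using it as the first (row) index keeps only the rows where it is true.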
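
For illustration, here is a minimal sketch of that Boolean-mask approach on a
tiny made-up frame. It assumes a recent DataFrames.jl (1.x), where the column
is written df.y rather than df[:y]; the sample values and the names mask,
small, first_rows, small_first and small2 are invented for the example.

```julia
using DataFrames

# Small stand-in for the 252931-row frame in the question (values are made up).
df = DataFrame(x = [0, 0, 0, 4999, 4999],
               y = [30000, 60000, 124800, 3364800, 3540000])

# Element-wise comparison yields a Bool vector with one entry per row.
mask = df.y .< 100000

# Indexing rows with the mask keeps only the rows where it is true.
small = df[mask, :]

# To restrict to a row range as well, take the range first,
# then apply the predicate to that smaller frame.
first_rows = df[1:3, :]
small_first = first_rows[first_rows.y .< 100000, :]

# Same selection written with a predicate function on column y.
small2 = filter(:y => y -> y < 100000, df)
```

On a frame as large as the one in the question, the same pattern applies
unchanged; only the row range (for example 1:10000) and the threshold differ.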

You can also look at the DataFramesMeta.jl, Query.jl and
StructuredQueries.jl packages, which offer nicer ways of working with
data sets.


Regards
