On Wednesday, 19 October 2016 at 13:51 -0700, Dean Schulze wrote:
> I have a DataFrame
>
> julia> df
> 252931×2 DataFrames.DataFrame
> │ Row    │ x    │ y       │
> ├────────┼──────┼─────────┤
> │ 1      │ 0    │ 30000   │
> │ 2      │ 0    │ 60000   │
> │ 3      │ 0    │ 124800  │
> │ 4      │ 0    │ 190000  │
> │ 5      │ 0    │ 200000  │
> │ 6      │ 0    │ 204800  │
> │ 7      │ 0    │ 224800  │
> │ 8      │ 0    │ 234800  │
> ⋮
> │ 252923 │ 4999 │ 3364800 │
> │ 252924 │ 4999 │ 3374800 │
> │ 252925 │ 4999 │ 3390000 │
> │ 252926 │ 4999 │ 3434800 │
> │ 252927 │ 4999 │ 3464800 │
> │ 252928 │ 4999 │ 3490000 │
> │ 252929 │ 4999 │ 3510000 │
> │ 252930 │ 4999 │ 3534800 │
> │ 252931 │ 4999 │ 3540000 │
>
>
> I need to work with it in sub-DataFrames due to its size. I tried
> filtering using this syntax
>
> df_missing_timestamps_normalized[1:10000, :(y < 100000)]
>
> df_missing_timestamps_normalized[1:10000, :(y -> y < 100000)]
>
> but they both throw errors.
>
> How do I select sub-DataFrames based on a predicate?

Try with df[df[:y] .< 1000000, :].
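
For illustration, here is a minimal sketch of predicate-based row selection. It assumes a recent DataFrames.jl (1.x), where a column is accessed as df.y rather than df[:y]. The five-row table is a made-up stand-in for the real data, and the threshold 100000 is the one from the question.

```julia
using DataFrames

# Small stand-in for the real 252931-row table (values are illustrative only)
df = DataFrame(x = [0, 0, 0, 4999, 4999],
               y = [30000, 60000, 124800, 3364800, 3540000])

# Boolean-mask indexing: the broadcasted comparison gives one Bool per row
small = df[df.y .< 100000, :]

# Equivalent row filter written as a column => predicate pair
small2 = filter(:y => <(100000), df)

# To combine a row range with a predicate, take the range first,
# then apply the mask to the resulting sub-DataFrame
first_rows = df[1:3, :]
subset_rows = first_rows[first_rows.y .< 100000, :]
```

The row index has to be something like a Bool vector or a range. A quoted expression such as :(y < 100000) is not evaluated against the columns, which is likely why both attempts in the question fail.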
You can also have a look at the DataFramesMeta.jl, Query.jl and StructuredQueries.jl packages for nicer ways of working with data sets.

Regards