Hi Julio,
you can use the Query package for the first part. To filter a DataFrame using an arbitrary Julia expression, use something like this:

    using DataFrames, Query, NamedTuples

    q = @from i in df begin
        @where <filter expression>
        @select i
    end

You can use any Julia code in <filter expression>. Say your DataFrame has a column called price, then you could filter like this:

    @where i.price > 30

The i will be a NamedTuple type, so you can access the columns either by their name or by their index. For example, to filter by the first column:

    @where i[1] > 30

You can also just call some function that you have defined somewhere else:

    @where foo(i)

As long as the <filter expression> returns a Bool, you should be good.

If you run a query like this, q will be a standard Julia iterator. Right now you can’t just say length(q), although that is something I should probably enable at some point (I’m also looking into the VB LINQ syntax that supports things like counting in the query expression itself). But you could materialize the query as an array and then look at the length of that:

    q = @from i in df begin
        @where <filter expression>
        @select i
        @collect
    end

    count = length(q)

The @collect statement means that the query will return an array of a NamedTuple type (you can also materialize it into a whole bunch of other data structures, take a look at the documentation).

Let me know if this works, or if you have any other feedback on Query.jl. I’m much in need of some user feedback for the package at this point. The best way to give it is to open issues here: https://github.com/davidanthoff/Query.jl

Best,
David

From: julia-users@googlegroups.com [mailto:julia-users@googlegroups.com] On Behalf Of Júlio Hoffimann
Sent: Wednesday, October 12, 2016 5:20 PM
To: julia-users <julia-users@googlegroups.com>
Subject: [julia-users] Filtering DataFrame with a function

Hi,

I have a DataFrame for which I want to filter rows that match a given criterion. I don't know the number of columns beforehand, so I cannot explicitly list the criteria with the :symbol syntax or write down a fixed number of indices.

Is there any way to filter with a lambda expression? Or even better, is there any efficient way to count the number of occurrences of a specific row of observations?

-Júlio
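
For reference, the row semantics described in the reply (each row is a NamedTuple, the filter is any function returning a Bool, and the result is an iterator that has to be materialized before taking its length) can be mimicked in plain Julia without Query.jl. This is only a sketch under stated assumptions, not the package's implementation: the sample rows, the column names price and item, and the function names expensive and expensive_by_index are all made up for illustration, and it uses the NamedTuple syntax built into recent Julia versions rather than the NamedTuples package.

```julia
# Hypothetical sample data: each row is a NamedTuple, mirroring the `i` in the query.
rows = [
    (price = 25, item = "pen"),
    (price = 40, item = "book"),
    (price = 40, item = "book"),
    (price = 55, item = "lamp"),
]

# Any function that returns a Bool can act as the filter (names are made up).
expensive(i) = i.price > 30          # access a column by name
expensive_by_index(i) = i[1] > 30    # same test, access the first column by position

# A lazy iterator over the matching rows; nothing is materialized yet.
q = (i for i in rows if expensive(i))

# Materialize the iterator into an array, then take its length.
matches = collect(q)
n_matches = length(matches)

# Count matching rows directly, without building the intermediate array.
n_direct = count(expensive, rows)

# Count the occurrences of one specific row, independent of the number of columns.
target = (price = 40, item = "book")
n_target = count(==(target), rows)

println(n_matches, " ", n_direct, " ", n_target)
```

Comparing whole NamedTuples with == checks every field, so the last count needs no list of column names, which may suit the case where the number of columns is not known in advance.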