> How many rows match your timestamp criteria?
Usually between 1,000 and 5,000. I agree that the filtering could be far more
costly, and it probably is. I just thought the expression was more complex and
worth explaining in more detail.
> Acero will not "fuse" the kernel and has no expression
How many rows match your timestamp criteria? In other words, how many rows
are you applying the function to? If there is an earlier exact match
filter on a timestamp that only matches one (or a few) rows, then are you
sure the expression evaluation (and not the filtering) is the costly spot?
Could you provide a script with which people can reproduce the problem for
the performance comparison? That way we can take a closer look.
On Mon, Aug 21, 2023 at 8:42 PM Spencer Nelson wrote:
> I'd like some help calibrating my expectations regarding acero
> performance. I'm finding that some
I'd like some help calibrating my expectations regarding acero performance.
I'm finding that some pretty naive numpy is about 10x faster than acero for
my use case.
I'm working with a table with 13,000,000 values. The values are angular
positions on the sky and times. I'd like to filter to a