Re: acero speed versus numpy

2023-08-22 Thread Spencer Nelson
> How many rows match your timestamp criteria?

Usually between 1,000 and 5,000. I agree that filtering could be way more costly - it probably is. I just thought the expression is more complex and worth explaining in more detail.

> Acero will not "fuse" the kernel and has no expression
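
For readers unfamiliar with the term, the quoted remark about Acero not "fusing" kernels can be illustrated with a toy sketch in plain Python. This is not Acero or numpy code: the function names and the expression (a + b) * c are made up for illustration, and plain lists stand in for columns. The unfused version runs one operation at a time over whole columns and materializes an intermediate column; the fused version computes the whole expression in a single pass.

```python
# Toy illustration only: plain Python lists stand in for table columns.
# All names here (add_kernel, mul_kernel, eval_unfused, eval_fused) are
# hypothetical and do not correspond to any Acero or numpy API.

def add_kernel(xs, ys):
    """One 'kernel' call: elementwise add, allocating a new column."""
    return [x + y for x, y in zip(xs, ys)]


def mul_kernel(xs, ys):
    """One 'kernel' call: elementwise multiply, allocating a new column."""
    return [x * y for x, y in zip(xs, ys)]


def eval_unfused(a, b, c):
    """Evaluate (a + b) * c one kernel at a time.

    Each step makes a full pass over the data, and the intermediate
    result `tmp` is materialized before the next kernel runs.
    """
    tmp = add_kernel(a, b)
    return mul_kernel(tmp, c)


def eval_fused(a, b, c):
    """Evaluate (a + b) * c in a single pass with no intermediate column."""
    return [(x + y) * z for x, y, z in zip(a, b, c)]


a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]
c = [2.0, 2.0, 2.0]

# Both strategies compute the same values; they differ only in how many
# passes and temporary columns they need.
assert eval_unfused(a, b, c) == eval_fused(a, b, c)
```

Both functions return the same values; the difference is the extra pass and temporary allocation per operation. Whether that overhead explains the gap reported in this thread is not established by the messages shown here.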

Re: acero speed versus numpy

2023-08-22 Thread Weston Pace
How many rows match your timestamp criteria? In other words, how many rows are you applying the function to? If there is an earlier exact-match filter on a timestamp that only matches 1 (or a few) rows, then are you sure the expression evaluation (and not the filtering) is the costly spot?

Re: acero speed versus numpy

2023-08-21 Thread Chak-Pong Chung
Could you provide a script with which people can reproduce the problem for the performance comparison? That way we can take a closer look.

On Mon, Aug 21, 2023 at 8:42 PM Spencer Nelson wrote:
> I'd like some help calibrating my expectations regarding acero performance. I'm finding that some

acero speed versus numpy

2023-08-21 Thread Spencer Nelson
I'd like some help calibrating my expectations regarding acero performance. I'm finding that some pretty naive numpy is about 10x faster than acero for my use case. I'm working with a table with 13,000,000 values. The values are angular positions on the sky and times. I'd like to filter to a