Hi Raphael, Thanks for the info. Yes, makes sense now and the example in the PR is clear.
I had a question about the separation design however, doesn't that mean the Array is iterated through twice? - Once to determine what to filter out, and then a second time to actually perform the filtering, would this not be inefficient? How come there's no option to filter out on the first iteration rather than a 2-step process? Regards, Olo ________________________________ From: Raphael Taylor-Davies <[email protected]> Sent: 04 November 2022 02:05 To: [email protected] <[email protected]> Subject: Re: [RUST] Compute kernel filtering Hi Olo, Typically you will evaluate some predicate in order to obtain a BooleanArray, potentially using one of the comparison<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Fcomparison%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=If5U9ltVdnMcKDb5NKuCy9bcjk%2Bg1nWJc3Mpfyg5cbw%3D&reserved=0> kernels, and then pass this to the filter kernel to filter the actual values. This separation not only allows both kernels to be implemented more efficiently, but allows filtering an array with a predicate evaluated on a different array. I've created this PR<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fapache%2Farrow-rs%2Fpull%2F3014&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=E%2BP4SYzTWqvs0eXpJxt3i1XV3xxfJhaDiT%2BesLgX%2Bmc%3D&reserved=0> to add an example doctest of this. Please let me know if anything isn't clear Kind Regards, Raphael Taylor-Davies On 04/11/2022 06:48, Olo Sawyerr wrote: Hi there, Hope you're well. I have a question about the arrow<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ADzef7HqEmsOGDjDRBv0ZAhD86nHO068oE%2FewsxWWPY%3D&reserved=0>::compute<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=DTVQVNFxVjdkNjhNrybcV%2FulmoLGGe96ejViJAbDnUk%3D&reserved=0>::kernels<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=OzchweJlbpEyBqy%2F5NBuvkWWpDqKQ8c0e8CN3JuorrU%3D&reserved=0>::filter<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Ffilter%2Ffn.filter.html%23&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=htFFG42e9sRpHAKYR0FXb8e2KE2%2BKzwz6bjA4R4DGRU%3D&reserved=0>::filter<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Ffilter%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=9SpH0ldv1w3vLe5qQk3mgx2f52wu8J%2Bl7Im%2FU68o8Lg%3D&reserved=0>() function. The function takes a predicate which is a BooleanArray to do the filtering. How should this be used in practice? One usually doesn't know which item should be filtered out or not in advance before you actually do the filter, so won't know what the BooleanArray should contain. I would have thought that the predicate would be a function instead to test if each item should be filtered out or not. How is arrow<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ADzef7HqEmsOGDjDRBv0ZAhD86nHO068oE%2FewsxWWPY%3D&reserved=0>::compute<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=DTVQVNFxVjdkNjhNrybcV%2FulmoLGGe96ejViJAbDnUk%3D&reserved=0>::kernels<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Findex.html&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=OzchweJlbpEyBqy%2F5NBuvkWWpDqKQ8c0e8CN3JuorrU%3D&reserved=0>::filter<https://nam12.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs.rs%2Farrow%2Flatest%2Farrow%2Fcompute%2Fkernels%2Ffilter%2Ffn.filter.html%23&data=05%7C01%7C%7C7aa4e73b0390462f0ff408dabe09199c%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638031243542288038%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=htFFG42e9sRpHAKYR0FXb8e2KE2%2BKzwz6bjA4R4DGRU%3D&reserved=0>::filter() meant to be used in practice? Thanks.
