jorisvandenbossche commented on issue #9160:
URL: https://github.com/apache/arrow/issues/9160#issuecomment-758654770


   The problem is that a null is not equal to itself, so you can't filter nulls 
with an `==` equality check. 
   
   For the new dataset API, we are working on more powerful filter expressions, 
and you can already achieve this:
   
   ```
   In [21]: import pyarrow.parquet as pq
       ...: import pyarrow.dataset as ds
   
   In [22]: pq.read_table('data.parquet',
       ...:               filters=~ds.field("column").is_valid()).to_pandas()
   Out[22]: 
     column
   0   None
   1   None
   ``` 
   
   We should probably also add an `is_null()` method to make this case a bit 
more straightforward. 
   
   
   ---
   
   General note: we prefer the user mailing list for such questions, see 
https://arrow.apache.org/community/



