Fokko commented on code in PR #1689:
URL: https://github.com/apache/iceberg-python/pull/1689#discussion_r1965609760
##########
pyiceberg/expressions/parser.py:
##########
@@ -90,13 +89,7 @@
 @column.set_parse_action
 def _(result: ParseResults) -> Reference:
     if len(result.column) > 1:
-        deprecation_message(
-            deprecated_in="0.8.0",
-            removed_in="0.9.0",
-            help_message="Parsing expressions with table name is deprecated. Only provide field names in the row_filter.",
-        )
-        # TODO: Once this is removed, we will no longer take just the last index of parsed column result
-        # And introduce support for parsing filter expressions with nested fields.
+        raise ValueError(f"Cannot parse expressions with table names or nested fields, got: {".".join(result.column)}")
     return Reference(result.column[-1])
Review Comment:
The biggest thing I want to unlock here is nested fields.
> But I think we can just rely on PyIceberg not supporting ambiguous field
names for now as we introduce nested parsing support
I think the probability of a collision between a nested field and a
top-level field whose name contains a dot is low. If this becomes a problem,
we could always extend (in a separate PR) the `Reference` class to accept a
tuple, which would make the path explicit.
Before doing this, we would also need to establish a way to express this in
the SQL-like syntax:
```sql
SELECT
location.x,
location.`field.with.dots`
FROM table
```
As an example from Databricks:
https://docs.databricks.com/aws/en/sql/language-manual/functions/dotsign
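
A minimal sketch of how such a path could be split into an explicit tuple, assuming backticks quote a segment that itself contains dots. `split_field_path` is a hypothetical helper, not part of PyIceberg, and it uses only the standard library rather than the pyparsing grammar in `parser.py`:

```python
import re
from typing import Tuple

# One path segment: a backtick-quoted name (may contain dots) or a bare identifier.
# Assumption: no escaping of backticks inside a quoted name.
_SEGMENT = re.compile(r"`([^`]+)`|([A-Za-z_][A-Za-z0-9_]*)")


def split_field_path(path: str) -> Tuple[str, ...]:
    """Split a dotted field path into its segments (hypothetical helper)."""
    parts = []
    pos = 0
    while True:
        match = _SEGMENT.match(path, pos)
        if match is None:
            raise ValueError(f"Invalid field path at offset {pos}: {path!r}")
        quoted, bare = match.groups()
        parts.append(quoted if quoted is not None else bare)
        pos = match.end()
        if pos == len(path):
            return tuple(parts)
        if path[pos] != ".":
            raise ValueError(f"Expected '.' at offset {pos}: {path!r}")
        pos += 1  # skip the dot separating two segments


print(split_field_path("location.x"))
print(split_field_path("location.`field.with.dots`"))
```

With a tuple, `("location", "field.with.dots")` and `("location", "field", "with", "dots")` stay distinct, which is the ambiguity a single dotted string cannot express.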