Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-16 Thread via GitHub


liurenjie1024 commented on PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#issuecomment-2059024512

   Close by #334 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-16 Thread via GitHub


liurenjie1024 closed pull request #320: Add `BoundPredicateVisitor` trait
URL: https://github.com/apache/iceberg-rust/pull/320


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-16 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1567303582


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(

Review Comment:
   I'm also +1 for the original approach after comparing these two approaches.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-16 Thread via GitHub


Fokko commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1567138774


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(

Review Comment:
   I'm not sold on the iterator idea either, it looks very fancy, but iterating 
over a set feels awkward to me. 
   
   > If you're not happy with this approach @Fokko, I'll wait until the three 
of you are in agreement and fix up to whatever you all decide on. I'm happy 
with the approach as-is after switching my original design to the approach 
proposed by @liurenjie1024 and @marvinlanhenke.
   
   Well, I'm not the one to decide here, just weighing in my opinion :D I have 
a strong preference for the original approach as suggested in 
https://github.com/apache/iceberg-rust/pull/334 I think this leverages the 
visitor as much as possible, and also makes sure that all the operations are 
covered.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-16 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566902353


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(

Review Comment:
   My preference is the original design, which is  (confusingly) in the 
"alternative" PRs 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-16 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566871505


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(

Review Comment:
   Ok - here's another PR with the original design. I'll let you guys choose 
which one to merge :-) https://github.com/apache/iceberg-rust/pull/334. 
   
   Here's how the inclusive projection would end up looking with this design 
too: https://github.com/apache/iceberg-rust/pull/335



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566618135


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(

Review Comment:
   I think generating an iterator for hashset will not introduce allocation, 
but you are right that user may need to iterating the whole iterator to 
implement their logic. Method in this pr seems kind of overdesign and 
counterintutive for user to implement. In this case I think your original 
design seems more intuitive to me, as @Fokko said adding operator maybe not too 
frequent. Sorry for the back and forth, I didn't expect this approach to be so 
complicated and introduce so much overhead. 
   
   cc @Fokko @marvinlanhenke @Xuanwo What do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566362815


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(
+ self,
+op: PredicateOperator,
+reference: ,
+literal: Option,
+predicate: ,
+) -> Result;
+}
+
+/// Visits a [`BoundPredicate`] with the provided visitor,
+/// in post-order
+pub(crate) fn visit(
+visitor:  V,
+predicate: ,
+) -> Result {
+match predicate {
+BoundPredicate::AlwaysTrue => visitor.always_true(),
+BoundPredicate::AlwaysFalse => visitor.always_false(),
+BoundPredicate::And(expr) => {
+let [left_pred, right_pred] = expr.inputs();
+
+let left_result = visit(visitor, left_pred)?;
+let right_result = visit(visitor, right_pred)?;
+
+visitor.and(left_result, right_result)
+}
+BoundPredicate::Or(expr) => {
+let [left_pred, right_pred] = expr.inputs();
+
+let left_result = visit(visitor, left_pred)?;
+let right_result = visit(visitor, right_pred)?;
+
+visitor.or(left_result, right_result)
+}
+BoundPredicate::Not(expr) => {
+let [inner_pred] = expr.inputs();
+
+let inner_result = visit(visitor, inner_pred)?;
+
+visitor.not(inner_result)
+}
+BoundPredicate::Unary(expr) => visitor.op(expr.op(), expr.term(), 
None, predicate),

Review Comment:
   > I think this is the nicest way since it leverages the visitor pattern as 
much as possible, but I also saw the discussion of having this on the 
{unary,binary,set} level.
   
   If you're not happy with this approach @Fokko, I'll wait until the three of 
you are in agreement and fix up to whatever you all decide on. I'm happy with 
the approach as-is after switching my original design to the approach proposed 
by @liurenjie1024 and @marvinlanhenke.
   
   
   > With Python it is easy to have a cache-all where if nothing matches, it 
just goes to that one, and then we raise an exception that we've encountered an 
unknown predicate. This way it is possible to add new expressions, but I don't 
think that would happen too often.
   
   
   With the approach in the PR, if a new Op gets added, you have three options 
available when implementing `BoundPredicateVisitor` for a struct (if I remove 
the `#[non_exhaustive]` macro from `PredicateOperator`):
   
   1. Exhaustively specify a match arm for every Op. If another Op gets added, 
your BoundPredicateVisitor will no longer compile
   2. Specify a default match arm, and call `panic!()` in it. When a new Op 
gets added, your code still compiles but you get a runtime crash if the visitor 
encounters the new operator
   3. Specify a default match arm, and `return Err(NotImplemented)` in it. When 
a new Op gets added, your code still compiles but you will get a handleable 
error if you encounter the new operator - potentially 

Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566349390


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(

Review Comment:
   Hi Renjie! Thanks for the comment. I'm not totally convinced with your 
suggestion - we're trying to get the best performance, but by passing Set 
literals through as an iterator, rather than passing a reference to the 
HashSet, it requires either collecting the iterator, resulting in an 
allocation, or iterating over potentially the whole iterator.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1566333861


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet

Review Comment:
   Will do



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1565780931


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(

Review Comment:
   Hi, @sdd How about we try rewrite it like 
[this](https://github.com/liurenjie1024/iceberg-rust/blob/renjie/bound_visitor/crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs)
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-15 Thread via GitHub


Fokko commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1565645549


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub(crate) enum OpLiteral<'a> {
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+/// A visitor for [`BoundPredicate`]s. Visits in post-order.
+pub trait BoundPredicateVisitor {
+/// The return type of this visitor
+type T;
+
+/// Called after an `AlwaysTrue` predicate is visited
+fn always_true( self) -> Result;
+
+/// Called after an `AlwaysFalse` predicate is visited
+fn always_false( self) -> Result;
+
+/// Called after an `And` predicate is visited
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after an `Or` predicate is visited
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+
+/// Called after a `Not` predicate is visited
+fn not( self, inner: Self::T) -> Result;
+
+/// Called after visiting a UnaryPredicate, BinaryPredicate,
+/// or SetPredicate. Passes the predicate's operator in all cases,
+/// as well as the term and literals in the case of binary and aet
+/// predicates.
+fn op(
+ self,
+op: PredicateOperator,
+reference: ,
+literal: Option,
+predicate: ,
+) -> Result;
+}
+
+/// Visits a [`BoundPredicate`] with the provided visitor,
+/// in post-order
+pub(crate) fn visit(
+visitor:  V,
+predicate: ,
+) -> Result {
+match predicate {
+BoundPredicate::AlwaysTrue => visitor.always_true(),
+BoundPredicate::AlwaysFalse => visitor.always_false(),
+BoundPredicate::And(expr) => {
+let [left_pred, right_pred] = expr.inputs();
+
+let left_result = visit(visitor, left_pred)?;
+let right_result = visit(visitor, right_pred)?;
+
+visitor.and(left_result, right_result)
+}
+BoundPredicate::Or(expr) => {
+let [left_pred, right_pred] = expr.inputs();
+
+let left_result = visit(visitor, left_pred)?;
+let right_result = visit(visitor, right_pred)?;
+
+visitor.or(left_result, right_result)
+}
+BoundPredicate::Not(expr) => {
+let [inner_pred] = expr.inputs();
+
+let inner_result = visit(visitor, inner_pred)?;
+
+visitor.not(inner_result)
+}
+BoundPredicate::Unary(expr) => visitor.op(expr.op(), expr.term(), 
None, predicate),

Review Comment:
   In PyIceberg we've split out the operator in a per-operator basis: 
https://github.com/apache/iceberg-python/blob/4b911057f13491f30f89f133544c063133133fa5/pyiceberg/expressions/visitors.py#L256
   
   The InclusiveMetricsEvaluator is already quite a beefy visitor, and by 
having it on this level, 95% will fall in a the `op` method: 
https://github.com/apache/iceberg-python/blob/4b911057f13491f30f89f133544c063133133fa5/pyiceberg/expressions/visitors.py#L1137
 
   I think this is the nicest way since it leverages the visitor pattern as 
much as possible, but I also saw the discussion of having this on the 
`{unary,binary,set}` level. 



##
crates/iceberg/src/expr/predicate.rs:
##
@@ -217,6 +249,10 @@ impl Display for SetExpression {
 /// Unbound predicate expression before binding to a schema.
 #[derive(Debug, PartialEq)]
 pub enum Predicate {
+/// AlwaysTrue predicate, for example, `TRUE`.
+AlwaysTrue,
+/// AlwaysFalse predicate, for example, `FALSE`.
+AlwaysFalse,

Review Comment:
   To add here, when we bind an `isNull` expression to a required field, then 
we know that that isn't possible :)



##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,317 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this 

Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-07 Thread via GitHub


sdd commented on PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#issuecomment-2041400457

   I've added a test for `visit_op`, as well as some doc comments.
   
   Also, based on our discussion in the comments, it seemed prudent to add the 
`non_exhaustive` marker to `PredicateOperator`? PTAL @liurenjie1024, 
@marvinlanhenke, @ZENOTME as I think this is pretty much ready now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-07 Thread via GitHub


marvinlanhenke commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554857281


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,260 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub enum OpLiteral<'a> {
+None,

Review Comment:
   sounds reasonable enough - thanks for taking a look into it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-07 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554847639


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,260 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub enum OpLiteral<'a> {
+None,

Review Comment:
   I've switched it to Option  
   
   Unfortunately regarding the Single / Set enums, yours is by-value and mine 
is by-reference. Mine only needs to be by-reference due to the use case, and 
refactoring it to by-value would incur unnecessary copies. Yours needs to be 
by-value I think, and a brief attempt to refactor to use my by-reference enum 
was more hassle than it was worth. Considering the simplicity of these enums I 
think it is acceptable to have a bit of duplication here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-06 Thread via GitHub


marvinlanhenke commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554654358


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,260 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub enum OpLiteral<'a> {
+None,
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;

Review Comment:
   I'm still unsure if we should provide those in the trait? I guess due to the 
same reasons @liurenjie1024 mentioned in the original discussion. Would `fn op` 
not be sufficient - and the rest can be handled in `fn visit` (it handles most 
of the traversal logic already)?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-06 Thread via GitHub


marvinlanhenke commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554653083


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,260 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub enum OpLiteral<'a> {
+None,

Review Comment:
   could we remove `None` and use Option instead?
   
   Also I had to create nearly the same enum for the project transform PR #264. 
Perhaps it would make sense to have this in value.rs as an enum?



##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,260 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub enum OpLiteral<'a> {
+None,
+Single(&'a Datum),
+Set(&'a FnvHashSet),
+}
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;

Review Comment:
   I'm still unsure if we should provide those in the trait? I guess due to the 
same reasons @liurenjie1024 mentioned in the original discussion. Would `fn op` 
not be sufficient - and the rest can be handled in `fn visit` (it handles most 
of the traversal logic already)?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-06 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554546747


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   That still wouldn't work without collecting the set into a vec first for set 
ops. I've updated the PR with an alternative based on an enum, see what you 
think.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554519498


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   > I also like the idea of going further and having an Expression enum that 
contains the logical operators plus Predicate above, with Reference from the 
above being an enum over bound or unbound. This allows visitors to potentially 
work over both bound and unbound expressions.
   
   Hi, @sdd Would you mind to open an issue to raise a discussion about this 
idea? I think it looks promsing to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554519334


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   > The proposed function signature fn visitor_op( self, op: 
, reference: ) would work for a unary operator.
   
   I think following signature would be applied to all operators:
   ```rust
fn visitor_op( self, op: , reference: 
, literals: &[Datum])
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554441701


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   I also like the idea of going further and having an `Expression` enum that 
contains the logical operators plus `Predicate` above, with `Reference` from 
the above being an enum over bound or unbound. This allows visitors to 
potentially work over both bound and unbound expressions.
   
   ```rust
   pub enum Expression {
   // The top level would be Expression rather than Predicate.
   // The logical operators would live at this level.
   
   // Since logical operators such as And / Or / Not
   // are common between bound and unbound, some visitors
   // such as rewrite_not can be applied to both bound and
   // unbound expressions, preventing duplication and increasing
   // flexibility
   
   /// An expression always evaluates to true.
   AlwaysTrue,
   
   /// An expression always evaluates to false.
   AlwaysFalse,
   
   // Here we use a struct variant to eliminate the need
   // for a generic `LogicalExpression`, which did not
   // add much, also allowing us to refer to lhs and rhs
   // by name
   
   /// An expression combined by `AND`, for example, `a > 10 AND b < 20`.
   And {
   lhs: Box,
   rhs: Box
   },
   
   /// An expression combined by `OR`, for example, `a > 10 OR b < 20`.
   Or {
   lhs: Box,
   rhs: Box
   },
   
   /// An expression combined by `NOT`, for example, `NOT (a > 10)`.
   Not(Box),
   
   /// A Predicate
   Predicate(Predicate),
   }
   
   pub enum Reference {
   // Only at the leaf level do we distinguish between
   // bound and unbound. This allows things like bind
   // and unbind to be implemented as a transforming visitor,
   // and avoids duplication of logic compared to having
   // Bound / Unbound being distinguished at a higher level
   // in the type hierarchy
   
   Bound(BoundReference),
   Unbound(UnboundReference),
   }
   
   pub struct UnboundReference {
   name: String,
   }
   
   pub struct BoundReference {
   column_name: String,
   field: NestedFieldRef,
   accessor: StructAccessor,
   }
   ``` 
   
   We could use type alises such as `type BoundExpression = Expression` and 
have `bind` return a `BoundExpression` to prevent bound expressions being used 
where unbound ones are required and vice versa. But I'm digressing very far 
from the original comment now, sorry!  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1554438555


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   I don't see how this will work with just `visitor_op`.
   
   The proposed function signature `fn visitor_op( self, op: 
, reference: )` would work for a unary 
operator.
   
   A binary operator would need `fn visitor_op( self, op: 
, reference: , literal: )`, and a set 
operator would need `fn visitor_op( self, op: , 
reference: , literals: )`.
   
   To resolve this, you'd need the visitor to have `fn unary_op(...)`, `fn 
binary_op(...)`, and `fn set_op(...)`.
   
   Is that what we want? 
   
   Personally I think this is another example of where the current type 
hierarchy feels awkward - there are other places where we bump into these type 
hierarchy mismatches, such as needing a default leg in a match on operator to 
handle operators that will never show up, such as binary ops in a unary 
expression.
   
   Something more like this feels better to me:
   
   ```rust
   pub enum Predicate {
   // By not having Unary / Binary / Set here, and removing
   // that layer in favour of having the ops themselves
   // as part of this enum, there is no need for code that
   // raises errors when the wrong type of `Op` is used, since those
   // issues are prevented by the type system at compile time
   
   // Unary ops now only contain one param, the term, and so can be
   // implemented more simply
   IsNull(Reference),
   // ... other Unary ops here,
   
   // Binary / Set ops ditch the `BinaryExpression`, with the literal
   // and reference being available directly
   GreaterThan {
   literal: Datum,
   reference: Reference,
   },
   // ... other Binary ops here,
   
   In {
   literals: HashSet,
   reference: Reference,
   },
   // ... other Set ops here
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553740208


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   > I think this is the advantage of the op approach, it's left to user to 
decide what to do. If they want new op to be ignored, they could use a default 
match arm, otherwise they could choose to do pattern match against each 
operator.
   
   Yes, exactly. So I'm +1 for the op approach.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553653540


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   > ...wouldn't I have match op anyway, so having a default match arm in the 
implementation would solve the issue for the downstream user?
   
   I think this is the advantage of the op approach, it's left to user to 
decide what to do. If they want new op to be ignored, they could use a default 
match arm, otherwise they could choose to do pattern match against each 
operator.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553297280


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   > Another option is to use following method:
   > 
   > ```rust
   > trait BoundPredicateVisitor {
   >   fn visitor_op( self, op: , reference: 
, results: Vec) -> Result {}
   > }
   > ```
   
   I think I'm in favor of keeping the trait as generic as possible - so we 
don't have to modify it, if we have to support a new operator. 
   
   When implementing this trait (no matter which variant) I'd most likely have 
to overwrite the default behaviour anyway? So I wouldn't bother providing any 
in the first place and keep the trait as generic and future proof as possible? 
If that makes any sense to you?
   
   >This way we can force user to think about added new operator, otherwise it 
will throw compiler error, but this may break things when we add new operator.
   
   ...wouldn't I have match `op` anyway, so having a default match arm in the 
implementation would solve the issue for the downstream user?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553297280


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   > Another option is to use following method:
   > 
   > ```rust
   > trait BoundPredicateVisitor {
   >   fn visitor_op( self, op: , reference: 
, results: Vec) -> Result {}
   > }
   > ```
   
   I think I'm in favor of keeping the trait as generic as possible - so we 
don't have to modify it, if we have to support a new operator. 
   
   When implementing this trait (no matter which variant) I'd most likely have 
to overwrite the default behaviour anyway? So I wouldn't bother providing any 
in the first place and keep the trait as generic and future proof as possible? 
If that makes any sense to you?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553279624


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   > If we add default implementations that return something like 
Err(OperatorNotImplemented), then if we add an operator, wouldn't existing code 
work for existing operators and only break if users then try to use the new 
operator without having updated any visitors they have to use the new operator?
   
   Yes, I agree with that.
   
   Another option is to use following method:
   ```rust
   trait BoundPredicateVisitor {
 fn visitor_op( self, op: , reference: 
, results: Vec) -> Result {}
   }
   ```
   
   This way we can force user to think about added new operator, otherwise it 
will throw compiler error, but this may break things when we add new operator.
   
   I think the two approaches have different pros and cons, let's wait to see 
others's comments. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553253806


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   If we add default implementations that return something like 
`Err(OperatorNotImplemented)`, then if we add an operator, wouldn't existing 
code work for existing operators and only break if users then try to use the 
new operator without having updated any visitors they have to use the new 
operator? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553220669


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,366 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn always_true( self) -> Result;
+fn always_false( self) -> Result;
+
+fn and( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn or( self, lhs: Self::T, rhs: Self::T) -> Result;
+fn not( self, inner: Self::T) -> Result;
+
+fn is_null( self, reference: ) -> Result;

Review Comment:
   The problem with this approach is that when we add an operator, we need to 
modify the trait. We can still keep it compatible with default function, but 
the actual behavior may break when user upgrades library. cc @viirya @Xuanwo 
@marvinlanhenke What do you think?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553215173


##
crates/iceberg/src/expr/predicate.rs:
##
@@ -217,6 +249,10 @@ impl Display for SetExpression {
 /// Unbound predicate expression before binding to a schema.
 #[derive(Debug, PartialEq)]
 pub enum Predicate {
+/// AlwaysTrue predicate, for example, `TRUE`.
+AlwaysTrue,
+/// AlwaysFalse predicate, for example, `FALSE`.
+AlwaysFalse,

Review Comment:
   Reason to me, thanks for the explaination.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553098425


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,363 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn visit( self, predicate: ) -> Result {

Review Comment:
   Done!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553082179


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,363 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn visit( self, predicate: ) -> Result {

Review Comment:
   Ok, happy to change, will do so



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


sdd commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1553077742


##
crates/iceberg/src/expr/predicate.rs:
##
@@ -217,6 +249,10 @@ impl Display for SetExpression {
 /// Unbound predicate expression before binding to a schema.
 #[derive(Debug, PartialEq)]
 pub enum Predicate {
+/// AlwaysTrue predicate, for example, `TRUE`.
+AlwaysTrue,
+/// AlwaysFalse predicate, for example, `FALSE`.
+AlwaysFalse,

Review Comment:
   It's in here because it is needed in two places in `InclusiveProjection`:
   
   - First, when we "unbind", transforming a `BoundPredicate` into an unbound 
`Predicate`: 
https://github.com/apache/iceberg-rust/pull/321/files#diff-419289c8023c272de50ee4352320ad217fa7c2802f7f633d26e55ece30426be6R41
   - Second, when we build up a `Predicate` by "and"ing together a (possibly 
empty) list of `Predicate`s, starting with an `AlwaysTrue` to ensure that we 
return a valid `Predicate` even in the empty list case: 
https://github.com/apache/iceberg-rust/pull/321/files#diff-419289c8023c272de50ee4352320ad217fa7c2802f7f633d26e55ece30426be6R76
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] Add `BoundPredicateVisitor` trait [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 commented on code in PR #320:
URL: https://github.com/apache/iceberg-rust/pull/320#discussion_r1552969104


##
crates/iceberg/src/expr/visitors/bound_predicate_visitor.rs:
##
@@ -0,0 +1,363 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use crate::expr::{BoundPredicate, BoundReference, PredicateOperator};
+use crate::spec::Datum;
+use crate::Result;
+use fnv::FnvHashSet;
+
+pub trait BoundPredicateVisitor {
+type T;
+
+fn visit( self, predicate: ) -> Result {

Review Comment:
   How about we move this method to a standalone function:
   ```rust
   fn visit(visitor:  V, predicate: 
BoundPredicate) -> Result {}
   ```
   
   It's kind of odd to me to implement the travelling logic in trait's default 
function. 



##
crates/iceberg/src/expr/predicate.rs:
##
@@ -217,6 +249,10 @@ impl Display for SetExpression {
 /// Unbound predicate expression before binding to a schema.
 #[derive(Debug, PartialEq)]
 pub enum Predicate {
+/// AlwaysTrue predicate, for example, `TRUE`.
+AlwaysTrue,
+/// AlwaysFalse predicate, for example, `FALSE`.
+AlwaysFalse,

Review Comment:
   We had some discussion in [this 
pr](https://github.com/apache/iceberg-rust/pull/227), and the conclusion is 
that currently we don't find it useful to add it to `Predicate`, could you 
elaborate why we need this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org