Re: [PR] feat: Project transform [iceberg-rust]

2024-04-05 Thread via GitHub


liurenjie1024 merged PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2039348517

   > Hi, @marvinlanhenke Thanks for pr, it looks great! I have some small 
suggestion to restructure the code to make it easier for review. Really 
greatful for these tests!
   
   Thanks for the review - I did some minor fixes according to your 
suggestions. If no other comments, I think we're good to go.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1553275061


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,323 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+match self {
+Transform::Identity => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => 
Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+,
+BoundPredicate::Set(expr) => 
Ok(Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+,
+_ => Ok(None),
+},
+Transform::Bucket(_) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => self.project_binary(name, 
expr, ),
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Truncate(width) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, Some(*width))
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+match predicate {
+BoundPredicate::Unary(expr) => 
Self::project_unary(expr.op(), name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, None)
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+}
+}
+_ => Ok(None),
+}
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Creates a unary predicate from a given operator and a reference name.
+fn project_unary(op: PredicateOperator, name: String) -> 
Result> {
+Ok(Some(Predicate::Unary(UnaryExpression::new(
+op,
+Reference::new(name),
+
+}
+
+/// Attempts to create a binary predicate based on a binary expression,
+/// if applicable.
+///
+/// This method evaluates a given binary expression and, if the operation
+/// is equality (`Eq`) and the literal can be transformed, constructs a
+/// `Predicate::Binary`variant representing the binary operation.
+fn project_binary(

Review Comment:
   I missed that, we don't need a Generic. Fixed it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1553272539


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,323 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+match self {
+Transform::Identity => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => 
Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+,
+BoundPredicate::Set(expr) => 
Ok(Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+,
+_ => Ok(None),
+},
+Transform::Bucket(_) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => self.project_binary(name, 
expr, ),
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Truncate(width) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, Some(*width))
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+match predicate {
+BoundPredicate::Unary(expr) => 
Self::project_unary(expr.op(), name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, None)
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+}
+}
+_ => Ok(None),
+}
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Creates a unary predicate from a given operator and a reference name.
+fn project_unary(op: PredicateOperator, name: String) -> 
Result> {
+Ok(Some(Predicate::Unary(UnaryExpression::new(
+op,
+Reference::new(name),
+
+}
+
+/// Attempts to create a binary predicate based on a binary expression,
+/// if applicable.
+///
+/// This method evaluates a given binary expression and, if the operation
+/// is equality (`Eq`) and the literal can be transformed, constructs a
+/// `Predicate::Binary`variant representing the binary operation.
+fn project_binary(
+,
+name: String,
+expr: ,
+func: ,
+) -> Result> {
+if expr.op() != PredicateOperator::Eq || 
!self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+func.transform_literal_result(expr.literal())?,
+
+}
+
+/// Projects a binary expression to a predicate with an adjusted boundary.
+///
+/// Checks if the literal within the given binary expression is
+/// transformable. If transformable, it proceeds to potentially adjust
+/// the boundary of the expression based on the comparison operator (`op`).
+/// The potential adjustments involve incrementing or decrementing the
+/// literal value and changing the `PredicateOperator` itself to its
+/// inclusive variant.
+fn project_binary_with_adjusted_boundary(
+,
+name: String,
+expr: ,
+func: ,
+

Re: [PR] feat: Project transform [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1553271703


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,323 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+match self {
+Transform::Identity => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => 
Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+,
+BoundPredicate::Set(expr) => 
Ok(Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+,
+_ => Ok(None),
+},
+Transform::Bucket(_) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => self.project_binary(name, 
expr, ),
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Truncate(width) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, Some(*width))
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+match predicate {
+BoundPredicate::Unary(expr) => 
Self::project_unary(expr.op(), name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, None)
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+}
+}
+_ => Ok(None),
+}
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Creates a unary predicate from a given operator and a reference name.
+fn project_unary(op: PredicateOperator, name: String) -> 
Result> {
+Ok(Some(Predicate::Unary(UnaryExpression::new(
+op,
+Reference::new(name),
+
+}
+
+/// Attempts to create a binary predicate based on a binary expression,
+/// if applicable.
+///
+/// This method evaluates a given binary expression and, if the operation
+/// is equality (`Eq`) and the literal can be transformed, constructs a
+/// `Predicate::Binary`variant representing the binary operation.
+fn project_binary(
+,
+name: String,
+expr: ,
+func: ,
+) -> Result> {
+if expr.op() != PredicateOperator::Eq || 
!self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+func.transform_literal_result(expr.literal())?,
+
+}
+
+/// Projects a binary expression to a predicate with an adjusted boundary.
+///
+/// Checks if the literal within the given binary expression is
+/// transformable. If transformable, it proceeds to potentially adjust
+/// the boundary of the expression based on the comparison operator (`op`).
+/// The potential adjustments involve incrementing or decrementing the
+/// literal value and changing the `PredicateOperator` itself to its
+/// inclusive variant.
+fn project_binary_with_adjusted_boundary(
+,
+name: String,
+expr: ,
+func: ,
+

Re: [PR] feat: Project transform [iceberg-rust]

2024-04-05 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1553262971


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,323 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+match self {
+Transform::Identity => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => 
Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+,
+BoundPredicate::Set(expr) => 
Ok(Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+,
+_ => Ok(None),
+},
+Transform::Bucket(_) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => self.project_binary(name, 
expr, ),
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Truncate(width) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, Some(*width))
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+match predicate {
+BoundPredicate::Unary(expr) => 
Self::project_unary(expr.op(), name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, None)
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+}
+}
+_ => Ok(None),
+}
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Creates a unary predicate from a given operator and a reference name.
+fn project_unary(op: PredicateOperator, name: String) -> 
Result> {
+Ok(Some(Predicate::Unary(UnaryExpression::new(
+op,
+Reference::new(name),
+
+}
+
+/// Attempts to create a binary predicate based on a binary expression,
+/// if applicable.
+///
+/// This method evaluates a given binary expression and, if the operation
+/// is equality (`Eq`) and the literal can be transformed, constructs a
+/// `Predicate::Binary`variant representing the binary operation.
+fn project_binary(
+,
+name: String,
+expr: ,
+func: ,
+) -> Result> {
+if expr.op() != PredicateOperator::Eq || 
!self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+func.transform_literal_result(expr.literal())?,
+
+}
+
+/// Projects a binary expression to a predicate with an adjusted boundary.
+///
+/// Checks if the literal within the given binary expression is
+/// transformable. If transformable, it proceeds to potentially adjust
+/// the boundary of the expression based on the comparison operator (`op`).
+/// The potential adjustments involve incrementing or decrementing the
+/// literal value and changing the `PredicateOperator` itself to its
+/// inclusive variant.
+fn project_binary_with_adjusted_boundary(
+,
+name: String,
+expr: ,
+func: ,
+

Re: [PR] feat: Project transform [iceberg-rust]

2024-04-04 Thread via GitHub


liurenjie1024 commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2039007641

   cc @Fokko Do you have other comments?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-04 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1548808318


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,323 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+match self {
+Transform::Identity => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => 
Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+,
+BoundPredicate::Set(expr) => 
Ok(Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+,
+_ => Ok(None),
+},
+Transform::Bucket(_) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => self.project_binary(name, 
expr, ),
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Truncate(width) => match predicate {
+BoundPredicate::Unary(expr) => Self::project_unary(expr.op(), 
name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, Some(*width))
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+},
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+match predicate {
+BoundPredicate::Unary(expr) => 
Self::project_unary(expr.op(), name),
+BoundPredicate::Binary(expr) => {
+self.project_binary_with_adjusted_boundary(name, expr, 
, None)
+}
+BoundPredicate::Set(expr) => self.project_set(expr, name, 
),
+_ => Ok(None),
+}
+}
+_ => Ok(None),
+}
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Creates a unary predicate from a given operator and a reference name.
+fn project_unary(op: PredicateOperator, name: String) -> 
Result> {
+Ok(Some(Predicate::Unary(UnaryExpression::new(
+op,
+Reference::new(name),
+
+}
+
+/// Attempts to create a binary predicate based on a binary expression,
+/// if applicable.
+///
+/// This method evaluates a given binary expression and, if the operation
+/// is equality (`Eq`) and the literal can be transformed, constructs a
+/// `Predicate::Binary`variant representing the binary operation.
+fn project_binary(
+,
+name: String,
+expr: ,
+func: ,
+) -> Result> {
+if expr.op() != PredicateOperator::Eq || 
!self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+Ok(Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+func.transform_literal_result(expr.literal())?,
+
+}
+
+/// Projects a binary expression to a predicate with an adjusted boundary.
+///
+/// Checks if the literal within the given binary expression is
+/// transformable. If transformable, it proceeds to potentially adjust
+/// the boundary of the expression based on the comparison operator (`op`).
+/// The potential adjustments involve incrementing or decrementing the
+/// literal value and changing the `PredicateOperator` itself to its
+/// inclusive variant.
+fn project_binary_with_adjusted_boundary(
+,
+name: String,
+expr: ,
+func: ,
+

Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547819858


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {

Review Comment:
   Yeah, it looks much better now, thanks! I'll take a careful review later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547785953


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {

Review Comment:
   @liurenjie1024
   I did a refactor changing the structure (matching order). I also extracted 
common functionality, renamed those helpers and updated the docs. I hope not 
only the structure but the overall design is more readable and understandable 
with those changes applied?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547681855


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {

Review Comment:
   sure, I'm already implementing it - since I wanted to compare for myself.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547676828


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {

Review Comment:
   I think some code duplication is worth so that we can have better 
readability?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547667177


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {

Review Comment:
   I had the structure you suggested in an earlier version. I changed it the 
other way around since `predicate` has the smaller cardinality, which allows me 
to group more transforms into a single predicate match arm. I can change it 
back, however this would introduce more match arms and some code duplication?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547649993


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {
+Transform::Identity
+| Transform::Bucket(_)
+| Transform::Truncate(_)
+| Transform::Year
+| Transform::Month
+| Transform::Day
+| Transform::Hour => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+_ => None,
+},
+BoundPredicate::Binary(expr) => match self {
+Transform::Identity => 
Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+))),
+Transform::Bucket(_) => {
+if expr.op() != PredicateOperator::Eq || 
!self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+func.transform_literal_result(expr.literal())?,
+)))
+}
+Transform::Truncate(width) => {
+if !self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+self.transform_projected_boundary(
+name,
+expr.literal(),
+(),
+,
+Some(*width),
+)?
+}
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+if !self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+self.transform_projected_boundary(
+name,
+expr.literal(),
+(),
+,
+None,
+)?
+}
+_ => None,
+},
+BoundPredicate::Set(expr) => match self {
+Transform::Identity => Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+))),
+Transform::Bucket(_)
+| Transform::Truncate(_)
+| Transform::Year
+| Transform::Month
+| Transform::Day
+| Transform::Hour => {
+if expr.op() != PredicateOperator::In
+|| expr.literals().iter().any(|d| 
!self.can_transform(d))
+{
+return Ok(None);
+}
+
+Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+self.transform_set(expr.literals(), )?,
+)))
+}
+_ => None,
+},
+_ => None,
+};
+
+Ok(projection)
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Transform each literal value of `FnvHashSet`
+fn transform_set(
+,
+literals: ,
+func: ,
+) -> Result> {
+let mut new_set = FnvHashSet::default();
+
+for lit in literals {
+let datum = func.transform_literal_result(lit)?;
+
+if let Some(AdjustedProjection::Single(d)) =
+  

Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547653811


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {
+Transform::Identity
+| Transform::Bucket(_)
+| Transform::Truncate(_)
+| Transform::Year
+| Transform::Month
+| Transform::Day
+| Transform::Hour => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+_ => None,
+},
+BoundPredicate::Binary(expr) => match self {
+Transform::Identity => 
Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+))),
+Transform::Bucket(_) => {
+if expr.op() != PredicateOperator::Eq || 
!self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+func.transform_literal_result(expr.literal())?,
+)))
+}
+Transform::Truncate(width) => {
+if !self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+self.transform_projected_boundary(
+name,
+expr.literal(),
+(),
+,
+Some(*width),
+)?
+}
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+if !self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+self.transform_projected_boundary(
+name,
+expr.literal(),
+(),
+,
+None,
+)?
+}
+_ => None,
+},
+BoundPredicate::Set(expr) => match self {
+Transform::Identity => Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+))),
+Transform::Bucket(_)
+| Transform::Truncate(_)
+| Transform::Year
+| Transform::Month
+| Transform::Day
+| Transform::Hour => {
+if expr.op() != PredicateOperator::In
+|| expr.literals().iter().any(|d| 
!self.can_transform(d))
+{
+return Ok(None);
+}
+
+Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+self.transform_set(expr.literals(), )?,
+)))
+}
+_ => None,
+},
+_ => None,
+};
+
+Ok(projection)
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Transform each literal value of `FnvHashSet`
+fn transform_set(
+,
+literals: ,
+func: ,
+) -> Result> {
+let mut new_set = FnvHashSet::default();
+
+for lit in literals {
+let datum = func.transform_literal_result(lit)?;
+
+if let Some(AdjustedProjection::Single(d)) =
+ 

Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547639860


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {
+Transform::Identity
+| Transform::Bucket(_)
+| Transform::Truncate(_)
+| Transform::Year
+| Transform::Month
+| Transform::Day
+| Transform::Hour => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+_ => None,
+},
+BoundPredicate::Binary(expr) => match self {
+Transform::Identity => 
Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+))),
+Transform::Bucket(_) => {
+if expr.op() != PredicateOperator::Eq || 
!self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+func.transform_literal_result(expr.literal())?,
+)))
+}
+Transform::Truncate(width) => {
+if !self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+self.transform_projected_boundary(
+name,
+expr.literal(),
+(),
+,
+Some(*width),
+)?
+}
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+if !self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+self.transform_projected_boundary(
+name,
+expr.literal(),
+(),
+,
+None,
+)?
+}
+_ => None,
+},
+BoundPredicate::Set(expr) => match self {
+Transform::Identity => Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+))),
+Transform::Bucket(_)
+| Transform::Truncate(_)
+| Transform::Year
+| Transform::Month
+| Transform::Day
+| Transform::Hour => {
+if expr.op() != PredicateOperator::In
+|| expr.literals().iter().any(|d| 
!self.can_transform(d))
+{
+return Ok(None);
+}
+
+Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+self.transform_set(expr.literals(), )?,
+)))
+}
+_ => None,
+},
+_ => None,
+};
+
+Ok(projection)
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Transform each literal value of `FnvHashSet`
+fn transform_set(
+,
+literals: ,
+func: ,
+) -> Result> {
+let mut new_set = FnvHashSet::default();
+
+for lit in literals {
+let datum = func.transform_literal_result(lit)?;
+
+if let Some(AdjustedProjection::Single(d)) =
+ 

Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2031700958

   > Hi, @marvinlanhenke Thanks for pr, it looks great! I have some small 
suggestion to restructure the code to make it easier for review. Really 
greatful for these tests!
   
   Thanks for the review, I'll get to your suggestions - those should be easy 
to fix.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-02 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1547488931


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +269,300 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match predicate {
+BoundPredicate::Unary(expr) => match self {
+Transform::Identity
+| Transform::Bucket(_)
+| Transform::Truncate(_)
+| Transform::Year
+| Transform::Month
+| Transform::Day
+| Transform::Hour => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+_ => None,
+},
+BoundPredicate::Binary(expr) => match self {
+Transform::Identity => 
Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+))),
+Transform::Bucket(_) => {
+if expr.op() != PredicateOperator::Eq || 
!self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+func.transform_literal_result(expr.literal())?,
+)))
+}
+Transform::Truncate(width) => {
+if !self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+self.transform_projected_boundary(
+name,
+expr.literal(),
+(),
+,
+Some(*width),
+)?
+}
+Transform::Year | Transform::Month | Transform::Day | 
Transform::Hour => {
+if !self.can_transform(expr.literal()) {
+return Ok(None);
+}
+
+self.transform_projected_boundary(
+name,
+expr.literal(),
+(),
+,
+None,
+)?
+}
+_ => None,
+},
+BoundPredicate::Set(expr) => match self {
+Transform::Identity => Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+))),
+Transform::Bucket(_)
+| Transform::Truncate(_)
+| Transform::Year
+| Transform::Month
+| Transform::Day
+| Transform::Hour => {
+if expr.op() != PredicateOperator::In
+|| expr.literals().iter().any(|d| 
!self.can_transform(d))
+{
+return Ok(None);
+}
+
+Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+self.transform_set(expr.literals(), )?,
+)))
+}
+_ => None,
+},
+_ => None,
+};
+
+Ok(projection)
+}
+
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> bool {
+let input_type = datum.data_type().clone();
+self.result_type(::Primitive(input_type)).is_ok()
+}
+
+/// Transform each literal value of `FnvHashSet`
+fn transform_set(
+,
+literals: ,
+func: ,
+) -> Result> {
+let mut new_set = FnvHashSet::default();
+
+for lit in literals {
+let datum = func.transform_literal_result(lit)?;
+
+if let Some(AdjustedProjection::Single(d)) =
+  

Re: [PR] feat: Project transform [iceberg-rust]

2024-04-01 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2029473672

   > Hi, @marvinlanhenke I skimmed through the code and it seems that we only 
need to split tests into separate modules to make it easier to maintain and 
read. But I agree that we can merge it first to unblock following features, so 
it's up to you.
   
   Take your time while reviewing in the meantime - I'll open another draft 
with a possible refactor and we can compare whats best? I 'm thinking of 
splitting the code into Identity, Bucket, Truncate and Temporal logic - 
including the tests. This however might introduce some minor code duplication 
(I'll have to try and see for myself though)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-01 Thread via GitHub


liurenjie1024 commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2029468475

   Hi, @marvinlanhenke I skimmed through the code and it seems that we only 
need to split tests into separate modules to make it easier to maintain and 
read. But I agree that we can merge it first to unblock following features, so 
it's up to you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-01 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2029456431

   > > @Fokko @liurenjie1024 PTAL
   > > I ported all the tests - and fixed #311. Now, all tests are passing and 
align with the Java implementation. I think we only can optimize the design/ 
structure by perhaps moving the test-suite to the respective transforms (e.g. 
bucket.rs, etc)? Other than that I think we are good to go?
   > 
   > Hi, @marvinlanhenke Thanks, I'll take a careful look of this pr. About 
moving the specific logic into respective transforms, do you plan to do it in 
this pr or in following prs?
   
   ... I haven made up my mind yet. I think we can merge this if its okay, in 
order to unblock #253?
   
   Also I'm not so sure anymore we have to move the logic at all? Since most of 
the helper functions handle logic that is not dependent of the type of 
transform? And don't know - perhaps you can outline the refactor high-level 
what you have in mind?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-01 Thread via GitHub


liurenjie1024 commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2029440339

   > @Fokko @liurenjie1024 PTAL
   > 
   > I ported all the tests - and fixed #311. Now, all tests are passing and 
align with the Java implementation. I think we only can optimize the design/ 
structure by perhaps moving the test-suite to the respective transforms (e.g. 
bucket.rs, etc)? Other than that I think we are good to go?
   
   Hi, @marvinlanhenke Thanks, I'll take a careful look of this pr. About 
moving the specific logic into respective transforms, do you plan to do it in 
this pr or in following prs?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-04-01 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2029426833

   @Fokko @liurenjie1024 
   PTAL
   
   I ported all the tests - and fixed #311. Now, all tests are passing and 
align with the Java implementation.
   I think we only can optimize the design/ structure by perhaps moving the 
test-suite to the respective transforms (e.g. bucket.rs, etc)? Other than that 
I think we are good to go?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-30 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2028045351

   @Fokko @liurenjie1024 
   I'm having trouble understanding/ verifiying the test-case for 
[timestamps_day_inclusive_epoch](https://github.com/apache/iceberg/blob/d350c9b8c995a2953aa8b80a0a1fc7cadc4dd16a/api/src/test/java/org/apache/iceberg/transforms/TestTimestampsProjection.java#L168)
 for the case `greater than`.
   
   If I understood correctly timestamps are handled as 
[follows](https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/transforms/Timestamps.java#L134-L137):
 
   - call `truncateLong` -> add +1L, rewrite OP to GT_EQ and apply 
transformation
   - then `fixInclusiveTimeProjection` -> for GT_EQ -> do nothing, simply return
   
   so this would leave me with a predicate with an OP -> GT_EQ and a new 
literal (+1L) -> "1970-01-02T00:00:00.00"?
   
   Why would the test then check for "1970-01-01" // I can't find the part I'm 
missing; Is there some part that adjusts the boundary after the fact? Or is the 
+1L in truncateLong not applied? This is the last part I need to port.
   
   Thanks for your help.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


liurenjie1024 commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2026487818

   Hi, @marvinlanhenke Thanks for your contribution, the overall design looks 
good to me! Please take your time and enjoy time with family


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1543956147


##
crates/iceberg/src/transform/mod.rs:
##
@@ -37,6 +36,15 @@ pub trait TransformFunction: Send {
 fn transform(, input: ArrayRef) -> Result;
 /// transform_literal will take an input literal and transform it into a 
new literal.
 fn transform_literal(, input: ) -> Result>;
+/// wrapper

Review Comment:
   ```suggestion
   /// A thin wrapper around `transform_literal` to return error even when 
it's `None`.
   ```



##
crates/iceberg/src/spec/transform.rs:
##
@@ -384,6 +394,14 @@ impl Transform {
 Ok(projection)
 }
 
+/// Check if `Transform` is applicable on datum's `PrimitiveType`
+fn can_transform(, datum: ) -> Result<()> {

Review Comment:
   Why not just return bool directly?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1543730206


##
crates/iceberg/src/spec/transform.rs:
##
@@ -398,6 +679,80 @@ mod tests {
 }
 }
 
+#[test]
+fn test_none_projection() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Void;
+let result_unary = transform.project(name.clone(), )?;
+assert!(result_unary.is_none());
+
+let transform = Transform::Year;
+let result_binary = transform.project(name.clone(), )?;
+assert!(result_binary.is_none());
+
+let transform = Transform::Month;
+let result_set = transform.project(name.clone(), )?;
+assert!(result_set.is_none());
+
+Ok(())
+}
+
+#[test]
+fn test_truncate_project() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Truncate(10);
+
+let result_unary = transform.project(name.clone(), 
)?.unwrap();
+let result_binary = transform.project(name.clone(), 
)?.unwrap();
+let result_set = transform.project(name.clone(), )?.unwrap();
+
+assert_eq!(format!("{}", result_unary), "projected_name IS NULL");
+assert_eq!(format!("{}", result_binary), "projected_name = 0");
+assert_eq!(format!("{}", result_set), "projected_name IN (0)");
+
+Ok(())
+}
+
+#[test]
+fn test_identity_project() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Identity;
+
+let result_unary = transform.project(name.clone(), 
)?.unwrap();
+let result_binary = transform.project(name.clone(), 
)?.unwrap();
+let result_set = transform.project(name.clone(), )?.unwrap();
+
+assert_eq!(format!("{}", result_unary), "projected_name IS NULL");
+assert_eq!(format!("{}", result_binary), "projected_name = 5");
+assert_eq!(format!("{}", result_set), "projected_name IN (5, 6)");
+
+Ok(())
+}
+
+#[test]
+fn test_bucket_project() -> Result<()> {

Review Comment:
   @Fokko @liurenjie1024 
   ...I ported the suite for `Bucket` transforms - with all tests passing. So 
bucket impl seems to produce the correct results (at least no obvious bugs 
present). 
   
   However, PTAL. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2025357488

   @liurenjie1024 
   I think I covered most of your suggestions. 
   PTAL if the overall design and implementation in general is fine?
   
   ### Unresolved Issues:
   - [ ] How to handle PrimitiveType::Time and PrimitiveType::Timestamp for 
`Dates` projection and boundary adjustment
   - [ ] Move Transform specific logic into /transform (e.g. logic to handle 
Transform::Bucket projection -> /transform/bucket.rs)
   - [ ] Port Java testsuite
   
   to 1) I think we need to take PrimitiveType::Time and convert it to 
PrimitiveType::Date and then adjust the boundary. If this is correct, I still 
havent found a way to convert between those types. Perhaps someone can point me 
in the right direction?
   
   ---
   *due to easter-holidays (and my two children) - I can probably do only some 
minimal work on this over the weekend
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542937628


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +270,174 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.

Review Comment:
   I would suggest to extend the doc. The example doesn't explain why `a !=5`, 
`bucket(a, 10) as ba` can't be projected to `ba != bucket_apply(5, 10)`, 
because it may violate the guarantee since there are other values which could 
be hashed to `bucket_apply(5, 10)`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542923641


##
crates/iceberg/src/spec/values.rs:
##
@@ -683,6 +684,47 @@ impl Datum {
 pub fn data_type() ->  {
 #type
 }
+
+/// Create a new `Datum` with adjusted boundary for projection
+pub fn boundary(, op: ) -> Result> {

Review Comment:
   yes - but still implemented on `Datum` or as a helper fn on `Transform` with 
param: `` ? it feels strange to impl `Datum` across two modules.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542923641


##
crates/iceberg/src/spec/values.rs:
##
@@ -683,6 +684,47 @@ impl Datum {
 pub fn data_type() ->  {
 #type
 }
+
+/// Create a new `Datum` with adjusted boundary for projection
+pub fn boundary(, op: ) -> Result> {

Review Comment:
   yes - but still implemented on `Datum` or as a helper fn on `Transform` with 
param: `` ? it feels strange to impl `Datum` across two modules.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542916299


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +270,174 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.

Review Comment:
   should I extend the docstring then? or doesn't the example specify this 
enough?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542902827


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +270,174 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match self {
+Transform::Bucket(_) => match predicate {
+BoundPredicate::Unary(expr) => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+BoundPredicate::Binary(expr) => {
+if expr.op() != PredicateOperator::Eq {
+return Ok(None);
+}
+
+let new_datum = 
func.transform_literal(expr.literal())?.ok_or_else(|| {
+Error::new(
+ErrorKind::DataInvalid,
+"Transformed datum must not be 'None'",
+)
+})?;
+
+Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+new_datum,
+)))
+}
+BoundPredicate::Set(expr) => {
+if expr.op() != PredicateOperator::In {
+return Ok(None);
+}
+
+Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+self.apply_transform_on_set(expr.literals(), )?,
+)))
+}
+_ => None,
+},
+Transform::Identity => match predicate {
+BoundPredicate::Unary(expr) => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+BoundPredicate::Binary(expr) => 
Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+))),
+BoundPredicate::Set(expr) => 
Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+))),
+_ => None,
+},
+Transform::Truncate(_) => match predicate {
+BoundPredicate::Unary(expr) => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+BoundPredicate::Binary(expr) => {
+let op = expr.op();
+let primitive = expr.literal().literal();
+
+match primitive {
+PrimitiveLiteral::Int(v) => {
+self.apply_transform_boundary(name, v, op, )?
+}
+PrimitiveLiteral::Long(v) => {
+self.apply_transform_boundary(name, v, op, )?
+}
+PrimitiveLiteral::Decimal(v) => {
+self.apply_transform_boundary(name, v, op, )?
+}
+PrimitiveLiteral::Fixed(v) => {
+self.apply_transform_boundary(name, v, op, )?
+}
+_ => return Ok(None),
+}
+}
+BoundPredicate::Set(expr) => {
+if expr.op() != PredicateOperator::In {
+return Ok(None);
+}
+
+Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+self.apply_transform_on_set(expr.literals(), )?,
+)))
+}
+_ => None,
+},
+_ => None,
+};
+
+Ok(projection)
+}
+
+/// Transform 

Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542870129


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +270,174 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;

Review Comment:
   You mean, I do the check right before I call `func.transform_literal(...)`. 
I have to do it in every match arm, since I need access to the `Datum` in order 
to get the `PrimitiveType` as input for `result_type()`? Do I understand you 
correctly?
   
   ...something like?
   ```Rust
   let input_type = expr.literal().data_type();
   self.result_type(::Primitive(input_type.clone()))?;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542878163


##
crates/iceberg/src/transform/mod.rs:
##
@@ -16,6 +16,7 @@
 // under the License.
 
 //! Transform function used to compute partition values.
+

Review Comment:
   The extra whitespace.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542877714


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +270,174 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;

Review Comment:
   Yes, exactly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542870129


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +270,174 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;

Review Comment:
   You mean, I do the check right before I call `func.transform_literal(...)`. 
I have to do it in every match arm, since I need access to the `Datum` in order 
to get the `PrimitiveType` as input for `result_type()`? Do I understand you 
correctly?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542837809


##
crates/iceberg/src/transform/mod.rs:
##
@@ -16,6 +16,7 @@
 // under the License.
 
 //! Transform function used to compute partition values.
+

Review Comment:
   you mean the comment? or the extra whitespace?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542833995


##
crates/iceberg/src/spec/values.rs:
##
@@ -683,6 +684,47 @@ impl Datum {
 pub fn data_type() ->  {
 #type
 }
+
+/// Create a new `Datum` with adjusted boundary for projection
+pub fn boundary(, op: ) -> Result> {

Review Comment:
   How about moving this to transform module since this is only useful for 
transform?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


liurenjie1024 commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2025035027

   > Basically, I got rid of the trait and implemented boundary on Datum itself 
- which makes more sense to me, now that I have a better overall picture with 
the dates transformations in mind.
   
   @marvinlanhenke Thanks for this, I also an boundary trait is a little over 
design, and implementing them directly on `Datum` seems better to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


liurenjie1024 commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542630137


##
crates/iceberg/src/spec/transform.rs:
##
@@ -261,6 +270,174 @@ impl Transform {
 _ => self == other,
 }
 }
+
+/// Projects a given predicate according to the transformation
+/// specified by the `Transform` instance.
+///
+/// This allows predicates to be effectively applied to data
+/// that has undergone transformation, enabling efficient querying
+/// and filtering based on the original, untransformed data.
+///
+/// # Example
+/// Suppose, we have row filter `a = 10`, and a partition spec
+/// `bucket(a, 37) as bs`, if one row matches `a = 10`, then its partition
+/// value should match `bucket(10, 37) as bs`, and we project `a = 10` to
+/// `bs = bucket(10, 37)`
+pub fn project(, name: String, predicate: ) -> 
Result> {
+let func = create_transform_function(self)?;
+
+let projection = match self {
+Transform::Bucket(_) => match predicate {
+BoundPredicate::Unary(expr) => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+BoundPredicate::Binary(expr) => {
+if expr.op() != PredicateOperator::Eq {
+return Ok(None);
+}
+
+let new_datum = 
func.transform_literal(expr.literal())?.ok_or_else(|| {
+Error::new(
+ErrorKind::DataInvalid,
+"Transformed datum must not be 'None'",
+)
+})?;
+
+Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+new_datum,
+)))
+}
+BoundPredicate::Set(expr) => {
+if expr.op() != PredicateOperator::In {
+return Ok(None);
+}
+
+Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+self.apply_transform_on_set(expr.literals(), )?,
+)))
+}
+_ => None,
+},
+Transform::Identity => match predicate {
+BoundPredicate::Unary(expr) => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+BoundPredicate::Binary(expr) => 
Some(Predicate::Binary(BinaryExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literal().to_owned(),
+))),
+BoundPredicate::Set(expr) => 
Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+expr.literals().to_owned(),
+))),
+_ => None,
+},
+Transform::Truncate(_) => match predicate {
+BoundPredicate::Unary(expr) => 
Some(Predicate::Unary(UnaryExpression::new(
+expr.op(),
+Reference::new(name),
+))),
+BoundPredicate::Binary(expr) => {
+let op = expr.op();
+let primitive = expr.literal().literal();
+
+match primitive {
+PrimitiveLiteral::Int(v) => {
+self.apply_transform_boundary(name, v, op, )?
+}
+PrimitiveLiteral::Long(v) => {
+self.apply_transform_boundary(name, v, op, )?
+}
+PrimitiveLiteral::Decimal(v) => {
+self.apply_transform_boundary(name, v, op, )?
+}
+PrimitiveLiteral::Fixed(v) => {
+self.apply_transform_boundary(name, v, op, )?
+}
+_ => return Ok(None),
+}
+}
+BoundPredicate::Set(expr) => {
+if expr.op() != PredicateOperator::In {
+return Ok(None);
+}
+
+Some(Predicate::Set(SetExpression::new(
+expr.op(),
+Reference::new(name),
+self.apply_transform_on_set(expr.literals(), )?,
+)))
+}
+_ => None,
+},
+_ => None,
+};
+
+Ok(projection)
+}
+
+/// Transform each 

Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2024950370

   @Fokko 
   Is it correct to assume, as far as I understood the Java implementation, 
that we support `dates` projection only for `year, month, and day`?
   
   @liurenjie1024
   I converted this to a draft - since I had to change the design. I'll push 
this draft, so you can verify. Basically, I got rid of the trait and 
implemented `boundary` on `Datum` itself - which makes more sense to me, now 
that I have a better overall picture with the dates transformations in mind.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2024663637

   > Thanks for picking this up @marvinlanhenke
   > 
   > The most important part of adding projection is making that they are 
absolutely correct. If rust would generate something different than Java of 
Python, leads to data incorrectness since it would not correctly evaluate the 
partition predicates.
   
   @Fokko 
   Thanks for the extensive review. Not only did I miss some test but the 
complete implementation for `Dates` : 
https://github.com/apache/iceberg/blob/d350c9b8c995a2953aa8b80a0a1fc7cadc4dd16a/api/src/main/java/org/apache/iceberg/transforms/Dates.java#L123
 // I was just looking for the transforms for Years, Months, etc. So thank you 
for hinting me at this - I'll implement this as well - and try to port the Java 
Testsuite.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


Fokko commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542486911


##
crates/iceberg/src/spec/transform.rs:
##
@@ -398,6 +679,80 @@ mod tests {
 }
 }
 
+#[test]
+fn test_none_projection() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Void;
+let result_unary = transform.project(name.clone(), )?;
+assert!(result_unary.is_none());
+
+let transform = Transform::Year;
+let result_binary = transform.project(name.clone(), )?;
+assert!(result_binary.is_none());
+
+let transform = Transform::Month;
+let result_set = transform.project(name.clone(), )?;
+assert!(result_set.is_none());
+
+Ok(())
+}
+
+#[test]
+fn test_truncate_project() -> Result<()> {

Review Comment:
   Could you port all the tests from Java: 
https://github.com/apache/iceberg/blob/main/api/src/test/java/org/apache/iceberg/transforms/TestTruncatesProjection.java



##
crates/iceberg/src/spec/transform.rs:
##
@@ -398,6 +679,80 @@ mod tests {
 }
 }
 
+#[test]
+fn test_none_projection() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Void;
+let result_unary = transform.project(name.clone(), )?;
+assert!(result_unary.is_none());
+
+let transform = Transform::Year;
+let result_binary = transform.project(name.clone(), )?;
+assert!(result_binary.is_none());
+
+let transform = Transform::Month;
+let result_set = transform.project(name.clone(), )?;
+assert!(result_set.is_none());
+
+Ok(())
+}
+
+#[test]
+fn test_truncate_project() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Truncate(10);
+
+let result_unary = transform.project(name.clone(), 
)?.unwrap();
+let result_binary = transform.project(name.clone(), 
)?.unwrap();
+let result_set = transform.project(name.clone(), )?.unwrap();
+
+assert_eq!(format!("{}", result_unary), "projected_name IS NULL");
+assert_eq!(format!("{}", result_binary), "projected_name = 0");
+assert_eq!(format!("{}", result_set), "projected_name IN (0)");
+
+Ok(())
+}
+
+#[test]
+fn test_identity_project() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Identity;
+
+let result_unary = transform.project(name.clone(), 
)?.unwrap();
+let result_binary = transform.project(name.clone(), 
)?.unwrap();
+let result_set = transform.project(name.clone(), )?.unwrap();
+
+assert_eq!(format!("{}", result_unary), "projected_name IS NULL");
+assert_eq!(format!("{}", result_binary), "projected_name = 5");
+assert_eq!(format!("{}", result_set), "projected_name IN (5, 6)");
+
+Ok(())
+}
+
+#[test]
+fn test_bucket_project() -> Result<()> {

Review Comment:
   Can you port the test suite from Java to Rust: 
https://github.com/apache/iceberg/blob/main/api/src/test/java/org/apache/iceberg/transforms/TestBucketingProjection.java



##
crates/iceberg/src/spec/transform.rs:
##
@@ -398,6 +679,80 @@ mod tests {
 }
 }
 
+#[test]
+fn test_none_projection() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Void;
+let result_unary = transform.project(name.clone(), )?;
+assert!(result_unary.is_none());
+
+let transform = Transform::Year;
+let result_binary = transform.project(name.clone(), )?;
+assert!(result_binary.is_none());
+
+let transform = Transform::Month;
+let result_set = transform.project(name.clone(), )?;
+assert!(result_set.is_none());
+
+Ok(())
+}
+
+#[test]
+fn test_truncate_project() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Truncate(10);
+
+let result_unary = transform.project(name.clone(), 
)?.unwrap();
+let result_binary = transform.project(name.clone(), 
)?.unwrap();
+let result_set = transform.project(name.clone(), )?.unwrap();
+
+assert_eq!(format!("{}", result_unary), "projected_name IS NULL");
+assert_eq!(format!("{}", result_binary), "projected_name = 0");
+assert_eq!(format!("{}", result_set), "projected_name IN (0)");
+
+Ok(())
+}
+
+#[test]
+fn test_identity_project() -> Result<()> {
+let name = "projected_name".to_string();
+  

Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2024617450

   > LGTM, with the caveat that I'm not an expert on this part of the spec, but 
the PR seems well-structured.
   > 
   > I wouldn't mind seeing more comments in the tests though to explain the 
"why" aspect of each test.
   
   Thanks for your feedback. If the others approve "the correctness" of the PR 
- I'll add those comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


marvinlanhenke commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542468150


##
crates/iceberg/src/spec/transform.rs:
##
@@ -398,6 +679,80 @@ mod tests {
 }
 }
 
+#[test]
+fn test_none_projection() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Void;
+let result_unary = transform.project(name.clone(), )?;
+assert!(result_unary.is_none());
+
+let transform = Transform::Year;
+let result_binary = transform.project(name.clone(), )?;
+assert!(result_binary.is_none());
+
+let transform = Transform::Month;
+let result_set = transform.project(name.clone(), )?;
+assert!(result_set.is_none());
+
+Ok(())
+}
+
+#[test]
+fn test_truncate_project() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Truncate(10);
+
+let result_unary = transform.project(name.clone(), 
)?.unwrap();
+let result_binary = transform.project(name.clone(), 
)?.unwrap();
+let result_set = transform.project(name.clone(), )?.unwrap();
+
+assert_eq!(format!("{}", result_unary), "projected_name IS NULL");
+assert_eq!(format!("{}", result_binary), "projected_name = 0");
+assert_eq!(format!("{}", result_set), "projected_name IN (0)");

Review Comment:
   Yes, I think your understanding is correct.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


sdd commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542459274


##
crates/iceberg/src/spec/transform.rs:
##
@@ -398,6 +679,80 @@ mod tests {
 }
 }
 
+#[test]
+fn test_none_projection() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Void;
+let result_unary = transform.project(name.clone(), )?;
+assert!(result_unary.is_none());
+
+let transform = Transform::Year;
+let result_binary = transform.project(name.clone(), )?;
+assert!(result_binary.is_none());
+
+let transform = Transform::Month;
+let result_set = transform.project(name.clone(), )?;
+assert!(result_set.is_none());
+
+Ok(())
+}
+
+#[test]
+fn test_truncate_project() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Truncate(10);
+
+let result_unary = transform.project(name.clone(), 
)?.unwrap();
+let result_binary = transform.project(name.clone(), 
)?.unwrap();
+let result_set = transform.project(name.clone(), )?.unwrap();
+
+assert_eq!(format!("{}", result_unary), "projected_name IS NULL");
+assert_eq!(format!("{}", result_binary), "projected_name = 0");
+assert_eq!(format!("{}", result_set), "projected_name IN (0)");

Review Comment:
   Just trying to follow what's happening here so that I understand. 
   
   So in the case of `result_binary`, the value of 5 gets truncated to 0, since 
the truncate transform when applied to an int effectively divides by 10, 
rounding down for all fractional values (ie, 1-9 get truncated to 0, 10-19 get 
truncated to 1, 20-29 get truncated to 2)?
   
   And in the case of `result_set`, since both 5 and 6 get truncated to the 
same value of 0, a set of (5, 6) becomes a set of (0)?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-28 Thread via GitHub


sdd commented on code in PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#discussion_r1542459274


##
crates/iceberg/src/spec/transform.rs:
##
@@ -398,6 +679,80 @@ mod tests {
 }
 }
 
+#[test]
+fn test_none_projection() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Void;
+let result_unary = transform.project(name.clone(), )?;
+assert!(result_unary.is_none());
+
+let transform = Transform::Year;
+let result_binary = transform.project(name.clone(), )?;
+assert!(result_binary.is_none());
+
+let transform = Transform::Month;
+let result_set = transform.project(name.clone(), )?;
+assert!(result_set.is_none());
+
+Ok(())
+}
+
+#[test]
+fn test_truncate_project() -> Result<()> {
+let name = "projected_name".to_string();
+let preds = TestPredicates::new();
+
+let transform = Transform::Truncate(10);
+
+let result_unary = transform.project(name.clone(), 
)?.unwrap();
+let result_binary = transform.project(name.clone(), 
)?.unwrap();
+let result_set = transform.project(name.clone(), )?.unwrap();
+
+assert_eq!(format!("{}", result_unary), "projected_name IS NULL");
+assert_eq!(format!("{}", result_binary), "projected_name = 0");
+assert_eq!(format!("{}", result_set), "projected_name IN (0)");

Review Comment:
   Just trying to follow what's happening here so that I understand. 
   
   So in the case of `result_binary`, the value of 5 gets truncated to 0, since 
the truncate transform when applied to an int effectively divides by 10, 
rounding down for all fractional values (ie, 1-9 get truncated to 0, 10-19 get 
truncated to 1, 20-29 get truncated to 2)?
   
   And in the case of the set, since both 5 and 6 get truncated to the same 
value of 0, a set of (5, 6) becomes a set of (0)?
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



Re: [PR] feat: Project transform [iceberg-rust]

2024-03-27 Thread via GitHub


marvinlanhenke commented on PR #309:
URL: https://github.com/apache/iceberg-rust/pull/309#issuecomment-2023840212

   @liurenjie1024 @ZENOTME @sdd PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[PR] feat: Project transform [iceberg-rust]

2024-03-27 Thread via GitHub


marvinlanhenke opened a new pull request, #309:
URL: https://github.com/apache/iceberg-rust/pull/309

   ### Which issue does this PR close?
   Closes #264 
   
   ### Rationale for this change
   The ability to project row_filter to `Transform` partition. 
   This will unblock the Manifest & PartitionEvaluator, which will enable the 
pruning of manifest files in `fn plan_files`.
   
   ### What changes are included in this PR?
   - add `fn project(...)` on `Transform`
   - mainly based on [Java 
](https://github.com/apache/iceberg/tree/d350c9b8c995a2953aa8b80a0a1fc7cadc4dd16a/api/src/main/java/org/apache/iceberg/transforms)
 implementation
   
   ### Are these changes tested?
   Yes. Unit tests are included.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org