wgtmac commented on code in PR #653:
URL: https://github.com/apache/iceberg-cpp/pull/653#discussion_r3285555628
##########
src/iceberg/type_fwd.h:
##########
@@ -77,6 +80,8 @@ class TimeType;
class TimestampBase;
class TimestampType;
class TimestampTzType;
+class TimestampNsType;
Review Comment:
ditto
##########
src/iceberg/expression/literal.cc:
##########
@@ -395,8 +438,11 @@ bool Comparable(TypeId lhs, TypeId rhs) {
case TypeId::kLong:
case TypeId::kTimestamp:
case TypeId::kTimestampTz:
+ case TypeId::kTimestampNs:
+ case TypeId::kTimestampTzNs:
return rhs == TypeId::kLong || rhs == TypeId::kTimestamp ||
- rhs == TypeId::kTimestampTz;
+ rhs == TypeId::kTimestampTz || rhs == TypeId::kTimestampNs ||
Review Comment:
This looks incorrect to me. Should we be strict that only identical types
are allowed to compare? It looks also dangerous to compare a timestamp value
against a long value. Should we remove that support as well?
##########
src/iceberg/type_fwd.h:
##########
@@ -46,6 +46,8 @@ enum class TypeId {
kTime,
kTimestamp,
kTimestampTz,
+ kTimestampNs,
+ kTimestampTzNs,
Review Comment:
Should we sort them as
```
kTimestamp,
kTimestampNs,
kTimestampTz,
kTimestampTzNs,
```
##########
src/iceberg/util/bucket_util.cc:
##########
@@ -63,6 +63,16 @@ int32_t HashLiteral<TypeId::kTimestampTz>(const Literal&
literal) {
return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
}
+template <>
+int32_t HashLiteral<TypeId::kTimestampNs>(const Literal& literal) {
+ return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
+}
+
+template <>
+int32_t HashLiteral<TypeId::kTimestampTzNs>(const Literal& literal) {
+ return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
+}
Review Comment:
Perhaps it is worth adding a dedicated utility class/file for temporal types
just like Java for reuse.
##########
src/iceberg/util/bucket_util.cc:
##########
@@ -63,6 +63,16 @@ int32_t HashLiteral<TypeId::kTimestampTz>(const Literal&
literal) {
return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
}
+template <>
+int32_t HashLiteral<TypeId::kTimestampNs>(const Literal& literal) {
+ return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
+}
+
+template <>
+int32_t HashLiteral<TypeId::kTimestampTzNs>(const Literal& literal) {
+ return BucketUtils::HashLong(std::get<int64_t>(literal.value()));
+}
Review Comment:
I think the original review comment had hallucination. Sorry about that.
The actual workflow is as below:
```
private static class BucketTimestampNano extends Bucket<Long>
implements SerializableFunction<Long, Integer> {
private BucketTimestampNano(int numBuckets) {
super(numBuckets);
}
@Override
protected int hash(Long nanos) {
return BucketUtil.hash(DateTimeUtil.nanosToMicros(nanos));
}
}
```
We can see that it also calls floorDiv inside:
```
public static long nanosToMicros(long nanos) {
return Math.floorDiv(nanos, NANOS_PER_MICRO);
}
```
So my original (AI) suggestion was wrong. Please follow the same approach to
use floorDiv here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]