liurenjie1024 commented on code in PR #1876:
URL: https://github.com/apache/iceberg-rust/pull/1876#discussion_r2692910520
##########
crates/iceberg/src/spec/table_metadata.rs:
##########
@@ -461,9 +463,58 @@ impl TableMetadata {
file_io: &FileIO,
metadata_location: impl AsRef<str>,
) -> Result<()> {
+ let json_data = serde_json::to_vec(self)?;
+
+ // Check if compression is enabled via table properties
+ let codec = self
+ .properties
+ .get(TableProperties::PROPERTY_METADATA_COMPRESSION_CODEC)
Review Comment:
Ideally we should use `TableProperties`, as described here:
https://github.com/apache/iceberg-rust/issues/2028
##########
crates/iceberg/src/spec/table_metadata.rs:
##########
@@ -461,9 +463,58 @@ impl TableMetadata {
file_io: &FileIO,
metadata_location: impl AsRef<str>,
) -> Result<()> {
+ let json_data = serde_json::to_vec(self)?;
+
+ // Check if compression is enabled via table properties
+ let codec = self
+ .properties
+ .get(TableProperties::PROPERTY_METADATA_COMPRESSION_CODEC)
+ .map(|s| s.as_str());
+
+ // Use case-insensitive comparison to match Java implementation
+ let (data_to_write, actual_location) = match codec.map(|s|
s.to_lowercase()).as_deref() {
+ Some("gzip") => {
Review Comment:
I don't think we should use such magic constants here, instead we should
create an enum CompressionCodec for it.
##########
crates/iceberg/src/catalog/metadata_location.rs:
##########
@@ -15,38 +15,95 @@
// specific language governing permissions and limitations
// under the License.
+use std::collections::HashMap;
use std::fmt::Display;
use std::str::FromStr;
use uuid::Uuid;
+use crate::spec::{TableMetadata, TableProperties};
use crate::{Error, ErrorKind, Result};
+/// The file extension suffix for gzip compressed metadata files
+const GZIP_SUFFIX: &str = ".gz";
+
/// Helper for parsing a location of the format:
`<location>/metadata/<version>-<uuid>.metadata.json`
+/// or with compression:
`<location>/metadata/<version>-<uuid>.gz.metadata.json`
#[derive(Clone, Debug, PartialEq)]
pub struct MetadataLocation {
table_location: String,
version: i32,
id: Uuid,
+ compression_suffix: Option<String>,
}
impl MetadataLocation {
+ /// Determines the compression suffix from table properties.
+ fn compression_suffix_from_properties(properties: &HashMap<String,
String>) -> Option<String> {
Review Comment:
This method should be part of the new CompressionCodec.
##########
crates/iceberg/src/spec/table_properties.rs:
##########
@@ -49,6 +56,8 @@ pub struct TableProperties {
pub write_format_default: String,
/// The target file size for files.
pub write_target_file_size_bytes: usize,
+ /// Compression codec for metadata files (JSON), None means no compression
+ pub metadata_compression_codec: Option<String>,
Review Comment:
See https://github.com/apache/iceberg-rust/pull/1876/changes#r2692908939
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]