pacman82 opened a new issue, #3017: URL: https://github.com/apache/arrow-rs/issues/3017
**Describe the bug**
This regards the output written by the `parquet` crate. Declaring a column to contain a timestamp of microseconds using a `LogicalType` causes the written file to **not** have a converted type, at least according to `parquet-tools`.

**To Reproduce**

1. Write a file `tmp.par` with a single column of type Timestamp of Microseconds, using a logical type:

```rust
use std::sync::Arc;

use parquet::{
    basic::{LogicalType, Repetition, Type},
    data_type::Int64Type,
    file::{properties::WriterProperties, writer::SerializedFileWriter},
    format::{MicroSeconds, TimeUnit},
    schema::types,
};

fn main() {
    let mut data = Vec::with_capacity(1024);

    // Declare a single required INT64 column annotated with the
    // Timestamp(MICROS) logical type.
    let logical_type = LogicalType::Timestamp {
        is_adjusted_to_u_t_c: false,
        unit: TimeUnit::MICROS(MicroSeconds {}),
    };
    let field = Arc::new(
        types::Type::primitive_type_builder("col1", Type::INT64)
            .with_logical_type(Some(logical_type))
            .with_repetition(Repetition::REQUIRED)
            .build()
            .unwrap(),
    );
    let schema = Arc::new(
        types::Type::group_type_builder("schema")
            .with_fields(&mut vec![field])
            .build()
            .unwrap(),
    );

    // Write data
    let props = Arc::new(WriterProperties::builder().build());
    let mut writer = SerializedFileWriter::new(&mut data, schema, props).unwrap();
    let mut row_group_writer = writer.next_row_group().unwrap();
    let mut column_writer = row_group_writer.next_column().unwrap().unwrap();
    column_writer
        .typed::<Int64Type>()
        .write_batch(&[1, 2, 3, 4], None, None)
        .unwrap();
    column_writer.close().unwrap();
    row_group_writer.close().unwrap();
    writer.close().unwrap();

    // Write the file to disk for inspection with parquet-tools
    std::fs::write("tmp.par", data).unwrap();
}
```

2. Install `parquet-tools` in a virtual environment and inspect the file:

```shell
pip install parquet-tools==0.2.11
parquet-tools inspect tmp.par
```

The resulting output indicates no converted type:

```
############ file meta data ############
created_by: parquet-rs version 26.0.0
num_columns: 1
num_rows: 4
num_row_groups: 1
format_version: 1.0
serialized_size: 143


############ Columns ############
col1

############ Column(col1) ############
name: col1
path: col1
max_definition_level: 0
max_repetition_level: 0
physical_type: INT64
logical_type: Timestamp(isAdjustedToUTC=false, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false)
converted_type (legacy): NONE
compression: UNCOMPRESSED (space_saved: 0%)
```

**Expected behavior**
I would have expected the converted type to show up in the metadata emitted by `parquet-tools`.

**Additional context**
Triggered by upstream `odbc2parquet` issue <https://github.com/pacman82/odbc2parquet/issues/284>. Azure does not seem to be able to handle the output since the migration to `LogicalType`. I previously misdiagnosed this as the converted type not being set correctly in the schema information; that, however, does happen. See: #2984.

Thanks, any help is appreciated!
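For completeness, here is a minimal sketch for inspecting the written file from Rust itself, assuming `tmp.par` was produced by the program above. One caveat: as far as I can tell, the crate's schema builder re-derives the legacy converted type from the logical type while parsing the footer, so this reader can report `TIMESTAMP_MICROS` even when the serialized Thrift metadata does not actually carry the field (consistent with #2984), which is why an external tool such as `parquet-tools` is needed to observe the bug.

```rust
use std::fs::File;

use parquet::file::reader::{FileReader, SerializedFileReader};

fn main() {
    let file = File::open("tmp.par").unwrap();
    let reader = SerializedFileReader::new(file).unwrap();

    // Schema as reconstructed by the Rust reader. Caveat: the type builder
    // derives the converted type from the logical type on parse, so this can
    // print TIMESTAMP_MICROS even if the footer lacks the legacy field.
    let schema_descr = reader.metadata().file_metadata().schema_descr();
    let column = schema_descr.column(0);
    println!("logical type: {:?}", column.logical_type());
    println!("converted type: {:?}", column.converted_type());
}
```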