ulysses-you opened a new pull request, #4349:
URL: https://github.com/apache/paimon/pull/4349

   
   ### Purpose
   
   
   Paimon uses `.toString` to generate partition values, which is inaccurate 
for some data types, such as date and binary. The Spark engine, for example, 
uses a `Cast` to convert a partition object to a string value. This PR 
therefore switches to using a cast to generate partition values.
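
To illustrate the difference, here is a minimal Java sketch. It is not Paimon's actual code: the class `PartitionValueDemo` and the helpers `legacyName` and `castName` are hypothetical, and it assumes that dates are held internally as days since the epoch and that casting binary to string means decoding the bytes as UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;

public class PartitionValueDemo {

    // Legacy-style rendering: relies on Object.toString().
    public static String legacyName(Object value) {
        return String.valueOf(value);
    }

    // Cast-style rendering (hypothetical helper): converts by declared type.
    public static String castName(Object value, String type) {
        switch (type) {
            case "binary":
                // Assumption: binary-to-string cast decodes the bytes as UTF-8.
                return new String((byte[]) value, StandardCharsets.UTF_8);
            case "date":
                // Assumption: dates are stored as days since 1970-01-01.
                return LocalDate.ofEpochDay((Integer) value).toString();
            default:
                return String.valueOf(value);
        }
    }

    public static void main(String[] args) {
        byte[] day = "2021".getBytes(StandardCharsets.UTF_8);
        // Identity string such as [B@4a045a11; the hash differs per object.
        System.out.println(legacyName(day));
        System.out.println(castName(day, "binary")); // 2021

        int epochDays = (int) LocalDate.of(2021, 1, 1).toEpochDay();
        System.out.println(legacyName(epochDays));       // 18628
        System.out.println(castName(epochDays, "date")); // 2021-01-01
    }
}
```

For a byte array, the `toString` result depends on the object's identity rather than its contents, so two arrays holding the same bytes generally produce different names.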
   
   This PR also adds a new config option, `partition.legacy-name`, to switch 
back to the previous `toString` behavior.
   
   For example, using a binary-type partition column causes a failure:
   ```
   CREATE TABLE pt (
       id BIGINT,
       c1 STRING
   ) using paimon
   PARTITIONED BY (day binary);
   
   insert into table pt values(1, 'a', cast('2021' as binary));
   select * from pt;
   ```
   
   ```
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 
1) (192.168.0.102 executor driver): java.io.FileNotFoundException: File 
'warehouse/default.db/pt/day=%5BB@4a045a11/bucket-0/data-91c064a3-a0a1-4042-9d5a-cc82a23af7ff-0.parquet'
 not found, Possible causes: 1.snapshot expires too fast, you can configure 
'snapshot.time-retained' option with a larger value. 2.consumption is too slow, 
you can improve the performance of consumption (For example, increasing 
parallelism).
   ```
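
The partition directory in the error path, `day=%5BB@4a045a11`, is worth decoding. The following minimal Java sketch (the class name `PathDecodeDemo` is illustrative only) percent-decodes it. The result has the shape of Java's default `toString` output for a `byte[]` rather than the inserted value `2021`, which is consistent with the `.toString` behavior described under Purpose.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class PathDecodeDemo {

    // Percent-decodes a single path segment (requires Java 10+ for this overload).
    public static String decode(String segment) {
        return URLDecoder.decode(segment, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Directory name copied from the error message above.
        System.out.println(decode("day=%5BB@4a045a11")); // day=[B@4a045a11
    }
}
```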
   
   
   
   ### Tests
   
   Added tests.
   
   ### API and Format
   
   No, this change does not affect the API or storage format.
   
   ### Documentation
   
   Added docs.

