PavelkoSemen opened a new issue, #15908:
URL: https://github.com/apache/iceberg/issues/15908
### Apache Iceberg version
1.10.1 (latest release)
### Query engine
Spark
### Please describe the bug 🐞
### Problem
When creating Iceberg tables through Spark with an S3 location, the table
location contains a double slash (`//`) in the path. This happens regardless
of whether the table is partitioned. The double slash in the path causes the
`OPTIMIZE` operation to behave incorrectly and sometimes delete files, leading
to table corruption.
### Example
```sql
CREATE TABLE test.test.test (
    test_rk integer
)
WITH (
    compression_codec = 'ZSTD',
    format = 'PARQUET',
    format_version = 2,
    location = 's3a://test//test_8262bea6c787' -- ⚠️ Double slash after 'test/'
)
```
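One plausible origin for such a path (an illustrative sketch, not the actual Iceberg or Spark code) is naive string concatenation of a warehouse root that already ends in `/` with a table directory name:

```python
def make_table_location(warehouse: str, table_dir: str) -> str:
    # Naive join: unconditionally inserting '/' produces '//'
    # when the warehouse URI already ends with a slash.
    return warehouse + "/" + table_dir

# Hypothetical warehouse root that already ends in '/':
print(make_table_location("s3a://test/", "test_8262bea6c787"))
# s3a://test//test_8262bea6c787
```

The `make_table_location` helper is hypothetical; it only demonstrates how a trailing slash in a configured warehouse path could surface as `//` in the final table location.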
### Example 2
```sql
CREATE TABLE test.test.test (
    test_rk decimal(21, 0),
    test real,
    test_l varchar,
    test_stat real,
    test_id integer
)
WITH (
    compression_codec = 'ZSTD',
    format = 'PARQUET',
    format_version = 2,
    location = 's3a://test//test',
    partitioning = ARRAY['test_id']
)
```
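Until the root cause is fixed, one defensive workaround (a hedged sketch, not an Iceberg API) is to normalize the location string before creating the table, collapsing repeated slashes in the path while leaving the `scheme://` prefix intact:

```python
import re

def normalize_location(uri: str) -> str:
    # Split off the scheme so the '//' in 's3a://' is preserved,
    # then squeeze any run of slashes in the remainder down to one.
    scheme, sep, rest = uri.partition("://")
    if not sep:
        return re.sub(r"/{2,}", "/", uri)
    return scheme + sep + re.sub(r"/{2,}", "/", rest)

print(normalize_location("s3a://test//test"))  # s3a://test/test
```

Normalizing up front avoids mixing `s3a://test//test` and `s3a://test/test` paths for the same table, which S3A may treat as distinct keys.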
Component | Version
-- | --
iceberg-spark-runtime | 3.5_2.12-1.10.1
iceberg-aws-bundle | 1.10.1
Spark | 3.5.6
Filesystem | S3A
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [x] I cannot contribute a fix for this bug at this time
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]