Re: [PR] URL-encode partition field names in file locations [iceberg-python]

via GitHub Wed, 08 Jan 2025 09:25:06 -0800


smaheshwar-pltr commented on code in PR #1457:
URL: https://github.com/apache/iceberg-python/pull/1457#discussion_r1907541402



##########
tests/integration/test_partitioning_key.py:
##########
@@ -721,6 +753,27 @@
             VALUES
             (CAST('2023-01-01 11:55:59.999999' AS TIMESTAMP), CAST('2023-01-01' AS DATE), 'some data');
             """,
+            None,
+        ),
+        # Test that special characters are URL-encoded
+        (
+            [PartitionField(source_id=15, field_id=1001, transform=IdentityTransform(), name="special#string+field")],
+            ["special string"],
+            Record(**{"special#string+field": "special string"}),  # type: ignore
+            "special%23string%2Bfield=special+string",
+            f"""CREATE TABLE {identifier} (
+                `special#string+field` string
+            )
+            USING iceberg
+            PARTITIONED BY (
+                identity(`special#string+field`)
+            )
+            """,
+            f"""INSERT INTO {identifier}
+            VALUES
+            ('special string')
+            """,
+            lambda name: name.replace("#", "_x23").replace("+", "_x2B"),

Review Comment:
   I was conflicted about this:
   - this sanitisation felt specific to this one test case, so a parameter seemed best
   - alternatively, since the schema containing these two special characters is defined at the top of the file (so every case of this test uses that schema), it would be reasonable to apply the same sanitisation to all of them. Defining it as a top-level function beside the schema definition might highlight this best
   
   WDYT?
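
A minimal sketch of the second option, for comparison with the per-case lambda in the diff. The helper name `sanitize_field_name` is hypothetical and not part of the PR; its body simply mirrors the lambda above. The last lines assume, as an illustration only, that the standard library's `urllib.parse.quote_plus` reproduces the expected partition path from the test case; the PR's actual encoding call may differ.

```python
from urllib.parse import quote_plus


def sanitize_field_name(name: str) -> str:
    """Hypothetical top-level helper mirroring the lambda in the diff.

    Replaces the two special characters used in the test schema with
    their `_xNN` hex form.
    """
    return name.replace("#", "_x23").replace("+", "_x2B")


# Same result as the lambda passed as the final test parameter.
sanitized = sanitize_field_name("special#string+field")

# Illustration (assumption): URL-encoding the field name and value with
# quote_plus yields the path string expected by the test case.
expected_path = f"{quote_plus('special#string+field')}={quote_plus('special string')}"
```

With a helper like this defined beside the schema, the test cases would not need the extra parameter; the trade-off is that the sanitisation then applies to every case, including those whose field names contain no special characters (where it is a no-op).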



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

