Re: [PR] [FLINK-35097][table] Fix 'raw' format deserialization [flink]

2024-04-22 Thread via GitHub


twalthr merged PR #24661:
URL: https://github.com/apache/flink/pull/24661


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35097][table] Fix 'raw' format deserialization [flink]

2024-04-18 Thread via GitHub


twalthr commented on code in PR #24661:
URL: https://github.com/apache/flink/pull/24661#discussion_r1570166387


##
flink-table/flink-table-runtime/src/test/java/org/apache/flink/table/formats/raw/RawFormatSerDeSchemaTest.java:
##
@@ -197,12 +247,12 @@ public static TestSpec type(DataType fieldType) {
 return new TestSpec(fieldType);
 }
 
-public TestSpec value(Object value) {
-this.value = value;
+public TestSpec values(Object[] values) {

Review Comment:
   Make this a varargs parameter to avoid the need for `new X[]{}` in the 
test specs. This will improve code readability.



##
flink-table/flink-table-runtime/src/test/java/org/apache/flink/table/formats/raw/RawFormatSerDeSchemaTest.java:
##
@@ -197,12 +247,12 @@ public static TestSpec type(DataType fieldType) {
 return new TestSpec(fieldType);
 }
 
-public TestSpec value(Object value) {
-this.value = value;
+public TestSpec values(Object[] values) {
+this.values = values;
 return this;
 }
 
-public TestSpec binary(byte[] bytes) {
+public TestSpec binary(byte[][] bytes) {

Review Comment:
   Same as above: make this a varargs parameter.
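
For illustration, the two varargs suggestions above could look roughly like the sketch below. `VarargsTestSpec` and its accessor methods are hypothetical, simplified stand-ins and not the actual `TestSpec` in `RawFormatSerDeSchemaTest`; only the `values(...)` and `binary(...)` method names come from the diff.

```java
/** Hypothetical, simplified stand-in for the TestSpec builder discussed in the review. */
final class VarargsTestSpec {
    private Object[] values = new Object[0];
    private byte[][] binaries = new byte[0][];

    // Varargs: callers list the elements directly and the compiler builds the array.
    VarargsTestSpec values(Object... values) {
        this.values = values;
        return this;
    }

    // Each argument is one serialized record, so the varargs element type is byte[].
    VarargsTestSpec binary(byte[]... bytes) {
        this.binaries = bytes;
        return this;
    }

    // Small accessors (assumed, for demonstration only).
    int valueCount() { return values.length; }
    int binaryCount() { return binaries.length; }
    Object valueAt(int i) { return values[i]; }
    byte[] binaryAt(int i) { return binaries[i]; }

    public static void main(String[] args) {
        VarargsTestSpec spec = new VarargsTestSpec()
                .values("a", "b")                            // no new Object[] {...}
                .binary(new byte[] {1, 2}, new byte[] {3});  // no new byte[][] {...}
        System.out.println(spec.valueCount() + " values, " + spec.binaryCount() + " binaries");
    }
}
```

Call sites that already pass an explicit array would still compile, because Java accepts an array where a varargs parameter is declared.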






Re: [PR] [FLINK-35097][table] Fix 'raw' format deserialization [flink]

2024-04-14 Thread via GitHub


kumar-mallikarjuna commented on PR #24661:
URL: https://github.com/apache/flink/pull/24661#issuecomment-2054358863

   @flinkbot run azure





Re: [PR] [FLINK-35097][table] Fix 'raw' format deserialization [flink]

2024-04-14 Thread via GitHub


flinkbot commented on PR #24661:
URL: https://github.com/apache/flink/pull/24661#issuecomment-2054068968

   
   ## CI report:
   
   * b3c8502a9ca370eab1c023588571f6f3fc18addd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[PR] [FLINK-35097][table] Fix 'raw' format deserialization [flink]

2024-04-14 Thread via GitHub


kumar-mallikarjuna opened a new pull request, #24661:
URL: https://github.com/apache/flink/pull/24661

   
   
   ## What is the purpose of the change
   
   Fixes broken deserialization of the `raw` format.
   
   ## Brief change log
   
   The `deserialize()` method in `RawFormatDeserializationSchema` returns the 
existing `reuse` attribute. That attribute is overwritten on every call, and 
because it is returned by reference, all previously returned records end up 
referring to a single value, the most recently deserialized one. This change 
removes the attribute and instead constructs the returned value inside 
`deserialize()` itself, thus avoiding overwrites.
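
The aliasing problem described above can be reproduced in isolation. The following is a minimal sketch under stated assumptions: `MutableRow`, `ReusingDeserializer`, and `FreshDeserializer` are hypothetical names invented for illustration and are not Flink classes; they only mimic the "return a shared mutable field" pattern versus "build a new value per call".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mutable record; a stand-in, not Flink's row type. */
final class MutableRow {
    private Object field;
    void setField(Object value) { this.field = value; }
    Object getField() { return field; }
}

final class ReuseAliasingDemo {

    /** Buggy pattern: every call mutates and returns the same shared instance. */
    static final class ReusingDeserializer {
        private final MutableRow reuse = new MutableRow();
        MutableRow deserialize(String message) {
            reuse.setField(message);
            return reuse;
        }
    }

    /** Fixed pattern: every call constructs and returns a fresh instance. */
    static final class FreshDeserializer {
        MutableRow deserialize(String message) {
            MutableRow row = new MutableRow();
            row.setField(message);
            return row;
        }
    }

    /** Buffers all returned rows first, then reads them, as a downstream consumer might. */
    static List<Object> collectReusing(String... lines) {
        ReusingDeserializer deserializer = new ReusingDeserializer();
        List<MutableRow> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(deserializer.deserialize(line));
        }
        List<Object> out = new ArrayList<>();
        for (MutableRow row : rows) {
            out.add(row.getField());
        }
        return out;
    }

    static List<Object> collectFresh(String... lines) {
        FreshDeserializer deserializer = new FreshDeserializer();
        List<MutableRow> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(deserializer.deserialize(line));
        }
        List<Object> out = new ArrayList<>();
        for (MutableRow row : rows) {
            out.add(row.getField());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectReusing("line 1", "line 2", "line 3")); // [line 3, line 3, line 3]
        System.out.println(collectFresh("line 1", "line 2", "line 3"));   // [line 1, line 2, line 3]
    }
}
```

The sketch only shows the symptom when returned records are held past the next `deserialize()` call; whether a given caller buffers records that way is an assumption here.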
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   **Existing Behavior**
   1. Create a test file with the content:
   ```
   line 1
   line 2
   line 3
   ```
   2. Run
   ```SQL
   CREATE TABLE MyRawTable (
   `doc` STRING
   ) WITH (
   'path' = 'file:///path/to/data',
   'format' = 'raw',
   'connector' = 'filesystem'
   );
   ```
   3. Check inserted values
   ```SQL
   SELECT * FROM MyRawTable;
   ```
   ```
   doc
   -
   line 3
   line 3
   line 3
   ```
   
   **After the Change**
   ```SQL
   SELECT * FROM MyRawTable;
   ```
   ```
   doc
   -
   line 1
   line 2
   line 3
   ```
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (yes)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   

