[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=411482&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-411482 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 27/Mar/20 23:09 Start Date: 27/Mar/20 23:09 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 411482) Time Spent: 5h 10m (was: 5h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 5h 10m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=411390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-411390 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 27/Mar/20 21:20 Start Date: 27/Mar/20 21:20 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-605320135 run sql postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 411390) Time Spent: 5h (was: 4h 50m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 5h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=411382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-411382 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 27/Mar/20 21:09 Start Date: 27/Mar/20 21:09 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-605316228 run sql postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 411382) Time Spent: 4h 50m (was: 4h 40m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 4h 50m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=411257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-411257 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 27/Mar/20 18:28 Start Date: 27/Mar/20 18:28 Worklog Time Spent: 10m Work Description: alexvanboxel commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-605200678 > @alexvanboxel I would rather not in this PR, because the RowUtils API wasn't designed for public usage. If we were to make it public, I would prefer to spend a lot more time on the API design, and I would also want to understand the use cases a bit better. Understand LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 411257) Time Spent: 4h 40m (was: 4.5h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 4h 40m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=411235&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-411235 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 27/Mar/20 17:59 Start Date: 27/Mar/20 17:59 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-605167235 @alexvanboxel I would rather not in this PR, because the RowUtils API wasn't designed for public usage. If we were to make it public, I would prefer to spend a lot more time on the API design, and I would also want to understand the use cases a bit better. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 411235) Time Spent: 4.5h (was: 4h 20m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 4.5h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=410711&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410711 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 26/Mar/20 23:30 Start Date: 26/Mar/20 23:30 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-604738743 run sql postcommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 410711) Time Spent: 4h 20m (was: 4h 10m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 4h 20m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=410683&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410683 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 26/Mar/20 22:47 Start Date: 26/Mar/20 22:47 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-604726387 @alexvanboxel rebased and fixed bugs. Previously I was blocked on getting logical types to work, but now that we natively store logical types in Row, it's become much easier. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 410683) Time Spent: 4h 10m (was: 4h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=410079&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-410079 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 26/Mar/20 05:21 Start Date: 26/Mar/20 05:21 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-604235650 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 410079) Time Spent: 4h (was: 3h 50m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=397907&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-397907 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 04/Mar/20 21:40 Start Date: 04/Mar/20 21:40 Worklog Time Spent: 10m Work Description: reuvenlax commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-594864868 @alexvanboxel there are still some bugs in this PR related to logical types, which is why I haven't pushed it in yet. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 397907) Time Spent: 3h 50m (was: 3h 40m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=397775&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-397775 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 04/Mar/20 18:59 Start Date: 04/Mar/20 18:59 Worklog Time Spent: 10m Work Description: alexvanboxel commented on issue #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#issuecomment-594753897 Maybe rebase, then we could get this in. This is a useful addition. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 397775) Time Spent: 3h 40m (was: 3.5h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=394021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-394021 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 27/Feb/20 08:14 Start Date: 27/Feb/20 08:14 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r384969152 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: I propose to take this discussion out of this PR, as I really like this already. For me it's already a LGTM. (Best to wait till the 2.20 is cut though). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 394021) Time Spent: 3.5h (was: 3h 20m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=391724&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391724 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 24/Feb/20 14:46 Start Date: 24/Feb/20 14:46 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r383306536 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: Aside from the mutability/. WDYT of the proposed `addValue`, `getValue` and `getValues` vs `add`, `get`, `detach` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 391724) Time Spent: 3h 20m (was: 3h 10m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=391466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-391466 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 24/Feb/20 01:49 Start Date: 24/Feb/20 01:49 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r383062953 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: The fact that getValues returns the underlying objects I think is in line with many other things. e.g. ImmutableList in Java still returns the underlying objects, so you can mutate them. The onus is on the user not to do that. In particular, the elements of a PCollection are not allowed to be mutated in a ParDo (and the main point of the Row object is to be used in ParDos). The DirectRunner has a mutability checker that will fail your test if you mutate the input elements. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 391466) Time Spent: 3h 10m (was: 3h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=390462&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390462 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 21/Feb/20 06:56 Start Date: 21/Feb/20 06:56 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r382422957 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: Elaborating on the above comment: `getValues` is currently not safe (it makes the Row mutable), if the meaning changes that it returns the logical type the implementation will need to make a copy of the content as it needs to translate the basetype to the logicaltype. As it's a copy the Row is immutable again. See this for test: ``` @Test public void testImmutability() { Schema schema = Schema.builder().addInt32Field("a").addInt32Field("b").build(); Row row1 = Row.withSchema(schema).addValue(1).addValue(2).build(); List values = row1.getValues(); values.set(1, 3); assertEquals(2,(int)row1.getValue(1)); } ``` Then the question is left for `attach` or `detach`: if you make values an immutable array it can be returned with `detach`, as it's immutable that's safe. The `attach` needs then make a copy of the array to an immutable array (or check if it's immutable) and `build` needs to make the immutable array as well. - This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390462) Time Spent: 3h (was: 2h 50m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=390247&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390247 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 20/Feb/20 20:32 Start Date: 20/Feb/20 20:32 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r382241904 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: What about this, - `addValue`, `getValue` and `getValues` return the input type(s) (so the logical type) - `add`, `get`, `detach` returns the base types (is also inline with the `attach`) That means a behaviour change only for `getValue` and `getValues` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390247) Time Spent: 2h 50m (was: 2h 40m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=390216&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390216 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 20/Feb/20 20:06 Start Date: 20/Feb/20 20:06 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r382229718 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -539,37 +568,118 @@ public String toString() { } /** - * Creates a record builder with specified {@link #getSchema()}. {@link Builder#build()} will - * throw an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not - * match the number of fields specified. + * Creates a row builder with specified {@link #getSchema()}. {@link Builder#build()} will throw + * an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not match + * the number of fields specified. If any of the arguments don't match the expected types for the + * schema fields, {@link Builder#build()} will throw a {@link ClassCastException}. */ public static Builder withSchema(Schema schema) { return new Builder(schema); } + /** + * Creates a row builder based on the specified row. Field values in the new row can be explicitly + * set using {@link ModifyingBuilder#withFieldValue}. Any values not so overridden will be the + * same as the values in the original row. + */ + public static ModifyingBuilder fromRow(Row row) { +return new ModifyingBuilder(row); + } + + /** Builder for {@link Row} that bases a row on another row. */ + public static class ModifyingBuilder { +private final Row sourceRow; +private final Map fieldValues = Maps.newHashMap(); + +private ModifyingBuilder(Row sourceRow) { + this.sourceRow = sourceRow; +} + +public Schema getSchema() { + return sourceRow.getSchema(); +} + +public ModifyingBuilder withFieldValue(String fieldName, Object value) { Review comment: added This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390216) Time Spent: 2h 40m (was: 2.5h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=390215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390215 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 20/Feb/20 20:06 Start Date: 20/Feb/20 20:06 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r382229706 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -539,37 +568,118 @@ public String toString() { } /** - * Creates a record builder with specified {@link #getSchema()}. {@link Builder#build()} will - * throw an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not - * match the number of fields specified. + * Creates a row builder with specified {@link #getSchema()}. {@link Builder#build()} will throw + * an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not match + * the number of fields specified. If any of the arguments don't match the expected types for the + * schema fields, {@link Builder#build()} will throw a {@link ClassCastException}. */ public static Builder withSchema(Schema schema) { return new Builder(schema); } + /** + * Creates a row builder based on the specified row. Field values in the new row can be explicitly + * set using {@link ModifyingBuilder#withFieldValue}. Any values not so overridden will be the + * same as the values in the original row. + */ + public static ModifyingBuilder fromRow(Row row) { +return new ModifyingBuilder(row); + } + + /** Builder for {@link Row} that bases a row on another row. */ + public static class ModifyingBuilder { +private final Row sourceRow; +private final Map fieldValues = Maps.newHashMap(); + +private ModifyingBuilder(Row sourceRow) { + this.sourceRow = sourceRow; +} + +public Schema getSchema() { + return sourceRow.getSchema(); +} + +public ModifyingBuilder withFieldValue(String fieldName, Object value) { + FieldAccessDescriptor fieldAccessDescriptor = + FieldAccessDescriptor.withFieldNames(fieldName).resolve(getSchema()); + checkArgument(fieldAccessDescriptor.referencesSingleField(), ""); + fieldValues.put(fieldAccessDescriptor, new FieldOverride(value)); + return this; +} + +public ModifyingBuilder withFieldValues(Map values) { + fieldValues.putAll( + values.entrySet().stream() + .collect( + Collectors.toMap( + e -> FieldAccessDescriptor.withFieldNames(e.getKey()), + e -> new FieldOverride(e.getValue(); + return this; +} + +public Row build() { + Row row = + (Row) + new RowFieldMatcher() + .match( + new CapturingRowCases(getSchema(), this.fieldValues, false), + FieldType.row(getSchema()), + FieldAccessDescriptor.create(), + sourceRow); + return row; +} + } + /** Builder for {@link Row}. */ public static class Builder { +private final Map fieldValues = Maps.newHashMap(); private List values = Lists.newArrayList(); private boolean attached = false; @Nullable private Factory> fieldValueGetterFactory; @Nullable private Object getterTarget; -private Schema schema; +private final Schema schema; Builder(Schema schema) { this.schema = schema; } -public int nextFieldId() { - if (fieldValueGetterFactory != null) { -throw new RuntimeException("Not supported"); - } - return values.size(); -} - +/** Return the schema for the row being built. */ public Schema getSchema() { return schema; } +/** + * Set a field value using the field name. Nested values can be set using the field selection + * syntax. + */ +public Builder withFieldValue(String fieldName, Object value) { Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390215) Time Spent: 2.5h (was: 2h 20m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task >
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=390214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390214 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 20/Feb/20 20:06 Start Date: 20/Feb/20 20:06 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r382229582 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -539,37 +568,118 @@ public String toString() { } /** - * Creates a record builder with specified {@link #getSchema()}. {@link Builder#build()} will - * throw an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not - * match the number of fields specified. + * Creates a row builder with specified {@link #getSchema()}. {@link Builder#build()} will throw + * an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not match + * the number of fields specified. If any of the arguments don't match the expected types for the + * schema fields, {@link Builder#build()} will throw a {@link ClassCastException}. */ public static Builder withSchema(Schema schema) { return new Builder(schema); } + /** + * Creates a row builder based on the specified row. Field values in the new row can be explicitly + * set using {@link ModifyingBuilder#withFieldValue}. Any values not so overridden will be the + * same as the values in the original row. + */ + public static ModifyingBuilder fromRow(Row row) { +return new ModifyingBuilder(row); + } + + /** Builder for {@link Row} that bases a row on another row. */ + public static class ModifyingBuilder { Review comment: Yes, good suggestion. This should simplify the code This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390214) Time Spent: 2h 20m (was: 2h 10m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=390212&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390212 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 20/Feb/20 20:05 Start Date: 20/Feb/20 20:05 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r382229033 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -599,272 +709,555 @@ public Builder addArray(Object... values) { return this; } -// Values are attached. No verification is done, and no conversions are done. LogicalType -// values must be specified as the base type. +// Values are attached. No verification is done, and no conversions are done. LogicalType values +// must be specified +// as the base type. This method should be used with great care, as no validation is done. If +// incorrect values are +// passed in, it could result in strange errors later in the pipeline. This method is largely +// used internal +// to Beam. +@Internal public Builder attachValues(List values) { Review comment: I like making attachValues return a Row. I thin the same for withFieldValueBuilders. this simplifies the builder code This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390212) Time Spent: 2h 10m (was: 2h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=390211&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390211 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 20/Feb/20 20:04 Start Date: 20/Feb/20 20:04 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r382228952 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: still struggling to figure out a good answer to the API. getValues is supposed to return the base value so that transforms that don't know about the specific logical value can process it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390211) Time Spent: 2h (was: 1h 50m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=390206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390206 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 20/Feb/20 20:01 Start Date: 20/Feb/20 20:01 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r382227294 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: added test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 390206) Time Spent: 1h 50m (was: 1h 40m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=389364&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389364 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 19/Feb/20 09:13 Start Date: 19/Feb/20 09:13 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r381160859 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: What get's people confused on the current API is that you can't loop over the values of a Row with `getValue()` and can't supply them to the Builder as the `addValue()` expects the input type, while the getValue returns a base type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389364) Time Spent: 1h 40m (was: 1.5h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=389360&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389360 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 19/Feb/20 09:08 Start Date: 19/Feb/20 09:08 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r381158593 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -55,8 +59,31 @@ * {@link Row} is an immutable tuple-like schema to represent one element in a {@link PCollection}. * The fields are described with a {@link Schema}. * - * {@link Schema} contains the names for each field and the coder for the whole record, - * {see @link Schema#getRowCoder()}. + * {@link Schema} contains the names and types for each field. + * + * There are several ways to build a new Row object. To build a row from scratch using a schema + * object, {@link Row#withSchema} can be used. Schema fields can be specified by name, and nested + * fields can be specified using the field selection syntax. For example: + * + * {@code + * Row row = Row.withSchema(schema) + * .withFieldValue("userId", "user1) + * .withFieldValue("location.city", "seattle") + * .withFieldValue("location.state", "wa") + * .build(); + * } + * + * The {@link Row#fromRow} builder can be used to base a row off of another row. The builder can + * be used to specify values for specific fields, and all the remaining values will be taken from + * the original row. For example, the following produces a row identical to the above row except for + * the location.city field. + * + * {@code + * Row modifiedRow = + * Row.fromRow(row) + *.withFieldValue("location.city", "tacoma") Review comment: Nope, I like the proposed `withFieldValue`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389360) Time Spent: 1.5h (was: 1h 20m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=389356&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389356 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 19/Feb/20 09:07 Start Date: 19/Feb/20 09:07 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r381147867 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -599,272 +709,555 @@ public Builder addArray(Object... values) { return this; } -// Values are attached. No verification is done, and no conversions are done. LogicalType -// values must be specified as the base type. +// Values are attached. No verification is done, and no conversions are done. LogicalType values +// must be specified +// as the base type. This method should be used with great care, as no validation is done. If +// incorrect values are +// passed in, it could result in strange errors later in the pipeline. This method is largely +// used internal +// to Beam. +@Internal public Builder attachValues(List values) { Review comment: Maybe this is an opportunity to change the attachValues. No values should be set before or after the attach. I see 2 options to improve this: - In the attach first see if values are already set. Let the attachValues return the new Row directly. This is maybe a bit strange as it violates a builder pattern. - Have 4 build in builders. The starting one (that includes an` attachValues`, `add` and `withFieldValue`), all of them return a specific builder: the new `ModifyingBuilder` and a new `AddValuesBuilder` that only has the add methods and an `AttachBuilder` that only has build. This also eliminates some elaborate if/then/else's in the builder(). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389356) Time Spent: 1h (was: 50m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=389358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389358 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 19/Feb/20 09:07 Start Date: 19/Feb/20 09:07 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r381157953 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -539,37 +568,118 @@ public String toString() { } /** - * Creates a record builder with specified {@link #getSchema()}. {@link Builder#build()} will - * throw an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not - * match the number of fields specified. + * Creates a row builder with specified {@link #getSchema()}. {@link Builder#build()} will throw + * an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not match + * the number of fields specified. If any of the arguments don't match the expected types for the + * schema fields, {@link Builder#build()} will throw a {@link ClassCastException}. */ public static Builder withSchema(Schema schema) { return new Builder(schema); } + /** + * Creates a row builder based on the specified row. Field values in the new row can be explicitly + * set using {@link ModifyingBuilder#withFieldValue}. Any values not so overridden will be the + * same as the values in the original row. + */ + public static ModifyingBuilder fromRow(Row row) { +return new ModifyingBuilder(row); + } + + /** Builder for {@link Row} that bases a row on another row. */ + public static class ModifyingBuilder { +private final Row sourceRow; +private final Map fieldValues = Maps.newHashMap(); + +private ModifyingBuilder(Row sourceRow) { + this.sourceRow = sourceRow; +} + +public Schema getSchema() { + return sourceRow.getSchema(); +} + +public ModifyingBuilder withFieldValue(String fieldName, Object value) { Review comment: Wouldn't it be useful to have a `withFieldValue` with an index? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389358) Time Spent: 1h 10m (was: 1h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=389357&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389357 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 19/Feb/20 09:07 Start Date: 19/Feb/20 09:07 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r381152948 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -539,37 +568,118 @@ public String toString() { } /** - * Creates a record builder with specified {@link #getSchema()}. {@link Builder#build()} will - * throw an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not - * match the number of fields specified. + * Creates a row builder with specified {@link #getSchema()}. {@link Builder#build()} will throw + * an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not match + * the number of fields specified. If any of the arguments don't match the expected types for the + * schema fields, {@link Builder#build()} will throw a {@link ClassCastException}. */ public static Builder withSchema(Schema schema) { return new Builder(schema); } + /** + * Creates a row builder based on the specified row. Field values in the new row can be explicitly + * set using {@link ModifyingBuilder#withFieldValue}. Any values not so overridden will be the + * same as the values in the original row. + */ + public static ModifyingBuilder fromRow(Row row) { +return new ModifyingBuilder(row); + } + + /** Builder for {@link Row} that bases a row on another row. */ + public static class ModifyingBuilder { +private final Row sourceRow; +private final Map fieldValues = Maps.newHashMap(); + +private ModifyingBuilder(Row sourceRow) { + this.sourceRow = sourceRow; +} + +public Schema getSchema() { + return sourceRow.getSchema(); +} + +public ModifyingBuilder withFieldValue(String fieldName, Object value) { + FieldAccessDescriptor fieldAccessDescriptor = + FieldAccessDescriptor.withFieldNames(fieldName).resolve(getSchema()); + checkArgument(fieldAccessDescriptor.referencesSingleField(), ""); + fieldValues.put(fieldAccessDescriptor, new FieldOverride(value)); + return this; +} + +public ModifyingBuilder withFieldValues(Map values) { + fieldValues.putAll( + values.entrySet().stream() + .collect( + Collectors.toMap( + e -> FieldAccessDescriptor.withFieldNames(e.getKey()), + e -> new FieldOverride(e.getValue(); + return this; +} + +public Row build() { + Row row = + (Row) + new RowFieldMatcher() + .match( + new CapturingRowCases(getSchema(), this.fieldValues, false), + FieldType.row(getSchema()), + FieldAccessDescriptor.create(), + sourceRow); + return row; +} + } + /** Builder for {@link Row}. */ public static class Builder { +private final Map fieldValues = Maps.newHashMap(); private List values = Lists.newArrayList(); private boolean attached = false; @Nullable private Factory> fieldValueGetterFactory; @Nullable private Object getterTarget; -private Schema schema; +private final Schema schema; Builder(Schema schema) { this.schema = schema; } -public int nextFieldId() { - if (fieldValueGetterFactory != null) { -throw new RuntimeException("Not supported"); - } - return values.size(); -} - +/** Return the schema for the row being built. */ public Schema getSchema() { return schema; } +/** + * Set a field value using the field name. Nested values can be set using the field selection + * syntax. + */ +public Builder withFieldValue(String fieldName, Object value) { Review comment: Maybe this one can return the ModifyingBuilder so that no other methods can used (no attachValues, no add's). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389357) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/B
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=389359&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-389359 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 19/Feb/20 09:07 Start Date: 19/Feb/20 09:07 Worklog Time Spent: 10m Work Description: alexvanboxel commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r381151951 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -539,37 +568,118 @@ public String toString() { } /** - * Creates a record builder with specified {@link #getSchema()}. {@link Builder#build()} will - * throw an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not - * match the number of fields specified. + * Creates a row builder with specified {@link #getSchema()}. {@link Builder#build()} will throw + * an {@link IllegalArgumentException} if number of fields in {@link #getSchema()} does not match + * the number of fields specified. If any of the arguments don't match the expected types for the + * schema fields, {@link Builder#build()} will throw a {@link ClassCastException}. */ public static Builder withSchema(Schema schema) { return new Builder(schema); } + /** + * Creates a row builder based on the specified row. Field values in the new row can be explicitly + * set using {@link ModifyingBuilder#withFieldValue}. Any values not so overridden will be the + * same as the values in the original row. + */ + public static ModifyingBuilder fromRow(Row row) { +return new ModifyingBuilder(row); + } + + /** Builder for {@link Row} that bases a row on another row. */ + public static class ModifyingBuilder { Review comment: Can't this be tweaked a bit, that this is the builder specifically for use with `withFieldValue`. Meaning that if when nit doesn't have a source row it just assumes null values for the fields not set. See remark on `withFieldValue` on initial builder as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 389359) Time Spent: 1h 20m (was: 1h 10m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=388706&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-388706 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 18/Feb/20 08:32 Start Date: 18/Feb/20 08:32 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r380521052 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -55,8 +59,31 @@ * {@link Row} is an immutable tuple-like schema to represent one element in a {@link PCollection}. * The fields are described with a {@link Schema}. * - * {@link Schema} contains the names for each field and the coder for the whole record, - * {see @link Schema#getRowCoder()}. + * {@link Schema} contains the names and types for each field. + * + * There are several ways to build a new Row object. To build a row from scratch using a schema + * object, {@link Row#withSchema} can be used. Schema fields can be specified by name, and nested + * fields can be specified using the field selection syntax. For example: + * + * {@code + * Row row = Row.withSchema(schema) + * .withFieldValue("userId", "user1) + * .withFieldValue("location.city", "seattle") + * .withFieldValue("location.state", "wa") + * .build(); + * } + * + * The {@link Row#fromRow} builder can be used to base a row off of another row. The builder can + * be used to specify values for specific fields, and all the remaining values will be taken from + * the original row. For example, the following produces a row identical to the above row except for + * the location.city field. + * + * {@code + * Row modifiedRow = + * Row.fromRow(row) + *.withFieldValue("location.city", "tacoma") Review comment: I don't think so. We're not really adding anything, we're setting something. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 388706) Time Spent: 40m (was: 0.5h) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=388707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-388707 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 18/Feb/20 08:32 Start Date: 18/Feb/20 08:32 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r380521146 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: yes will add one This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 388707) Time Spent: 50m (was: 40m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=388634&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-388634 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 18/Feb/20 02:11 Start Date: 18/Feb/20 02:11 Worklog Time Spent: 10m Work Description: rezarokni commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r380424138 ## File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java ## @@ -477,6 +477,172 @@ public void testCreateMapWithRowValue() { assertEquals(data, row.getMap("map")); } + @Test Review comment: Is it worth having a logical type to test as well? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 388634) Time Spent: 0.5h (was: 20m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=388627&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-388627 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 18/Feb/20 02:01 Start Date: 18/Feb/20 02:01 Worklog Time Spent: 10m Work Description: rezarokni commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883#discussion_r380422459 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java ## @@ -55,8 +59,31 @@ * {@link Row} is an immutable tuple-like schema to represent one element in a {@link PCollection}. * The fields are described with a {@link Schema}. * - * {@link Schema} contains the names for each field and the coder for the whole record, - * {see @link Schema#getRowCoder()}. + * {@link Schema} contains the names and types for each field. + * + * There are several ways to build a new Row object. To build a row from scratch using a schema + * object, {@link Row#withSchema} can be used. Schema fields can be specified by name, and nested + * fields can be specified using the field selection syntax. For example: + * + * {@code + * Row row = Row.withSchema(schema) + * .withFieldValue("userId", "user1) + * .withFieldValue("location.city", "seattle") + * .withFieldValue("location.state", "wa") + * .build(); + * } + * + * The {@link Row#fromRow} builder can be used to base a row off of another row. The builder can + * be used to specify values for specific fields, and all the remaining values will be taken from + * the original row. For example, the following produces a row identical to the above row except for + * the location.city field. + * + * {@code + * Row modifiedRow = + * Row.fromRow(row) + *.withFieldValue("location.city", "tacoma") Review comment: Would this be more readable if it was withAddedFieldValue? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 388627) Time Spent: 20m (was: 10m) > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-9331) The Row object needs better builders
[ https://issues.apache.org/jira/browse/BEAM-9331?focusedWorklogId=388562&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-388562 ] ASF GitHub Bot logged work on BEAM-9331: Author: ASF GitHub Bot Created on: 17/Feb/20 18:35 Start Date: 17/Feb/20 18:35 Worklog Time Spent: 10m Work Description: reuvenlax commented on pull request #10883: [BEAM-9331] Add better Row builders URL: https://github.com/apache/beam/pull/10883 This PR adds two builders to the Row object. The first allows building a Row by specifying fields by name: Row row = Row.withSchema(schema) .withFieldValue("userId", "user1) .withFieldValue("location.city", "seattle") .withFieldValue("location.state", "wa") .build(); The second allows building. a Row based on a previous row by specifying only the fields to change: Row modifiedRow = Row.fromRow(row) .withFieldValue("location.city", "tacoma") .build(); R: @rezarokni This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 388562) Remaining Estimate: 0h Time Spent: 10m > The Row object needs better builders > > > Key: BEAM-9331 > URL: https://issues.apache.org/jira/browse/BEAM-9331 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-core >Reporter: Reuven Lax >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Users should be able to build a Row object by specifying field names. Desired > syntax: > > Row.withSchema(schema) > .withFieldName("field1", "value) > .withFieldName("field2.field3", value) > .build() > > Users should also have a builder that allows taking an existing row and > changing specific fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)