[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=116023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116023 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 26/Jun/18 16:41
Start Date: 26/Jun/18 16:41
Worklog Time Spent: 10m
Work Description: kennknowles closed pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java
index 8e32a6aa2de..015e8711753 100644
--- a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java
+++ b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java
@@ -29,6 +29,7 @@
 import javax.annotation.Nullable;
 import org.apache.beam.runners.direct.DirectOptions;
 import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.Pipeline.PipelineVisitor;
 import org.apache.beam.sdk.PipelineResult;
 import org.apache.beam.sdk.PipelineResult.State;
 import org.apache.beam.sdk.coders.VarIntCoder;
@@ -40,12 +41,16 @@
 import org.apache.beam.sdk.options.ApplicationNameOptions;
 import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.runners.TransformHierarchy.Node;
 import org.apache.beam.sdk.state.StateSpec;
 import org.apache.beam.sdk.state.StateSpecs;
 import org.apache.beam.sdk.state.ValueState;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.ParDo;
 import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollection.IsBounded;
+import org.apache.beam.sdk.values.PValue;
 import org.apache.beam.sdk.values.Row;
 import org.apache.calcite.adapter.enumerable.EnumerableRel;
 import org.apache.calcite.adapter.enumerable.EnumerableRelImplementor;
@@ -152,15 +157,6 @@ public boolean isReached() {
     }
   }

-  private static PipelineResult run(
-      PipelineOptions options, BeamRelNode node, DoFn doFn) {
-    Pipeline pipeline = Pipeline.create(options);
-    BeamSqlRelUtils.toPCollection(pipeline, node).apply(ParDo.of(doFn));
-    PipelineResult result = pipeline.run();
-    result.waitUntilFinish();
-    return result;
-  }
-
   private static PipelineResult limitRun(
       PipelineOptions options,
       BeamRelNode node,
@@ -209,7 +205,12 @@ private static PipelineResult limitRun(
         "SELECT without INSERT is only supported in DirectRunner in SQL Shell.");

     Collector.globalValues.put(id, values);
-    run(options, node, new Collector());
+
+    Pipeline pipeline = Pipeline.create(options);
+    BeamSqlRelUtils.toPCollection(pipeline, node).apply(ParDo.of(new Collector()));
+    PipelineResult result = pipeline.run();
+    result.waitUntilFinish();
+
     Collector.globalValues.remove(id);

     return Linq4j.asEnumerable(values);
@@ -324,15 +325,22 @@ public void processElement(ProcessContext context) {
   }

   private static Enumerable count(PipelineOptions options, BeamRelNode node) {
-    PipelineResult result = run(options, node, new RowCounter());
-    MetricQueryResults metrics =
-        result
-            .metrics()
-            .queryMetrics(
-                MetricsFilter.builder()
-                    .addNameFilter(MetricNameFilter.named(BeamEnumerableConverter.class, "rows"))
-                    .build());
-    long count = metrics.getCounters().iterator().next().getAttempted();
+    Pipeline pipeline = Pipeline.create(options);
+    BeamSqlRelUtils.toPCollection(pipeline, node).apply(ParDo.of(new RowCounter()));
+    PipelineResult result = pipeline.run();
+
+    long count = 0;
+    if (!containsUnboundedPCollection(pipeline)) {
+      result.waitUntilFinish();
+      MetricQueryResults metrics =
+          result
+              .metrics()
+              .queryMetrics(
+                  MetricsFilter.builder()
+                      .addNameFilter(MetricNameFilter.named(BeamEnumerableConverter.class, "rows"))
+                      .build());
+      count = metrics.getCounters().iterator().next().getAttempted();
+    }
     return Linq4j.singletonEnumerable(count);
   }

@@ -360,4 +368,21 @@ private static int getLimitCount(BeamRelNode node) {
     throw new RuntimeException(
         "Cannot get limit count
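The archived diff is cut off before the body of `containsUnboundedPCollection`, so the following is only a rough, self-contained sketch of the idea the patch relies on: walk every value in the pipeline graph with a visitor, and skip the blocking wait if any value is unbounded. It deliberately does not use the Beam API. `SketchCollection`, `SketchPipeline`, `BoundednessSketch` and the `attemptedRows` parameter are hypothetical stand-ins invented for this illustration, and the real helper may well differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a pipeline value; this is NOT Beam's PCollection. */
final class SketchCollection {
  final String name;
  final boolean unbounded;

  SketchCollection(String name, boolean unbounded) {
    this.name = name;
    this.unbounded = unbounded;
  }
}

/** Hypothetical stand-in for a pipeline whose values can be visited in order. */
final class SketchPipeline {
  /** Callback invoked once per value, loosely modelled on a graph visitor. */
  interface Visitor {
    void visitValue(SketchCollection value);
  }

  private final List<SketchCollection> values = new ArrayList<>();

  SketchPipeline add(String name, boolean unbounded) {
    values.add(new SketchCollection(name, unbounded));
    return this;
  }

  void traverse(Visitor visitor) {
    for (SketchCollection value : values) {
      visitor.visitValue(value);
    }
  }
}

final class BoundednessSketch {
  /** Returns true if any visited value is unbounded. */
  static boolean containsUnbounded(SketchPipeline pipeline) {
    boolean[] found = {false};
    pipeline.traverse(value -> found[0] |= value.unbounded);
    return found[0];
  }

  /**
   * Mirrors the shape of the patched count(): only block and read the metric when
   * every value is bounded; otherwise report 0 without waiting.
   */
  static long count(SketchPipeline pipeline, long attemptedRows) {
    long count = 0;
    if (!containsUnbounded(pipeline)) {
      // The patched code calls result.waitUntilFinish() here and queries the "rows" counter.
      count = attemptedRows;
    }
    return count;
  }

  public static void main(String[] args) {
    SketchPipeline bounded = new SketchPipeline().add("scan", false).add("project", false);
    SketchPipeline streaming = new SketchPipeline().add("source", true).add("project", true);
    System.out.println(count(bounded, 42L));
    System.out.println(count(streaming, 42L));
  }
}
```

Running `main` prints 42 for the bounded pipeline and 0 for the streaming one. Returning 0 instead of blocking matches what the diff does in `count`; whether 0 is the right thing to show a user for an unbounded query is a separate question that the review thread also raises.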
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=116014&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-116014 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 26/Jun/18 16:11
Start Date: 26/Jun/18 16:11
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#issuecomment-400370366

This is green.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 116014)
Time Spent: 5h 50m (was: 5h 40m)

> Beam SQL should cleanly transform graph from Calcite
>
> Key: BEAM-4575
> URL: https://issues.apache.org/jira/browse/BEAM-4575
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Andrew Pilloud
> Assignee: Andrew Pilloud
> Priority: Major
> Time Spent: 5h 50m
> Remaining Estimate: 0h
>
> It would be nice if the Beam graph matched the Calcite graph in structure, with each node generating a PTransform that is applied onto the PCollection of its parent. We should also ensure that each Calcite node only appears in the Beam graph one time.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=114611&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-114611 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 22/Jun/18 04:18
Start Date: 22/Jun/18 04:18
Worklog Time Spent: 10m
Work Description: kennknowles commented on issue #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#issuecomment-399316073

LGTM and I think the quota issues are resolved. I'm taking advantage of "allow edits by maintainers" to fix this and that. Your actual code has been g2g the whole time.

Issue Time Tracking
---
Worklog Id: (was: 114611)
Time Spent: 5h 40m (was: 5.5h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113918&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113918 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 20/Jun/18 20:17
Start Date: 20/Jun/18 20:17
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#issuecomment-398882502

Flap in `JmsIOTest`

Issue Time Tracking
---
Worklog Id: (was: 113918)
Time Spent: 5h 20m (was: 5h 10m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113919&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113919 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 20/Jun/18 20:17
Start Date: 20/Jun/18 20:17
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#issuecomment-398882536

run java precommit

Issue Time Tracking
---
Worklog Id: (was: 113919)
Time Spent: 5.5h (was: 5h 20m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113534&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113534 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 20/Jun/18 03:19
Start Date: 20/Jun/18 03:19
Worklog Time Spent: 10m
Work Description: kennknowles commented on issue #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#issuecomment-398611732

Self conflict :-p

Issue Time Tracking
---
Worklog Id: (was: 113534)
Time Spent: 5h 10m (was: 5h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113533&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113533 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 20/Jun/18 03:18
Start Date: 20/Jun/18 03:18
Worklog Time Spent: 10m
Work Description: kennknowles closed pull request #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlCli.java b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlCli.java
index bb06d72a33c..eb32d2a9a36 100644
--- a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlCli.java
+++ b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlCli.java
@@ -22,9 +22,9 @@
 import org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv;
 import org.apache.beam.sdk.extensions.sql.impl.ParseException;
 import org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils;
 import org.apache.beam.sdk.extensions.sql.meta.store.MetaStore;
 import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.values.PCollectionTuple;

 /** {@link BeamSqlCli} provides methods to execute Beam SQL with an interactive client. */
 @Experimental
@@ -59,7 +59,7 @@ public void execute(String sqlString) throws ParseException {
         BeamEnumerableConverter.createPipelineOptions(env.getPipelineOptions());
     options.setJobName("BeamPlanCreator");
     Pipeline pipeline = Pipeline.create(options);
-    PCollectionTuple.empty(pipeline).apply(env.parseQuery(sqlString));
+    BeamSqlRelUtils.toPCollection(pipeline, env.parseQuery(sqlString));
     pipeline.run();
   }
 }
diff --git a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlTable.java b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlTable.java
index 8bfecd5381e..7f849d4e70f 100644
--- a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlTable.java
+++ b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlTable.java
@@ -18,20 +18,19 @@

 package org.apache.beam.sdk.extensions.sql;

-import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.schemas.Schema;
-import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PBegin;
 import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.sdk.values.POutput;
 import org.apache.beam.sdk.values.Row;

 /** This interface defines a Beam Sql Table. */
 public interface BeamSqlTable {
-  /** create a {@code PCollection} from source. */
-  PCollection buildIOReader(Pipeline pipeline);
+  /** create a {@code PCollection} from source. */
+  PCollection buildIOReader(PBegin begin);

   /** create a {@code IO.write()} instance to write to target. */
-  PTransform, POutput> buildIOWriter();
+  POutput buildIOWriter(PCollection input);

   /** Get the schema info of the table. */
   Schema getSchema();
diff --git a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/SqlTransform.java b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/SqlTransform.java
index 20d152b67d2..2bee537f90d 100644
--- a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/SqlTransform.java
+++ b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/SqlTransform.java
@@ -28,6 +28,7 @@
 import java.util.Map;
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamSqlRelUtils;
 import org.apache.beam.sdk.extensions.sql.impl.schema.BeamPCollectionTable;
 import org.apache.beam.sdk.transforms.Combine;
 import org.apache.beam.sdk.transforms.PTransform;
@@ -92,7 +93,7 @@

     registerFunctions(sqlEnv);

-    return PCollectionTuple.empty(input.getPipeline()).apply(sqlEnv.parseQuery(queryString()));
+    return BeamSqlRelUtils.toPCollection(input.getPipeline(), sqlEnv.parseQuery(queryString()));
   }

   private Map toTableMap(PInput inputs) {
diff --git a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlEnv.java b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlEnv.java
index 940f0483464..1aca83bac0b 100644
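The issue asks that each plan node produce one transform applied to the output of its inputs, and that no node appear in the Beam graph twice. The body of `BeamSqlRelUtils.toPCollection` is not part of the archived diff, so the following is only a guess at the general technique, written without any Beam or Calcite types: a recursive conversion that memoizes results by node identity. `PlanNodeSketch`, `GraphConversionSketch`, `toCollection` and the string "collections" are all hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a relational plan node; this is NOT Calcite's RelNode. */
final class PlanNodeSketch {
  final String name;
  final List<PlanNodeSketch> inputs;

  PlanNodeSketch(String name, PlanNodeSketch... inputs) {
    this.name = name;
    this.inputs = Arrays.asList(inputs);
  }
}

final class GraphConversionSketch {
  // Keyed by node identity so a node shared by two parents is converted only once.
  private final Map<PlanNodeSketch, String> converted = new IdentityHashMap<>();
  // Records each simulated "transform application" in the order it happened.
  final List<String> applied = new ArrayList<>();

  /** Converts the inputs first, then applies this node's transform to their outputs. */
  String toCollection(PlanNodeSketch node) {
    String cached = converted.get(node);
    if (cached != null) {
      return cached;
    }
    List<String> inputCollections = new ArrayList<>();
    for (PlanNodeSketch input : node.inputs) {
      inputCollections.add(toCollection(input));
    }
    // A string stands in for the output collection of this node.
    String output = node.name + inputCollections;
    applied.add(node.name);
    converted.put(node, output);
    return output;
  }

  public static void main(String[] args) {
    PlanNodeSketch scan = new PlanNodeSketch("scan");
    PlanNodeSketch join = new PlanNodeSketch("join", scan, scan);
    GraphConversionSketch sketch = new GraphConversionSketch();
    System.out.println(sketch.toCollection(join));
    System.out.println(sketch.applied);
  }
}
```

In this toy run the `scan` node feeds both sides of the `join`, yet `applied` records it once, which is the "only appears one time" property from the issue description. An identity-keyed map is used rather than `equals`-based lookup so that two structurally equal but distinct nodes are still converted separately; that choice is an assumption of the sketch, not something the PR states.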
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113477 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 20/Jun/18 00:42
Start Date: 20/Jun/18 00:42
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398588386

ElasticsearchIO failure.

Issue Time Tracking
---
Worklog Id: (was: 113477)
Time Spent: 4h 40m (was: 4.5h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113478&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113478 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 20/Jun/18 00:42
Start Date: 20/Jun/18 00:42
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398588403

run java precommit

Issue Time Tracking
---
Worklog Id: (was: 113478)
Time Spent: 4h 50m (was: 4h 40m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113475 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 20/Jun/18 00:41
Start Date: 20/Jun/18 00:41
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#issuecomment-398588343

ElasticsearchIO failure.

Issue Time Tracking
---
Worklog Id: (was: 113475)
Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113476 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 20/Jun/18 00:41
Start Date: 20/Jun/18 00:41
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#issuecomment-398588355

run java precommit

Issue Time Tracking
---
Worklog Id: (was: 113476)
Time Spent: 4.5h (was: 4h 20m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113425 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 23:01
Start Date: 19/Jun/18 23:01
Worklog Time Spent: 10m
Work Description: akedin commented on a change in pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#discussion_r196604956

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -102,15 +107,37 @@ public Result implement(EnumerableRelImplementor implementor, Prefer prefer) {
     }
   }

-  private static PipelineResult run(
+  private static @Nullable PipelineResult run(
       PipelineOptions options, BeamRelNode node, DoFn doFn) {
     Pipeline pipeline = Pipeline.create(options);
     PCollectionTuple.empty(pipeline).apply(node.toPTransform()).apply(ParDo.of(doFn));
     PipelineResult result = pipeline.run();
+
+    if (containsUnboundedPCollection(pipeline)) {
+      return null;

Review comment:
Makes sense. Looks good

Issue Time Tracking
---
Worklog Id: (was: 113425)
Time Spent: 4h 10m (was: 4h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113423&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113423 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 22:53
Start Date: 19/Jun/18 22:53
Worklog Time Spent: 10m
Work Description: akedin commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398571079

lgtm as well

Issue Time Tracking
---
Worklog Id: (was: 113423)
Time Spent: 4h (was: 3h 50m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113406&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113406 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 22:03
Start Date: 19/Jun/18 22:03
Worklog Time Spent: 10m
Work Description: apilloud commented on a change in pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#discussion_r196593309

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -102,15 +107,37 @@ public Result implement(EnumerableRelImplementor implementor, Prefer prefer) {
     }
   }

-  private static PipelineResult run(
+  private static @Nullable PipelineResult run(
       PipelineOptions options, BeamRelNode node, DoFn doFn) {
     Pipeline pipeline = Pipeline.create(options);
     PCollectionTuple.empty(pipeline).apply(node.toPTransform()).apply(ParDo.of(doFn));
     PipelineResult result = pipeline.run();
+
+    if (containsUnboundedPCollection(pipeline)) {
+      return null;

Review comment:
pipelineResult.getState doesn't communicate if a pipeline is expected to terminate. It has the same problems as null. I've eliminated the common run function so I will have access to everything I need to make the right decision.

Issue Time Tracking
---
Worklog Id: (was: 113406)
Time Spent: 3h 50m (was: 3h 40m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113387 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 21:27
Start Date: 19/Jun/18 21:27
Worklog Time Spent: 10m
Work Description: kennknowles commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398551822

Yea, ignore my comments. This is 80% of the way there and useful as-is.

Issue Time Tracking
---
Worklog Id: (was: 113387)
Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113380=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113380 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 21:09
Start Date: 19/Jun/18 21:09
Worklog Time Spent: 10m
Work Description: kennknowles commented on a change in pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#discussion_r196579158

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -102,15 +107,37 @@ public Result implement(EnumerableRelImplementor implementor, Prefer prefer) {
     }
   }
 
-  private static PipelineResult run(
+  private static @Nullable PipelineResult run(
       PipelineOptions options, BeamRelNode node, DoFn doFn) {
     Pipeline pipeline = Pipeline.create(options);
     PCollectionTuple.empty(pipeline).apply(node.toPTransform()).apply(ParDo.of(doFn));
     PipelineResult result = pipeline.run();
+
+    if (containsUnboundedPCollection(pipeline)) {
+      return null;

Review comment:
How about not blocking here and just in `count` doing `if (containsUnboundedPCollections(pipeline)) { return 0; } else { block }`. And you might want to comment next to that with a JIRA about finding a good way to send a warning / message to the user.

Issue Time Tracking
---
Worklog Id: (was: 113380)
Time Spent: 3.5h (was: 3h 20m)
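The suggestion in the review comment above (skip the blocking wait inside `count` when the pipeline is unbounded) could look roughly like the following. This is only a sketch under stated assumptions: `LaunchedPipeline`, `Boundedness`, and `blockAndCount` are hypothetical stand-ins invented for illustration, not Beam classes, and returning `0` simply mirrors the wording of the comment rather than the code that was eventually merged.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

/** Illustrative only: every type here is a hypothetical stand-in, not a Beam class. */
final class CountSketch {
  /** Stand-in for the bounded/unbounded property of a PCollection. */
  enum Boundedness { BOUNDED, UNBOUNDED }

  /** Stand-in for a pipeline that has already been launched. */
  static final class LaunchedPipeline {
    final List<Boundedness> collections;
    final LongSupplier blockAndCount; // blocks until done, then yields the row count

    LaunchedPipeline(List<Boundedness> collections, LongSupplier blockAndCount) {
      this.collections = collections;
      this.blockAndCount = blockAndCount;
    }
  }

  static boolean containsUnboundedPCollection(LaunchedPipeline pipeline) {
    return pipeline.collections.contains(Boundedness.UNBOUNDED);
  }

  /** Returns 0 without blocking for unbounded pipelines; otherwise blocks and counts. */
  static long count(LaunchedPipeline pipeline) {
    if (containsUnboundedPCollection(pipeline)) {
      // Assumption: report 0 rows; a follow-up issue would cover warning the user.
      return 0;
    }
    return pipeline.blockAndCount.getAsLong();
  }

  public static void main(String[] args) {
    LaunchedPipeline bounded =
        new LaunchedPipeline(Arrays.asList(Boundedness.BOUNDED), () -> 3L);
    LaunchedPipeline streaming =
        new LaunchedPipeline(
            Arrays.asList(Boundedness.BOUNDED, Boundedness.UNBOUNDED),
            () -> { throw new IllegalStateException("should never block"); });
    System.out.println(count(bounded));
    System.out.println(count(streaming));
  }
}
```

Keeping the check inside `count` means `run` can always hand back its result, which is the point the other reviewers raise about `null`.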
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113379 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 21:08
Start Date: 19/Jun/18 21:08
Worklog Time Spent: 10m
Work Description: kennknowles commented on a change in pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#discussion_r196579158

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -102,15 +107,37 @@ public Result implement(EnumerableRelImplementor implementor, Prefer prefer) {
     }
   }
 
-  private static PipelineResult run(
+  private static @Nullable PipelineResult run(
       PipelineOptions options, BeamRelNode node, DoFn doFn) {
     Pipeline pipeline = Pipeline.create(options);
     PCollectionTuple.empty(pipeline).apply(node.toPTransform()).apply(ParDo.of(doFn));
     PipelineResult result = pipeline.run();
+
+    if (containsUnboundedPCollection(pipeline)) {
+      return null;

Review comment:
How about just in `count` doing `if (containsUnboundedPCollections(pipeline)) { return 0; }`. And you might want to comment next to that with a JIRA about finding a good way to send a warning / message to the user.

Issue Time Tracking
---
Worklog Id: (was: 113379)
Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113378=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113378 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 21:05
Start Date: 19/Jun/18 21:05
Worklog Time Spent: 10m
Work Description: akedin commented on a change in pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#discussion_r196578271

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -102,15 +107,37 @@ public Result implement(EnumerableRelImplementor implementor, Prefer prefer) {
     }
   }
 
-  private static PipelineResult run(
+  private static @Nullable PipelineResult run(
       PipelineOptions options, BeamRelNode node, DoFn doFn) {
     Pipeline pipeline = Pipeline.create(options);
     PCollectionTuple.empty(pipeline).apply(node.toPTransform()).apply(ParDo.of(doFn));
     PipelineResult result = pipeline.run();
+
+    if (containsUnboundedPCollection(pipeline)) {
+      return null;

Review comment:
nitpicking further, `null` doesn't convey any meaningful information in this context unless you know all the code. Does `null` mean "pipeline is still running but we're not waiting" or "we didn't even run the pipeline because some reasons", or "something is wrong, go crash"?

Issue Time Tracking
---
Worklog Id: (was: 113378)
Time Spent: 3h 10m (was: 3h)
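To make the concern in the review comment above concrete, here is one possible shape for a return value that names its meaning instead of using `null`. It is a sketch only: `Outcome`, `RunOutcomeSketch`, and the `waitUntilFinish` callback parameter are hypothetical names introduced for illustration, and nothing in the thread indicates that the pull request adopted this approach.

```java
/** Illustrative only: hypothetical names, not part of Beam or of the pull request. */
final class RunOutcomeSketch {
  /** Each constant spells out what the caller would otherwise have to infer from null. */
  enum Outcome {
    FINISHED("pipeline ran to completion"),
    RUNNING_NOT_WAITING("pipeline is still running; the caller chose not to wait");

    final String meaning;

    Outcome(String meaning) {
      this.meaning = meaning;
    }
  }

  /** Skips the blocking wait when the pipeline has an unbounded input. */
  static Outcome run(boolean containsUnboundedPCollection, Runnable waitUntilFinish) {
    if (containsUnboundedPCollection) {
      return Outcome.RUNNING_NOT_WAITING;
    }
    waitUntilFinish.run();
    return Outcome.FINISHED;
  }

  public static void main(String[] args) {
    Outcome streaming = run(true, () -> { throw new IllegalStateException("must not wait"); });
    Outcome batch = run(false, () -> { /* stand-in for a blocking wait */ });
    System.out.println(streaming + ": " + streaming.meaning);
    System.out.println(batch + ": " + batch.meaning);
  }
}
```

The cost is the extra wrapper type, which is the trade-off raised elsewhere in the thread as the argument for keeping `null`.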
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113377=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113377 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 20:57
Start Date: 19/Jun/18 20:57
Worklog Time Spent: 10m
Work Description: akedin commented on a change in pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#discussion_r196575927

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -102,15 +107,37 @@ public Result implement(EnumerableRelImplementor implementor, Prefer prefer) {
     }
   }
 
-  private static PipelineResult run(
+  private static @Nullable PipelineResult run(
       PipelineOptions options, BeamRelNode node, DoFn doFn) {
     Pipeline pipeline = Pipeline.create(options);
     PCollectionTuple.empty(pipeline).apply(node.toPTransform()).apply(ParDo.of(doFn));
     PipelineResult result = pipeline.run();
+
+    if (containsUnboundedPCollection(pipeline)) {
+      return null;

Review comment:
But there's `pipelineResult.getState()`, which you can use to check whether the pipeline is done yet. Can't it be used for the same purpose?

Issue Time Tracking
---
Worklog Id: (was: 113377)
Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113366 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 20:45
Start Date: 19/Jun/18 20:45
Worklog Time Spent: 10m
Work Description: kennknowles commented on a change in pull request #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#discussion_r196571844

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamRelNode.java ##
@@ -17,17 +17,34 @@
  */
 package org.apache.beam.sdk.extensions.sql.impl.rel;
 
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.PInput;
 import org.apache.beam.sdk.values.Row;
 import org.apache.calcite.rel.RelNode;
 
 /** A {@link RelNode} that can also give a {@link PTransform} that implements the expression. */
 public interface BeamRelNode extends RelNode {
-  /**
-   * A {@link BeamRelNode} is a recursive structure, the {@code BeamQueryPlanner} visits it with a
-   * DFS(Depth-First-Search) algorithm.
-   */
-  PTransform> toPTransform();
+  /** Transforms the inputs into a PInput. */
+  default PInput buildPInput(Pipeline pipeline, Map> cache) {
+    List inputs = getInputs();
+    if (inputs.size() == 0) {
+      return pipeline.begin();
+    }
+    List> pInputs = new ArrayList(inputs.size());
+    for (RelNode input : inputs) {
+      pInputs.add(BeamSqlRelUtils.toPCollection(pipeline, (BeamRelNode) input, cache));
+    }
+    if (pInputs.size() == 1) {
+      return pInputs.get(0);
+    }
+    return PCollectionList.of(pInputs);
+  }
+
+  PTransform> buildPTransform();

Review comment:
I think you made a very good case that the prior API is not quite right. Something bugs me about this proposed API too. Here's where I'm coming from now:

1. `BeamRelNode` instance corresponds to a `PCollection` instance (committed to particular inputs)
2. `BeamRelNode` type (plus maybe some side conditions) corresponds to a not-yet-applied `PTransform` (not yet committed to particular inputs)

So I was expecting this PR to be a return to 1 with a single method `BeamRelNode.toPCollection`, or some such. The difference in failure modes is:

- `toPTransform`: PTransform that ignores its inputs
- `toPCollection`: no PTransform encapsulating the rel's logic (could also have a PTransform that ignores its inputs)

To ensure the first, you want whoever is responsible for mapping to a `PTransform` to have no access to the `Rel` instance. To ensure the second, you want to make sure the thing building the `PCollection` is obligated to just pass its (recursively converted) inputs to some `PTransform` without any other ad hoc logic.

Here's an idea:

1. `BeamRelNode.toPCollection` that does ad hoc logic on its recursively computed inputs, not worrying about a `PTransform`.
2. `BeamSqlRelUtils` that does the recursive traversal and calls each `toPCollection` from within a `PTransform.expand` that it makes right on the spot.

Issue Time Tracking
---
Worklog Id: (was: 113366)
Time Spent: 2h 50m (was: 2h 40m)
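The recursive traversal discussed in the review comment above, with a cache so that a shared node is converted only once, can be sketched without any Beam dependency. Everything below is an assumption made for illustration: `Node` stands in for a `BeamRelNode`, a `String` stands in for the resulting `PCollection`, and `Converter` is a hypothetical helper rather than the real `BeamSqlRelUtils`. The sketch also leaves out the `PTransform.expand` wrapping that the comment proposes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical stand-ins, not the Beam SQL classes. */
final class RelTraversalSketch {
  /** Stand-in for a relational node: a name plus the nodes it reads from. */
  static final class Node {
    final String name;
    final List<Node> inputs;

    Node(String name, Node... inputs) {
      this.name = name;
      this.inputs = Arrays.asList(inputs);
    }
  }

  /** Converts nodes depth-first, caching by node identity so each node is applied once. */
  static final class Converter {
    private final Map<Node, String> cache = new IdentityHashMap<>();
    final List<String> applied = new ArrayList<>(); // order in which nodes were converted

    String toPCollection(Node node) {
      String cached = cache.get(node);
      if (cached != null) {
        return cached; // already in the graph: reuse instead of converting again
      }
      List<String> convertedInputs = new ArrayList<>();
      for (Node input : node.inputs) {
        convertedInputs.add(toPCollection(input));
      }
      // The string records which inputs this node was applied to.
      String output = node.name + convertedInputs;
      applied.add(node.name);
      cache.put(node, output);
      return output;
    }
  }

  public static void main(String[] args) {
    Node scan = new Node("scan");
    Node join = new Node("join", new Node("filter", scan), new Node("project", scan));
    Converter converter = new Converter();
    System.out.println(converter.toPCollection(join));
    System.out.println(converter.applied);
  }
}
```

The cache is keyed by identity rather than equality so that two distinct but structurally equal nodes would still be converted separately; whether that matches the intended semantics for Calcite nodes is an open design choice, not something the thread settles.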
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113367 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 20:45
Start Date: 19/Jun/18 20:45
Worklog Time Spent: 10m
Work Description: kennknowles commented on a change in pull request #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#discussion_r196558647

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamRelNode.java ##
@@ -17,18 +17,35 @@
  */
 package org.apache.beam.sdk.extensions.sql.impl.rel;
 
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.PCollectionList;
+import org.apache.beam.sdk.values.PInput;
 import org.apache.beam.sdk.values.Row;
 import org.apache.calcite.rel.RelNode;
 
 /** A {@link RelNode} that can also give a {@link PTransform} that implements the expression. */
 public interface BeamRelNode extends RelNode {
-  /**
-   * A {@link BeamRelNode} is a recursive structure, the {@code BeamQueryPlanner} visits it with a
-   * DFS(Depth-First-Search) algorithm.
-   */
-  PTransform> toPTransform();
+  /** Transforms the inputs into a PInput. */
+  default PInput buildPInput(Pipeline pipeline, Map> cache) {
+    List inputs = getInputs();
+    if (inputs.size() == 0) {

Review comment:
I want to highlight that at the core model level, everything is a `PCollectionTuple`, more or less. Every `PInput.expand()` method converts to such a structure. So this is just an empty tuple. I think it is safe to keep the input type `PCollectionTuple`. Seems like it ought to simplify things?

Issue Time Tracking
---
Worklog Id: (was: 113367)
Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113353=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113353 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 20:21
Start Date: 19/Jun/18 20:21
Worklog Time Spent: 10m
Work Description: apilloud commented on a change in pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#discussion_r196564770

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -102,15 +107,37 @@ public Result implement(EnumerableRelImplementor implementor, Prefer prefer) {
     }
   }
 
-  private static PipelineResult run(
+  private static @Nullable PipelineResult run(
       PipelineOptions options, BeamRelNode node, DoFn doFn) {
     Pipeline pipeline = Pipeline.create(options);
     PCollectionTuple.empty(pipeline).apply(node.toPTransform()).apply(ParDo.of(doFn));
     PipelineResult result = pipeline.run();
+
+    if (containsUnboundedPCollection(pipeline)) {
+      return null;

Review comment:
https://github.com/apache/beam/pull/5687/files#diff-4cc8078360c76fc39f49629dd6a630c2R184

Because if I did that I couldn't use null to convey a bit of information and I'd have to wrap this in a new class.

Issue Time Tracking
---
Worklog Id: (was: 113353)
Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113350=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113350 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 20:11
Start Date: 19/Jun/18 20:11
Worklog Time Spent: 10m
Work Description: akedin commented on a change in pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#discussion_r196558858

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java ##
@@ -102,15 +107,37 @@ public Result implement(EnumerableRelImplementor implementor, Prefer prefer) {
     }
   }
 
-  private static PipelineResult run(
+  private static @Nullable PipelineResult run(
       PipelineOptions options, BeamRelNode node, DoFn doFn) {
     Pipeline pipeline = Pipeline.create(options);
     PCollectionTuple.empty(pipeline).apply(node.toPTransform()).apply(ParDo.of(doFn));
     PipelineResult result = pipeline.run();
+
+    if (containsUnboundedPCollection(pipeline)) {
+      return null;

Review comment:
Nit: Why not return the result anyway?

Issue Time Tracking
---
Worklog Id: (was: 113350)
Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113349=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113349 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 20:10
Start Date: 19/Jun/18 20:10
Worklog Time Spent: 10m
Work Description: apilloud commented on a change in pull request #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#discussion_r196561519

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlEnv.java ##
@@ -102,10 +102,9 @@ public void registerUdaf(String functionName, Combine.CombineFn combineFn) {
     defaultSchema.add(functionName, new UdafImpl(combineFn));
   }
 
-  public PTransform> parseQuery(String query)
-      throws ParseException {
+  public PCollection parseQuery(Pipeline pipeline, String query) throws ParseException {
     try {
-      return planner.convertToBeamRel(query).toPTransform();
+      return BeamSqlRelUtils.toPCollection(pipeline, planner.convertToBeamRel(query));

Review comment:
It makes the API a bit clunkier (all uses call both functions back to back) but I refactored to split the two operations.

Issue Time Tracking
---
Worklog Id: (was: 113349)
Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113345=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113345 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 19:57
Start Date: 19/Jun/18 19:57
Worklog Time Spent: 10m
Work Description: akedin commented on a change in pull request #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#discussion_r196557703

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlEnv.java ##
@@ -102,10 +102,9 @@ public void registerUdaf(String functionName, Combine.CombineFn combineFn) {
     defaultSchema.add(functionName, new UdafImpl(combineFn));
   }
 
-  public PTransform> parseQuery(String query)
-      throws ParseException {
+  public PCollection parseQuery(Pipeline pipeline, String query) throws ParseException {
     try {
-      return planner.convertToBeamRel(query).toPTransform();
+      return BeamSqlRelUtils.toPCollection(pipeline, planner.convertToBeamRel(query));

Review comment:
Yes, I meant `BeamSqlEnv.parseQuery()`. My GitHub highlighting skills are bad.

Issue Time Tracking
---
Worklog Id: (was: 113345)
Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113344=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113344 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 19:55
Start Date: 19/Jun/18 19:55
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687#issuecomment-398524785

R: @kennknowles
CC: @akedin @amaliujia

Issue Time Tracking
---
Worklog Id: (was: 113344)
Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113341=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113341 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 19:53
Start Date: 19/Jun/18 19:53
Worklog Time Spent: 10m
Work Description: apilloud opened a new pull request #5687: [BEAM-4575][SQL] Don't wait on Unbounded PCollections
URL: https://github.com/apache/beam/pull/5687

Output looks like this:
```
0: jdbc:beam:> insert into test select max(test2.payload.id) from test2 GROUP BY HOP(event_timestamp, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE);
INFO: To access the Dataflow monitoring console, please navigate to https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-06-19_12_45_14-1583809748020308674?project=google.com%3Adeft-testing-integration
Submitted job: 2018-06-19_12_45_14-1583809748020308674
Jun 19, 2018 12:45:14 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=google.com:deft-testing-integration cancel --region=us-central1 2018-06-19_12_45_14-1583809748020308674
No rows affected (18.705 seconds)
0: jdbc:beam:>
```

Follow this checklist to help us incorporate your contribution quickly and easily:

- [X] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

It will help us expedite review of your Pull Request if you tag someone (e.g. `@username`) to look at it.

Issue Time Tracking
---
Worklog Id: (was: 113341)
Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113339=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113339 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 19:52
Start Date: 19/Jun/18 19:52
Worklog Time Spent: 10m
Work Description: apilloud commented on a change in pull request #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#discussion_r196556095

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlEnv.java ##
@@ -102,10 +102,9 @@ public void registerUdaf(String functionName, Combine.CombineFn combineFn) {
     defaultSchema.add(functionName, new UdafImpl(combineFn));
   }
 
-  public PTransform> parseQuery(String query)
-      throws ParseException {
+  public PCollection parseQuery(Pipeline pipeline, String query) throws ParseException {
     try {
-      return planner.convertToBeamRel(query).toPTransform();
+      return BeamSqlRelUtils.toPCollection(pipeline, planner.convertToBeamRel(query));

Review comment:
This is not the parser, this is the function to convert from a BeamRelNode to a Beam Pipeline. I think you are intending to comment 3 or 4 lines up?

Issue Time Tracking
---
Worklog Id: (was: 113339)
Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113318=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113318 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 18:46
Start Date: 19/Jun/18 18:46
Worklog Time Spent: 10m
Work Description: akedin commented on a change in pull request #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#discussion_r196538143

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlEnv.java ##

@@ -102,10 +102,9 @@ public void registerUdaf(String functionName, Combine.CombineFn combineFn) {
     defaultSchema.add(functionName, new UdafImpl(combineFn));
   }

-  public PTransform> parseQuery(String query)
-      throws ParseException {
+  public PCollection parseQuery(Pipeline pipeline, String query) throws ParseException {
     try {
-      return planner.convertToBeamRel(query).toPTransform();
+      return BeamSqlRelUtils.toPCollection(pipeline, planner.convertToBeamRel(query));

Review comment:
Why does the parser need to know about the pipeline? Can this logic happen outside of `BeamSqlEnv`? I would rather have another abstraction returned from `parseQuery` instead of passing mutable global state into some black box. For example, I am thinking we could return something like `SqlTransform` here, if we could construct it with an instance of the current `BeamSqlEnv`.

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 113318)
Time Spent: 1.5h (was: 1h 20m)
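Editor's sketch of the alternative raised in the review comment above: `parseQuery` returns a deferred object, and the pipeline is only touched when that object is expanded. This is a minimal illustration under stated assumptions, not the Beam implementation. `FakePipeline`, `QueryTransform`, and `Env` are hypothetical stand-ins written in plain Java; none of them are Beam or Calcite classes, and the "plan" is just a string.

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredQuerySketch {

  /** Hypothetical stand-in for a pipeline: mutable, records every applied step. */
  public static final class FakePipeline {
    public final List<String> steps = new ArrayList<>();
  }

  /**
   * Hypothetical deferred transform. It only holds the parsed plan;
   * the pipeline is mutated when expand() is called, not at parse time.
   */
  public static final class QueryTransform {
    private final String plan;

    QueryTransform(String plan) {
      this.plan = plan;
    }

    public String expand(FakePipeline pipeline) {
      pipeline.steps.add(plan);
      return "output of " + plan;
    }
  }

  /** Hypothetical environment: parsing needs no pipeline argument. */
  public static final class Env {
    public QueryTransform parseQuery(String query) {
      // Assumption: the "plan" here is a placeholder string, not a real rel tree.
      return new QueryTransform("plan(" + query.trim() + ")");
    }
  }

  public static void main(String[] args) {
    Env env = new Env();
    QueryTransform transform = env.parseQuery("SELECT 1");

    FakePipeline pipeline = new FakePipeline();
    System.out.println(pipeline.steps.size()); // parsing did not touch the pipeline
    System.out.println(transform.expand(pipeline));
    System.out.println(pipeline.steps);
  }
}
```

The point of the shape is that the caller decides when and where the pipeline is mutated, which is the concern the reviewer raises about passing it into the parser.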
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113258 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 17:14
Start Date: 19/Jun/18 17:14
Worklog Time Spent: 10m
Work Description: amaliujia commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398476216

Seems like checks were not triggered?

Issue Time Tracking
---
Worklog Id: (was: 113258)
Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113250=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113250 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 16:56
Start Date: 19/Jun/18 16:56
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398470312

R: @kennknowles
cc: @akedin @amaliujia @XuMingmin @xumingming

Issue Time Tracking
---
Worklog Id: (was: 113250)
Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113241=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113241 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 16:54
Start Date: 19/Jun/18 16:54
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398469948

run java postcommit

Issue Time Tracking
---
Worklog Id: (was: 113241)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=113239=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-113239 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 19/Jun/18 16:48
Start Date: 19/Jun/18 16:48
Worklog Time Spent: 10m
Work Description: apilloud removed a comment on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398231355

run java postcommit

Issue Time Tracking
---
Worklog Id: (was: 113239)
Time Spent: 50m (was: 40m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=112952=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112952 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 18/Jun/18 23:56
Start Date: 18/Jun/18 23:56
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398231355

run java postcommit

Issue Time Tracking
---
Worklog Id: (was: 112952)
Time Spent: 40m (was: 0.5h)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=112951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112951 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 18/Jun/18 23:56
Start Date: 18/Jun/18 23:56
Worklog Time Spent: 10m
Work Description: apilloud removed a comment on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398198084

run java postcommit

Issue Time Tracking
---
Worklog Id: (was: 112951)
Time Spent: 0.5h (was: 20m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=112915=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112915 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 18/Jun/18 21:14
Start Date: 18/Jun/18 21:14
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673#issuecomment-398198084

run java postcommit

Issue Time Tracking
---
Worklog Id: (was: 112915)
Time Spent: 20m (was: 10m)
[jira] [Work logged] (BEAM-4575) Beam SQL should cleanly transform graph from Calcite
[ https://issues.apache.org/jira/browse/BEAM-4575?focusedWorklogId=112899=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-112899 ]

ASF GitHub Bot logged work on BEAM-4575:

Author: ASF GitHub Bot
Created on: 18/Jun/18 20:31
Start Date: 18/Jun/18 20:31
Worklog Time Spent: 10m
Work Description: apilloud opened a new pull request #5673: [BEAM-4575] Cleanly transform graph from Calcite to Beam SQL
URL: https://github.com/apache/beam/pull/5673

We should transform the Calcite Rel graph into a Beam pipeline graph with a clean mapping between the two.

Follow this checklist to help us incorporate your contribution quickly and easily:

- [X] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

It will help us expedite review of your Pull Request if you tag someone (e.g. `@username`) to look at it.

Issue Time Tracking
---
Worklog Id: (was: 112899)
Time Spent: 10m
Remaining Estimate: 0h
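Editor's sketch of the mapping the issue and the pull request description ask for: each node of the rel tree is converted exactly once, and its step is applied onto the outputs of its inputs, so a node shared by two consumers appears in the resulting graph one time. This is an illustration under stated assumptions and not the code from PR #5673. `RelNode`, `Output`, and `Converter` are hypothetical plain-Java stand-ins, not Calcite or Beam types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class RelGraphSketch {

  /** Hypothetical stand-in for a relational plan node: a name plus its inputs. */
  public static final class RelNode {
    public final String name;
    public final List<RelNode> inputs;

    public RelNode(String name, RelNode... inputs) {
      this.name = name;
      this.inputs = Arrays.asList(inputs);
    }
  }

  /** Hypothetical stand-in for a node's output collection in the pipeline graph. */
  public static final class Output {
    public final String producedBy;
    public final List<Output> upstream;

    Output(String producedBy, List<Output> upstream) {
      this.producedBy = producedBy;
      this.upstream = upstream;
    }
  }

  /** Converts a rel tree bottom-up, memoizing by node identity. */
  public static final class Converter {
    private final Map<RelNode, Output> cache = new IdentityHashMap<>();
    public final List<String> appliedSteps = new ArrayList<>();

    public Output convert(RelNode node) {
      Output cached = cache.get(node);
      if (cached != null) {
        return cached; // already in the graph: reuse, do not apply again
      }
      List<Output> inputs = new ArrayList<>();
      for (RelNode input : node.inputs) {
        inputs.add(convert(input));
      }
      // "Apply" this node's step onto the outputs of its inputs.
      appliedSteps.add(node.name);
      Output output = new Output(node.name, inputs);
      cache.put(node, output);
      return output;
    }
  }

  public static void main(String[] args) {
    RelNode scan = new RelNode("scan");
    RelNode filter = new RelNode("filter", scan);
    RelNode project = new RelNode("project", scan);
    RelNode join = new RelNode("join", filter, project);

    Converter converter = new Converter();
    converter.convert(join);
    System.out.println(converter.appliedSteps); // prints [scan, filter, project, join]
  }
}
```

In the example, `scan` feeds both `filter` and `project` but is applied only once, and both consumers receive the same `Output` instance.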