[GitHub] [beam] y1chi opened a new pull request #12096: Change dataflow runner to reify windowed value for batch stateful dof…

2020-06-25 Thread GitBox


y1chi opened a new pull request #12096:
URL: https://github.com/apache/beam/pull/12096


   …n and restore timestamp
   
   This should at least allow the batch stateful dofn to access the original 
timestamp of the windowed value.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark
   --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https

[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445974593



##
File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamComplexTypeTest.java
##
@@ -452,21 +412,32 @@ public void testNullDatetimeFields() {
 .addNullableField("year_with_null", FieldType.INT64)
 .addField("mm", FieldType.INT64)
 .addNullableField("month_with_null", FieldType.INT64)
-.addField("time_with_hour_added", FieldType.DATETIME)

Review comment:
   Yeah. But it seems nothing wrong with these test cases. I think we could 
just leave it here right now and update them when we implementing TIMESTAMP 
type translation.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445971273



##
File path: 
sdks/java/extensions/sql/zetasql/src/test/java/org/apache/beam/sdk/extensions/sql/zetasql/ZetaSQLDialectSpecTest.java
##
@@ -2457,6 +2458,256 @@ public void testDateFromUnixInt64() {
 
pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
   }
 
+  /
+  // TIME type tests
+  /
+
+  @Test
+  public void testTimeLiteral() {
+String sql = "SELECT TIME '15:30:00'";

Review comment:
   TIME literal for millisecond and microsecond precision tests added. 
Nanosecond precision are not supported at this point.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445972137



##
File path: 
sdks/java/extensions/sql/zetasql/src/test/java/org/apache/beam/sdk/extensions/sql/zetasql/ZetaSQLDialectSpecTest.java
##
@@ -2457,6 +2458,256 @@ public void testDateFromUnixInt64() {
 
pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
   }
 
+  /
+  // TIME type tests
+  /
+
+  @Test
+  public void testTimeLiteral() {
+String sql = "SELECT TIME '15:30:00'";
+
+ZetaSQLQueryPlanner zetaSQLQueryPlanner = new ZetaSQLQueryPlanner(config);
+BeamRelNode beamRelNode = zetaSQLQueryPlanner.convertToBeamRel(sql);
+PCollection stream = BeamSqlRelUtils.toPCollection(pipeline, 
beamRelNode);
+
+PAssert.that(stream)
+.containsInAnyOrder(
+Row.withSchema(Schema.builder().addLogicalTypeField("f_time", 
SqlTypes.TIME).build())
+.addValues(LocalTime.of(15, 30, 0, 0))
+.build());
+
pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
+  }
+
+  @Test
+  public void testTimeColumn() {
+String sql = "SELECT FORMAT_TIME('%T', time_field) FROM table_with_time";
+
+ZetaSQLQueryPlanner zetaSQLQueryPlanner = new ZetaSQLQueryPlanner(config);
+BeamRelNode beamRelNode = zetaSQLQueryPlanner.convertToBeamRel(sql);
+PCollection stream = BeamSqlRelUtils.toPCollection(pipeline, 
beamRelNode);
+
+PAssert.that(stream)
+.containsInAnyOrder(
+
Row.withSchema(Schema.builder().addStringField("f_time_str").build())
+.addValues("15:30:00")
+.build(),
+
Row.withSchema(Schema.builder().addStringField("f_time_str").build())
+.addValues("23:35:59")
+.build());
+
pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
+  }
+
+  // TODO[BEAM-9166]: Add a test for CURRENT_TIME function ("SELECT 
CURRENT_TIME()")
+
+  @Test
+  public void testExtractTime() {
+String sql =
+"SELECT "
++ "EXTRACT(HOUR FROM TIME '15:30:35') as hour, "
++ "EXTRACT(MINUTE FROM TIME '15:30:35') as minute, "
++ "EXTRACT(SECOND FROM TIME '15:30:35') as second, "
++ "EXTRACT(MILLISECOND FROM TIME '15:30:35') as millisecond, "
++ "EXTRACT(MICROSECOND FROM TIME '15:30:35') as microsecond ";

Review comment:
   Extract for millisecond and microsecond precision tests added. 
Nanosecond precision are not supported at this point.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445971273



##
File path: 
sdks/java/extensions/sql/zetasql/src/test/java/org/apache/beam/sdk/extensions/sql/zetasql/ZetaSQLDialectSpecTest.java
##
@@ -2457,6 +2458,256 @@ public void testDateFromUnixInt64() {
 
pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
   }
 
+  /
+  // TIME type tests
+  /
+
+  @Test
+  public void testTimeLiteral() {
+String sql = "SELECT TIME '15:30:00'";

Review comment:
   Tests with sub-second components added, including:
- millisecond
- microsecond
   
   Nanosecond precision are not supported for TIME type at this point.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445971273



##
File path: 
sdks/java/extensions/sql/zetasql/src/test/java/org/apache/beam/sdk/extensions/sql/zetasql/ZetaSQLDialectSpecTest.java
##
@@ -2457,6 +2458,256 @@ public void testDateFromUnixInt64() {
 
pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
   }
 
+  /
+  // TIME type tests
+  /
+
+  @Test
+  public void testTimeLiteral() {
+String sql = "SELECT TIME '15:30:00'";

Review comment:
   Tests with sub-second components added, including:
- Millisecond
- Microsecond
   
   Nanosecond precision are not supported for TIME type at this point.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445971273



##
File path: 
sdks/java/extensions/sql/zetasql/src/test/java/org/apache/beam/sdk/extensions/sql/zetasql/ZetaSQLDialectSpecTest.java
##
@@ -2457,6 +2458,256 @@ public void testDateFromUnixInt64() {
 
pipeline.run().waitUntilFinish(Duration.standardMinutes(PIPELINE_EXECUTION_WAITTIME_MINUTES));
   }
 
+  /
+  // TIME type tests
+  /
+
+  @Test
+  public void testTimeLiteral() {
+String sql = "SELECT TIME '15:30:00'";

Review comment:
   Tests with sub-second components added, including:
- MILLISECOND
- MICROSECOND
   
   NANOSECOND precision are not supported for TIME type at this point.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on pull request #12054:
URL: https://github.com/apache/beam/pull/12054#issuecomment-649971677


   > This 
([robinyqiu@06c49a1](https://github.com/robinyqiu/beam/commit/06c49a15ba36f23cf3dac54669d5d3fa1f5873c2))
 should fix the precision issue with TIME literal. Please patch it to your PR.
   > 
   > The default ZetaSQL analyzer setup supports microsecond precision (if 
wants nanoseconds you need to set an language option). I believe it's enough 
for us to support microsecond precision. Other system like BigQuery also 
support microsecond.
   
   Fixed. Thanks for figuring out the solution.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445965828



##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java
##
@@ -315,34 +317,38 @@ private static Expression castOutput(Expression value, 
FieldType toType) {
   private static Expression castOutputTime(Expression value, FieldType toType) 
{
 Expression valueDateTime = value;
 
-// First, convert to millis (except for DATE type)
+// Convert TIMESTAMP to joda Instant

Review comment:
   Done. Thanks.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445965109



##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java
##
@@ -431,11 +437,11 @@ private static Expression value(
 private static Expression value(Expression value, Schema.FieldType type) {
   if (type.getTypeName().isLogicalType()) {
 String logicalId = type.getLogicalType().getIdentifier();
-if (TimeType.IDENTIFIER.equals(logicalId)) {
+if (SqlTypes.TIME.getIdentifier().equals(logicalId)) {
   return nullOr(
-  value, Expressions.convert_(Expressions.call(value, 
"getMillis"), int.class));
+  value, Expressions.divide(value, 
Expressions.constant(NANOS_PER_MILLISECOND)));
 } else if (SqlTypes.DATE.getIdentifier().equals(logicalId)) {
-  value = nullOr(value, value);
+  return nullOr(value, value);

Review comment:
   Yeah. It works if we just return value here. So I have changed it to 
make it simpler.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ZijieSong946 commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


ZijieSong946 commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445965109



##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java
##
@@ -431,11 +437,11 @@ private static Expression value(
 private static Expression value(Expression value, Schema.FieldType type) {
   if (type.getTypeName().isLogicalType()) {
 String logicalId = type.getLogicalType().getIdentifier();
-if (TimeType.IDENTIFIER.equals(logicalId)) {
+if (SqlTypes.TIME.getIdentifier().equals(logicalId)) {
   return nullOr(
-  value, Expressions.convert_(Expressions.call(value, 
"getMillis"), int.class));
+  value, Expressions.divide(value, 
Expressions.constant(NANOS_PER_MILLISECOND)));
 } else if (SqlTypes.DATE.getIdentifier().equals(logicalId)) {
-  value = nullOr(value, value);
+  return nullOr(value, value);

Review comment:
   Yeah. It works if we just return value here.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

2020-06-25 Thread GitBox


udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445942319



##
File path: sdks/python/apache_beam/typehints/typehints_test_py3.py
##
@@ -46,11 +51,59 @@ class MyDoFn(DoFn):
   def process(self, element: int) -> Iterable[str]:
 pass
 
-print(MyDoFn().get_type_hints())
 th = MyDoFn().get_type_hints()
 self.assertEqual(th.input_types, ((int, ), {}))
 self.assertEqual(th.output_types, ((str, ), {}))
 
 
+class TestPTransformAnnotations(unittest.TestCase):
+  def test_pep484_annotations(self):
+class MyPTransform(PTransform):
+  def expand(self, pcoll: PCollection[int]) -> PCollection[str]:
+return pcoll | Map(lambda num: str(num))
+
+th = MyPTransform().get_type_hints()
+self.assertEqual(th.input_types, ((int, ), {}))
+self.assertEqual(th.output_types, ((str, ), {}))
+
+  def test_annotations_without_pcollection_wrapper(self):
+class MyPTransform(PTransform):
+  def expand(self, pcoll: int) -> str:
+return pcoll | Map(lambda num: str(num))
+
+error_str = 'An input typehint to a PTransform must be a single (or 
nested) type wrapped by a PCollection or PBegin. '
+
+with self.assertRaisesRegex(TypeCheckError, error_str):

Review comment:
   Probably because there are unescaped regex control characters in that 
string: `().`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] udim commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

2020-06-25 Thread GitBox


udim commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-649924273


   > The two functions **strip_pcoll_input** and **strip_pcoll_output_** are 
very similar. Could be refactored into one function. What do you think
   
   Yeah, they do look similar enough to merge. It's your call. If they do get 
much longer though they probably should be merged or using a shared method.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] udim commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

2020-06-25 Thread GitBox


udim commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445941065



##
File path: sdks/python/apache_beam/typehints/decorators.py
##
@@ -378,6 +379,61 @@ def has_simple_output_type(self):
 self.output_types and len(self.output_types[0]) == 1 and
 not self.output_types[1])
 
+  def strip_pcoll_input(self):
+# type: () -> IOTypeHints
+
+error_str = 'An input typehint to a PTransform must be a single (or 
nested) type wrapped by a PCollection or ' \
+'PBegin. '
+
+if any(element is None for element in [self.input_types, 
self.input_types[0], self.input_types[0][0]]):

Review comment:
   This line will raise an exception when self.input_types is None, 
compared to:
   ```py
   if self.input_types is None or len(self.input_types[0]) != 1:
   ```
   It doesn't take advantage of short-circuit evaluation.
   
   `self.input_types = None` is a valid value. It means nothing was set. (We 
should get rid of that and always have ((), {}) as the input/output_types value 
if there are no type hints, but I haven't had the chance to do that.)
   Same goes for having zero type hints: it's valid and there's nothing to do.
   
   Might be useful to reuse: `self._has_input_types`, 
`self.has_simple_output_type`.
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pulasthi commented on pull request #10888: [BEAM-7304] Twister2 Beam runner

2020-06-25 Thread GitBox


pulasthi commented on pull request #10888:
URL: https://github.com/apache/beam/pull/10888#issuecomment-649920371


   @iemejia Just made the changes, Sorry about the delay, Let me know if 
anything else is needed from my end, I will start work on the document PR's



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pulasthi commented on a change in pull request #10888: [BEAM-7304] Twister2 Beam runner

2020-06-25 Thread GitBox


pulasthi commented on a change in pull request #10888:
URL: https://github.com/apache/beam/pull/10888#discussion_r445938844



##
File path: 
runners/twister2/src/main/java/org/apache/beam/runners/twister2/Twister2LegacyRunner.java
##
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.twister2;
+
+import static 
org.apache.beam.runners.core.construction.resources.PipelineResources.detectClassPathResourcesToStage;
+
+import edu.iu.dsc.tws.api.JobConfig;
+import edu.iu.dsc.tws.api.Twister2Job;
+import edu.iu.dsc.tws.api.config.Config;
+import edu.iu.dsc.tws.api.driver.DriverJobState;
+import edu.iu.dsc.tws.api.exceptions.Twister2RuntimeException;
+import edu.iu.dsc.tws.api.scheduler.Twister2JobState;
+import edu.iu.dsc.tws.api.tset.sets.TSet;
+import edu.iu.dsc.tws.api.tset.sets.batch.BatchTSet;
+import edu.iu.dsc.tws.local.LocalSubmitter;
+import edu.iu.dsc.tws.rsched.core.ResourceAllocator;
+import edu.iu.dsc.tws.rsched.job.Twister2Submitter;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.FileNotFoundException;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.logging.LogManager;
+import java.util.logging.Logger;
+import java.util.stream.Collectors;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipOutputStream;
+import org.apache.beam.runners.core.construction.PTransformMatchers;
+import org.apache.beam.runners.core.construction.PTransformTranslation;
+import org.apache.beam.runners.core.construction.SplittableParDo;
+import org.apache.beam.runners.core.construction.SplittableParDoNaiveBounded;
+import org.apache.beam.runners.core.construction.resources.PipelineResources;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.PipelineResult;
+import org.apache.beam.sdk.PipelineRunner;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsValidator;
+import org.apache.beam.sdk.runners.PTransformOverride;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.MoreObjects;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+
+/**
+ * A {@link PipelineRunner} that executes the operations in the pipeline by 
first translating them
+ * to a Twister2 Plan and then executing them either locally or on a Twister2 
cluster, depending on
+ * the configuration.
+ */
+public class Twister2LegacyRunner extends PipelineRunner {

Review comment:
   Renamed it to Twister2Runner





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] henryken commented on a change in pull request #12066: [BEAM-9679] Add Side Output to Core Transform Go SDK katas

2020-06-25 Thread GitBox


henryken commented on a change in pull request #12066:
URL: https://github.com/apache/beam/pull/12066#discussion_r445934652



##
File path: learning/katas/go/Core Transforms/Side Output/Side 
Output/pkg/task/task.go
##
@@ -0,0 +1,30 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package task
+
+import (
+   "github.com/apache/beam/sdks/go/pkg/beam"
+)
+
+func ApplyTransform(s beam.Scope, input beam.PCollection) (beam.PCollection, 
beam.PCollection) {
+   return beam.ParDo2(s, func(element int, numBelow100, numAbove100 
func(int)) {
+   if element <= 100 {
+   numBelow100(element)
+   return
+   }
+   numAbove100(element)

Review comment:
   While this works too, maybe you would want to consider using `else` 
statement.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] udim commented on a change in pull request #12088: Add output typehints to GroupIntoBatches

2020-06-25 Thread GitBox


udim commented on a change in pull request #12088:
URL: https://github.com/apache/beam/pull/12088#discussion_r445927300



##
File path: sdks/python/apache_beam/transforms/util.py
##
@@ -741,6 +741,7 @@ def WithKeys(pcoll, k):
 
 @experimental()
 @typehints.with_input_types(Tuple[K, V])
+@typehints.with_output_types(Tuple[K, List[V]])

Review comment:
   The Java implementation splits (key, value) pairs.
   
https://github.com/apache/beam/blob/eaa41cc4cbcc4f94d0ec1a36ff2b0f3fcee962f9/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/GroupIntoBatches.java#L173-L174
   
   I don't see that in Python - is the runner supposed to do that?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] udim commented on a change in pull request #12088: Add output typehints to GroupIntoBatches

2020-06-25 Thread GitBox


udim commented on a change in pull request #12088:
URL: https://github.com/apache/beam/pull/12088#discussion_r445925136



##
File path: sdks/python/apache_beam/transforms/util.py
##
@@ -741,6 +741,7 @@ def WithKeys(pcoll, k):
 
 @experimental()
 @typehints.with_input_types(Tuple[K, V])
+@typehints.with_output_types(Tuple[K, List[V]])

Review comment:
   The test passes because 2 batches/elements are output. And I was running 
the test via pytest as usual.
   
   I don't follow the state api reasoning, and this is not documented in 
GroupIntoBatches. If you get a different result in DirectRunner vs 
DataflowRunner, it's a bug.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12067: [BEAM-10308] Make component ID assignments consistent across PipelineContext instances

2020-06-25 Thread GitBox


TheNeuralBit commented on pull request #12067:
URL: https://github.com/apache/beam/pull/12067#issuecomment-649898941


   I'll see if I can add a simple unit test tomorrow.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on a change in pull request #12090: Added schema aware abstraction SchemaIO and implemented for pubsub

2020-06-25 Thread GitBox


TheNeuralBit commented on a change in pull request #12090:
URL: https://github.com/apache/beam/pull/12090#discussion_r445908191



##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/pubsub/PubsubJsonTableProvider.java
##
@@ -47,32 +44,40 @@
 @Experimental
 @AutoService(TableProvider.class)
 public class PubsubJsonTableProvider extends InMemoryMetaTableProvider {
+  static final String TIMESTAMP_FIELD = "event_timestamp";
+  static final String ATTRIBUTES_FIELD = "attributes";
+  static final String PAYLOAD_FIELD = "payload";
 
   @Override
   public String getTableType() {
 return "pubsub";
   }
 
   @Override
-  public BeamSqlTable buildBeamSqlTable(Table tableDefintion) {
-JSONObject tableProperties = tableDefintion.getProperties();
+  public BeamSqlTable buildBeamSqlTable(Table tableDefinition) {
+JSONObject tableProperties = tableDefinition.getProperties();
 String timestampAttributeKey = 
tableProperties.getString("timestampAttributeKey");
 String deadLetterQueue = tableProperties.getString("deadLetterQueue");
-validateDlq(deadLetterQueue);
 
-Schema schema = tableDefintion.getSchema();
+Schema schema = tableDefinition.getSchema();
+String location = tableDefinition.getLocation();
+Schema dataSchema = tableDefinition.getSchema();
+
+validateDlq(deadLetterQueue);
 validateEventTimestamp(schema);

Review comment:
   Could you move this validation to the new implementation (probably in 
the `from` method)? We'll want to make sure we still do the validation when 
this is used outside of SQL (e.g. for cross-language).

##
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaIO.java
##
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.pubsub;
+
+import static org.apache.beam.sdk.io.gcp.pubsub.PubsubMessageToRow.DLQ_TAG;
+import static org.apache.beam.sdk.io.gcp.pubsub.PubsubMessageToRow.MAIN_TAG;
+
+import java.io.Serializable;
+import org.apache.beam.sdk.annotations.Internal;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.io.SchemaIO;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionTuple;
+import org.apache.beam.sdk.values.POutput;
+import org.apache.beam.sdk.values.Row;
+
+/** An abstraction to create schema aware IOs. */
+@Internal
+public class PubsubSchemaIO implements SchemaIO, Serializable {

Review comment:
   I think we should make this a `private static` inner class within 
`PubsubSchemaCapableIOProvider` so no one tries to use it on its own.

##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/package-info.java
##
@@ -0,0 +1,20 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/** Provides a new schema aware IO abstraction interface. */

Review comment:
   nit: it won't be new forever :)
   ```suggestion
   /** Provides abstractions for schema-aware IOs. */
   ```

##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/pubsub/PubsubJsonTableProvider.java
##
@@ -47,32 +44,40 @@
 @Experimental
 @AutoService(TableProvider.class)
 public class PubsubJsonTable

[GitHub] [beam] epicfaace commented on pull request #11824: [BEAM-10101] Add HttpIO / HttpFileSystem (Python)

2020-06-25 Thread GitBox


epicfaace commented on pull request #11824:
URL: https://github.com/apache/beam/pull/11824#issuecomment-649895785


   Oh, it appears the test failures are only on Python 2 -- perhaps I didn't 
define the class properly for Python 2



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12085: [BEAM-10318] fix uninitialized grpc_server in FnApiRunner

2020-06-25 Thread GitBox


chamikaramj commented on pull request #12085:
URL: https://github.com/apache/beam/pull/12085#issuecomment-649895565







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12010: [BEAM-10259] Use ref-counted connection to Spanner to prevent multiple connections.

2020-06-25 Thread GitBox


chamikaramj commented on pull request #12010:
URL: https://github.com/apache/beam/pull/12010#issuecomment-649895281


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ajamato commented on pull request #12083: [BEAM-10317] Java - Update BigQueryIO to tag BigQuery Jobs with the Dataflow Job ID

2020-06-25 Thread GitBox


ajamato commented on pull request #12083:
URL: https://github.com/apache/beam/pull/12083#issuecomment-649889368


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ajamato commented on pull request #12084: [BEAM-10317] Python - Update BigQueryIO to tag BigQuery Jobs with the Dataflow Job ID

2020-06-25 Thread GitBox


ajamato commented on pull request #12084:
URL: https://github.com/apache/beam/pull/12084#issuecomment-649888199


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


chamikaramj commented on pull request #12076:
URL: https://github.com/apache/beam/pull/12076#issuecomment-649888111


   Closing this to avoid confusion.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj closed pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


chamikaramj closed pull request #12076:
URL: https://github.com/apache/beam/pull/12076


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #11967: [BEAM-9992] | use Sets transform in BeamSQL

2020-06-25 Thread GitBox


aaltay commented on pull request #11967:
URL: https://github.com/apache/beam/pull/11967#issuecomment-649888075


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12067: [BEAM-10308] Make component ID assignments consistent across PipelineContext instances

2020-06-25 Thread GitBox


chamikaramj commented on pull request #12067:
URL: https://github.com/apache/beam/pull/12067#issuecomment-649888003


   Do you think we can add another unit test that shows how current state will 
result in an invalid proto during cross-language expansion ? That will help us 
understand the scope of the issue.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


chamikaramj commented on pull request #12076:
URL: https://github.com/apache/beam/pull/12076#issuecomment-649887627


   Ah, sorry about that. Will look at the other one.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] angoenka commented on a change in pull request #12032: [BEAM-10275] Don't eagerly parse pipeline options in sdk_worker_main.py

2020-06-25 Thread GitBox


angoenka commented on a change in pull request #12032:
URL: https://github.com/apache/beam/pull/12032#discussion_r445910583



##
File path: sdks/python/apache_beam/runners/worker/sdk_worker_main_test.py
##
@@ -86,6 +92,20 @@ def test_parse_pipeline_options(self):
  '{"options": {"eam:option:m_option:v":"mock_val"}}').
  get_all_options(drop_default=True))
 
+  @unittest.skip(
+  'BEAM-10274: type=json.loads pipeline options cannot be parsed in the '
+  'python SDK harness')
+  def test_parse_json_object(self):

Review comment:
   Should we enable this tests after the fix.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] youngoli commented on pull request #12095: [BEAM-10311] Sharing restriction trackers between SDF and DataSource.

2020-06-25 Thread GitBox


youngoli commented on pull request #12095:
URL: https://github.com/apache/beam/pull/12095#issuecomment-649885158


   R: @lostluck 
   
   Ended up being a really small PR. I tried to think of a way to test it (like 
testing that the channel gets set in the DataSource maybe), but it seems too 
specific of an interaction to test, and any test for it would be a bit 
complicated.
   
   For context of what I'm intending when it comes to the actual splitting: The 
idea is the DataSource will receive the restriction tracker from the channel 
when it needs to check progress or perform a split, and then return the 
restriction tracker to the channel when done with it. The goal is to make sure 
there's only one copy of that tracker being sent around, so that nothing else 
is using the tracker when processing is done and the tracker is getting deleted.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12090: Added schema aware abstraction SchemaIO and implemented for pubsub

2020-06-25 Thread GitBox


TheNeuralBit commented on pull request #12090:
URL: https://github.com/apache/beam/pull/12090#issuecomment-649884063


   Run SQL PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] youngoli opened a new pull request #12095: [BEAM-10311] Sharing restriction trackers between SDF and DataSource.

2020-06-25 Thread GitBox


youngoli opened a new pull request #12095:
URL: https://github.com/apache/beam/pull/12095


   Adding a channel that can be used to retrieve the current restriction
   tracker from the DataSource.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark
   --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_Post

[GitHub] [beam] TheNeuralBit commented on pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


TheNeuralBit commented on pull request #12076:
URL: https://github.com/apache/beam/pull/12076#issuecomment-649880575


   @chamikaramj unfortunately I don't think this PR will work. It breaks a lot 
of tests that rely on components getting a specific component id.
   
   Sorry for some reason I broke this up into two PRs, I left a comment to that 
effect in the other one: 
https://github.com/apache/beam/pull/12067#issuecomment-649046980
   
   I don't think something like this will work unless I can scope the id 
cache/assigner to the pipeline.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12082: Standardizing BigQuery job names in Beam Python and Java SDKs

2020-06-25 Thread GitBox


pabloem commented on pull request #12082:
URL: https://github.com/apache/beam/pull/12082#issuecomment-649880288


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on a change in pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


TheNeuralBit commented on a change in pull request #12076:
URL: https://github.com/apache/beam/pull/12076#discussion_r445899639



##
File path: sdks/python/apache_beam/transforms/sql_test.py
##
@@ -157,6 +157,21 @@ def test_zetasql_generate_data(self):
   dialect="zetasql")
   assert_that(out, equal_to([(1, "foo", 3.14)]))
 
+  def test_windowing_before_sql(self):
+with TestPipeline() as p:
+  windowed = (
+  p | beam.Create([
+  SimpleRow(5, "foo", 1.),
+  SimpleRow(15, "bar", 2.),
+  SimpleRow(25, "baz", 3.)
+  ])
+  | beam.Map(lambda v: beam.window.TimestampedValue(v, v.id)).
+  with_output_types(SimpleRow)
+  | beam.WindowInto(
+  beam.window.FixedWindows(10)).with_output_types(SimpleRow)
+  | SqlTransform("SELECT COUNT(*) as `count` FROM PCOLLECTION"))
+  assert_that(out, equal_to([(1, ), (1, ), (1, )]))

Review comment:
   Before the fix it replicates the failure here: 
https://issues.apache.org/jira/browse/BEAM-10143
   
   It fails at execution time because it expects a global window, but receives 
a (correct) interval window





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] darshanj commented on pull request #11967: [BEAM-9992] | use Sets transform in BeamSQL

2020-06-25 Thread GitBox


darshanj commented on pull request #11967:
URL: https://github.com/apache/beam/pull/11967#issuecomment-649873766


   They are not related. Can you please retest this



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12082: Standardizing BigQuery job names in Beam Python and Java SDKs

2020-06-25 Thread GitBox


pabloem commented on pull request #12082:
URL: https://github.com/apache/beam/pull/12082#issuecomment-649873640







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] damgad commented on pull request #12072: Adding Jenkins job cleaning the /tmp directory on Jenkins node machine

2020-06-25 Thread GitBox


damgad commented on pull request #12072:
URL: https://github.com/apache/beam/pull/12072#issuecomment-649872679


   Works like a charm ;) 
https://ci-beam.apache.org/job/beam_Clean_tmp_directory/3/console



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robinyqiu commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


robinyqiu commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445897849



##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java
##
@@ -431,11 +437,11 @@ private static Expression value(
 private static Expression value(Expression value, Schema.FieldType type) {
   if (type.getTypeName().isLogicalType()) {
 String logicalId = type.getLogicalType().getIdentifier();
-if (TimeType.IDENTIFIER.equals(logicalId)) {
+if (SqlTypes.TIME.getIdentifier().equals(logicalId)) {
   return nullOr(
-  value, Expressions.convert_(Expressions.call(value, 
"getMillis"), int.class));
+  value, Expressions.divide(value, 
Expressions.constant(NANOS_PER_MILLISECOND)));
 } else if (SqlTypes.DATE.getIdentifier().equals(logicalId)) {
-  value = nullOr(value, value);
+  return nullOr(value, value);

Review comment:
   `nullOr()` also does boxing for `value` if it is not null. Is it 
necessary in this case? I tried to remove the call to `nullOr()` and it works 
fine as well.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robinyqiu commented on pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


robinyqiu commented on pull request #12054:
URL: https://github.com/apache/beam/pull/12054#issuecomment-649871710


   This 
(https://github.com/robinyqiu/beam/commit/06c49a15ba36f23cf3dac54669d5d3fa1f5873c2)
 should fix the precision issue with TIME literal. Please patch it to your PR.
   
   The default ZetaSQL analyzer setup supports microsecond precision (if wants 
nanoseconds you need to set an language option). I believe it's enough for us 
to support microsecond precision. Other system like BigQuery also support 
microsecond.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


chamikaramj commented on pull request #12076:
URL: https://github.com/apache/beam/pull/12076#issuecomment-649870795


   cc: @robertwb 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


chamikaramj commented on pull request #12076:
URL: https://github.com/apache/beam/pull/12076#issuecomment-649870730


   Also, please make sure that existing x-lang tests pass for this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on a change in pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


chamikaramj commented on a change in pull request #12076:
URL: https://github.com/apache/beam/pull/12076#discussion_r445895416



##
File path: sdks/python/apache_beam/transforms/sql_test.py
##
@@ -157,6 +157,21 @@ def test_zetasql_generate_data(self):
   dialect="zetasql")
   assert_that(out, equal_to([(1, "foo", 3.14)]))
 
+  def test_windowing_before_sql(self):

Review comment:
   Do you think we could come up with a test where PTransforms get assigned 
to wrong IDs during pipeline expansion (resulting in an invalid pipeline proto) 
?

##
File path: sdks/python/apache_beam/transforms/sql_test.py
##
@@ -157,6 +157,21 @@ def test_zetasql_generate_data(self):
   dialect="zetasql")
   assert_that(out, equal_to([(1, "foo", 3.14)]))
 
+  def test_windowing_before_sql(self):
+with TestPipeline() as p:
+  windowed = (
+  p | beam.Create([
+  SimpleRow(5, "foo", 1.),
+  SimpleRow(15, "bar", 2.),
+  SimpleRow(25, "baz", 3.)
+  ])
+  | beam.Map(lambda v: beam.window.TimestampedValue(v, v.id)).
+  with_output_types(SimpleRow)
+  | beam.WindowInto(
+  beam.window.FixedWindows(10)).with_output_types(SimpleRow)
+  | SqlTransform("SELECT COUNT(*) as `count` FROM PCOLLECTION"))
+  assert_that(out, equal_to([(1, ), (1, ), (1, )]))

Review comment:
   Just curios, Is this failing before the fix or does it pass with a wrong 
pipeline ? 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on a change in pull request #12088: Add output typehints to GroupIntoBatches

2020-06-25 Thread GitBox


aaltay commented on a change in pull request #12088:
URL: https://github.com/apache/beam/pull/12088#discussion_r445895986



##
File path: sdks/python/apache_beam/transforms/util.py
##
@@ -741,6 +741,7 @@ def WithKeys(pcoll, k):
 
 @experimental()
 @typehints.with_input_types(Tuple[K, V])
+@typehints.with_output_types(Tuple[K, List[V]])

Review comment:
   Thank you. This might be a bug with whatever runner Udi used (one of the 
direct runners). I am still puzzled by what would be a good implementation of 
this case in batch. Would a naive implementation be always insert a shuffle, 
and use an memory backed state store for the batch case?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on a change in pull request #12076: [BEAM-10143] Add test verifying external windowing

2020-06-25 Thread GitBox


chamikaramj commented on a change in pull request #12076:
URL: https://github.com/apache/beam/pull/12076#discussion_r445894984



##
File path: sdks/python/apache_beam/runners/pipeline_context_test.py
##
@@ -53,6 +53,18 @@ def test_serialization(self):
 self.assertEqual(
 coders.BytesCoder(), context2.coders.get_by_id(bytes_coder_ref))
 
+  def test_ref_assignment_spans_multiple_instances(self):
+context = pipeline_context.PipelineContext()
+float_coder_ref = context.coders.get_id(coders.FloatCoder())
+bytes_coder_ref = context.coders.get_id(coders.BytesCoder())
+context2 = pipeline_context.PipelineContext()
+

Review comment:
   Nit: remove extra line





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kennknowles commented on a change in pull request #12088: Add output typehints to GroupIntoBatches

2020-06-25 Thread GitBox


kennknowles commented on a change in pull request #12088:
URL: https://github.com/apache/beam/pull/12088#discussion_r445894269



##
File path: sdks/python/apache_beam/transforms/util.py
##
@@ -741,6 +741,7 @@ def WithKeys(pcoll, k):
 
 @experimental()
 @typehints.with_input_types(Tuple[K, V])
+@typehints.with_output_types(Tuple[K, List[V]])

Review comment:
   A very real situation is that it might already be partitioned however 
the runner wants, via analyzing the pipeline.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kennknowles commented on a change in pull request #12088: Add output typehints to GroupIntoBatches

2020-06-25 Thread GitBox


kennknowles commented on a change in pull request #12088:
URL: https://github.com/apache/beam/pull/12088#discussion_r445894084



##
File path: sdks/python/apache_beam/transforms/util.py
##
@@ -741,6 +741,7 @@ def WithKeys(pcoll, k):
 
 @experimental()
 @typehints.with_input_types(Tuple[K, V])
+@typehints.with_output_types(Tuple[K, List[V]])

Review comment:
   Correct. It is up to the runner how to implement, but per key and window 
must be executed serially.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #11824: [BEAM-10101] Add HttpIO / HttpFileSystem (Python)

2020-06-25 Thread GitBox


pabloem commented on pull request #11824:
URL: https://github.com/apache/beam/pull/11824#issuecomment-649867747


   okay let me figure out why that's happening...



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] damondouglas commented on pull request #12066: [BEAM-9679] Add Side Output to Core Transform Go SDK katas

2020-06-25 Thread GitBox


damondouglas commented on pull request #12066:
URL: https://github.com/apache/beam/pull/12066#issuecomment-649865228


   No problem at all.  Please take your time and thank you.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12083: [BEAM-10317] Java - Update BigQueryIO to tag BigQuery Jobs with the Dataflow Job ID

2020-06-25 Thread GitBox


pabloem commented on pull request #12083:
URL: https://github.com/apache/beam/pull/12083#issuecomment-649864733


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on pull request #12093: [BEAM-10303] Add support for the non-window observing optimization to DoFn execution.

2020-06-25 Thread GitBox


lukecwik commented on pull request #12093:
URL: https://github.com/apache/beam/pull/12093#issuecomment-649863134


   CC: @reuvenlax 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on pull request #12093: [BEAM-10303] Add support for the non-window observing optimization to DoFn execution.

2020-06-25 Thread GitBox


lukecwik commented on pull request #12093:
URL: https://github.com/apache/beam/pull/12093#issuecomment-649862731







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on a change in pull request #10900: [BEAM-9335] update hard-coded coder id when translating Java external transforms

2020-06-25 Thread GitBox


TheNeuralBit commented on a change in pull request #10900:
URL: https://github.com/apache/beam/pull/10900#discussion_r445880888



##
File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/External.java
##
@@ -222,6 +226,20 @@ public OutputT expand(InputT input) {
   });
   externalPCollectionIdMap = externalPCollectionIdMapBuilder.build();
 
+  Map externalCoderIdMapBuilder = new HashMap<>();
+  expandedComponents
+  .getPcollectionsMap()
+  .forEach(
+  (pcolId, pCol) -> {
+try {
+  Coder coder = 
rehydratedComponents.getCoder(pCol.getCoderId());
+  externalCoderIdMapBuilder.putIfAbsent(coder, 
pCol.getCoderId());
+} catch (IOException e) {
+  throw new RuntimeException("cannot rehydrate Coder.");

Review comment:
   Won't it often be the case that we can't rehydrate the coder, because 
it's specific to another SDK?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] damgad closed pull request #12094: Jenkins rocks

2020-06-25 Thread GitBox


damgad closed pull request #12094:
URL: https://github.com/apache/beam/pull/12094


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] damgad opened a new pull request #12094: Jenkins rocks

2020-06-25 Thread GitBox


damgad opened a new pull request #12094:
URL: https://github.com/apache/beam/pull/12094


   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark
   --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/ic

[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

2020-06-25 Thread GitBox


saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445877050



##
File path: sdks/python/apache_beam/typehints/decorators.py
##
@@ -378,6 +379,61 @@ def has_simple_output_type(self):
 self.output_types and len(self.output_types[0]) == 1 and
 not self.output_types[1])
 
+  def strip_pcoll_input(self):
+# type: () -> IOTypeHints
+
+error_str = 'An input typehint to a PTransform must be a single (or 
nested) type wrapped by a PCollection or ' \
+'PBegin. '
+
+if any(element is None for element in [self.input_types, 
self.input_types[0], self.input_types[0][0]]):

Review comment:
   Now that I think about it this can be a simple try/catch IndexError 
instead of manually checking for None





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] angoenka merged pull request #11993: Change time granularity to seconds in ParDoTest TestStream timer test…

2020-06-25 Thread GitBox


angoenka merged pull request #11993:
URL: https://github.com/apache/beam/pull/11993


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on pull request #12093: [BEAM-10303] Add support for the non-window observing optimization to DoFn execution.

2020-06-25 Thread GitBox


lukecwik commented on pull request #12093:
URL: https://github.com/apache/beam/pull/12093#issuecomment-649850539







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

2020-06-25 Thread GitBox


saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445871780



##
File path: sdks/python/apache_beam/typehints/typehints_test_py3.py
##
@@ -46,11 +51,59 @@ class MyDoFn(DoFn):
   def process(self, element: int) -> Iterable[str]:
 pass
 
-print(MyDoFn().get_type_hints())
 th = MyDoFn().get_type_hints()
 self.assertEqual(th.input_types, ((int, ), {}))
 self.assertEqual(th.output_types, ((str, ), {}))
 
 
+class TestPTransformAnnotations(unittest.TestCase):
+  def test_pep484_annotations(self):
+class MyPTransform(PTransform):
+  def expand(self, pcoll: PCollection[int]) -> PCollection[str]:
+return pcoll | Map(lambda num: str(num))
+
+th = MyPTransform().get_type_hints()
+self.assertEqual(th.input_types, ((int, ), {}))
+self.assertEqual(th.output_types, ((str, ), {}))
+
+  def test_annotations_without_pcollection_wrapper(self):
+class MyPTransform(PTransform):
+  def expand(self, pcoll: int) -> str:
+return pcoll | Map(lambda num: str(num))
+
+error_str = 'An input typehint to a PTransform must be a single (or 
nested) type wrapped by a PCollection or PBegin. '
+
+with self.assertRaisesRegex(TypeCheckError, error_str):

Review comment:
   Any idea why this test is failing? It says the strings don't match in 
stdout but they appear to match.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik opened a new pull request #12093: [BEAM-10303] Add support for the non-window observing optimization to DoFn execution.

2020-06-25 Thread GitBox


lukecwik opened a new pull request #12093:
URL: https://github.com/apache/beam/pull/12093


   This covers all but the splittable DoFn processElements call since I wanted 
to limit the size of the change.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark
   --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/b

[GitHub] [beam] lukecwik commented on pull request #12093: [BEAM-10303] Add support for the non-window observing optimization to DoFn execution.

2020-06-25 Thread GitBox


lukecwik commented on pull request #12093:
URL: https://github.com/apache/beam/pull/12093#issuecomment-649849369


   R: @ibzib 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] saavan-google-intern commented on pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

2020-06-25 Thread GitBox


saavan-google-intern commented on pull request #12009:
URL: https://github.com/apache/beam/pull/12009#issuecomment-649848772


   The two functions **strip_pcoll_input** and **strip_pcoll_output_** are very 
similar. Could be refactored into one function. What do you think



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] damgad commented on pull request #12072: Adding Jenkins job cleaning the /tmp directory on Jenkins node machine

2020-06-25 Thread GitBox


damgad commented on pull request #12072:
URL: https://github.com/apache/beam/pull/12072#issuecomment-649844629


   Sure, I will :+1: 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay merged pull request #12072: Adding Jenkins job cleaning the /tmp directory on Jenkins node machine

2020-06-25 Thread GitBox


aaltay merged pull request #12072:
URL: https://github.com/apache/beam/pull/12072


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #12072: Adding Jenkins job cleaning the /tmp directory on Jenkins node machine

2020-06-25 Thread GitBox


aaltay commented on pull request #12072:
URL: https://github.com/apache/beam/pull/12072#issuecomment-649844411


   Let's merge. Could you test it once it is merged?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] damgad commented on pull request #12072: Adding Jenkins job cleaning the /tmp directory on Jenkins node machine

2020-06-25 Thread GitBox


damgad commented on pull request #12072:
URL: https://github.com/apache/beam/pull/12072#issuecomment-649843232


   @aaltay The cron version (already deployed on the nodes, today) is less 
strict. It executes once a week and deletes files that were not accessed during 
at least two days. That should be fine for most cases, and safer for the 
executing builds.
   
   Here, the mechanism is more powerful, on-demand, and configurable. By 
specifying the `unaccessed_for` to `0` you can even delete all the tmp files. 
   
   I've also considered having only that job with a cron trigger along with the 
manual one. But that's less reliable. In that case, if the node was for some 
reason overloaded or builds were stuck the cleanup wouldn't happen.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] amaliujia opened a new pull request #12092: [DO NOT MERGE] Prototype TPCDS Benchmark for BeamSQL

2020-06-25 Thread GitBox


amaliujia opened a new pull request #12092:
URL: https://github.com/apache/beam/pull/12092


   **Please** add a meaningful description for your change here
   
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark
   --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/b

[GitHub] [beam] saavan-google-intern commented on a change in pull request #12009: [BEAM-10258] Support type hint annotations on PTransform's expand()

2020-06-25 Thread GitBox


saavan-google-intern commented on a change in pull request #12009:
URL: https://github.com/apache/beam/pull/12009#discussion_r445866150



##
File path: sdks/python/apache_beam/typehints/typed_pipeline_test_py3.py
##
@@ -40,6 +40,14 @@ def process(self, element: int) -> typehints.Tuple[str]:
 with self.assertRaisesRegex(typehints.TypeCheckError,
 r'requires.*int.*got.*str'):
   _ = ['a', 'b', 'c'] | beam.ParDo(MyDoFn())
+  def test_pardo_dofn(self):

Review comment:
   That was an accident, thanks





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #12073: [BEAM-9543] Support Match Recognition in Beam SQL

2020-06-25 Thread GitBox


aaltay commented on pull request #12073:
URL: https://github.com/apache/beam/pull/12073#issuecomment-649841664


   > cc @aaltay
   
   Nice, thank you for cc'ing me.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] y1chi commented on pull request #11993: Change time granularity to seconds in ParDoTest TestStream timer test…

2020-06-25 Thread GitBox


y1chi commented on pull request #11993:
URL: https://github.com/apache/beam/pull/11993#issuecomment-649823298


   retest this please



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] apilloud commented on a change in pull request #12054: [BEAM-10219] Support ZetaSQL TIME functions in BeamSQL

2020-06-25 Thread GitBox


apilloud commented on a change in pull request #12054:
URL: https://github.com/apache/beam/pull/12054#discussion_r445832856



##
File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamComplexTypeTest.java
##
@@ -452,21 +412,32 @@ public void testNullDatetimeFields() {
 .addNullableField("year_with_null", FieldType.INT64)
 .addField("mm", FieldType.INT64)
 .addNullableField("month_with_null", FieldType.INT64)
-.addField("time_with_hour_added", FieldType.DATETIME)

Review comment:
   If I remember right, `FieldType.DATETIME` is the SQL type of 
`TIMESTAMP`. Looks like all the test cases for `TIMESTAMP` have been dropped?

##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java
##
@@ -315,34 +317,38 @@ private static Expression castOutput(Expression value, 
FieldType toType) {
   private static Expression castOutputTime(Expression value, FieldType toType) 
{
 Expression valueDateTime = value;
 
-// First, convert to millis (except for DATE type)
+// Convert TIMESTAMP to joda Instant

Review comment:
   nit: Can you move each of these comments into the appropriate if block?

##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java
##
@@ -315,34 +317,38 @@ private static Expression castOutput(Expression value, 
FieldType toType) {
   private static Expression castOutputTime(Expression value, FieldType toType) 
{
 Expression valueDateTime = value;
 
-// First, convert to millis (except for DATE type)
+// Convert TIMESTAMP to joda Instant
+// Convert DATE to LocalDate
+// Convert TIME to LocalTime
 if (CalciteUtils.TIMESTAMP.typesEqual(toType)
 || CalciteUtils.NULLABLE_TIMESTAMP.typesEqual(toType)) {
   if (value.getType() == java.sql.Timestamp.class) {
 valueDateTime = 
Expressions.call(BuiltInMethod.TIMESTAMP_TO_LONG.method, valueDateTime);
   }
+  valueDateTime = Expressions.new_(Instant.class, valueDateTime);
 } else if (CalciteUtils.TIME.typesEqual(toType)
 || CalciteUtils.NULLABLE_TIME.typesEqual(toType)) {
   if (value.getType() == java.sql.Time.class) {
 valueDateTime = Expressions.call(BuiltInMethod.TIME_TO_INT.method, 
valueDateTime);
+  } else if (value.getType() == Long.class) {

Review comment:
   nit: There is no test case for this. 

##
File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java
##
@@ -431,11 +437,11 @@ private static Expression value(
 private static Expression value(Expression value, Schema.FieldType type) {
   if (type.getTypeName().isLogicalType()) {
 String logicalId = type.getLogicalType().getIdentifier();
-if (TimeType.IDENTIFIER.equals(logicalId)) {
+if (SqlTypes.TIME.getIdentifier().equals(logicalId)) {
   return nullOr(
-  value, Expressions.convert_(Expressions.call(value, 
"getMillis"), int.class));
+  value, Expressions.divide(value, 
Expressions.constant(NANOS_PER_MILLISECOND)));
 } else if (SqlTypes.DATE.getIdentifier().equals(logicalId)) {
-  value = nullOr(value, value);
+  return nullOr(value, value);

Review comment:
   This is going to generate the code `value == null ? null : value`, which 
does nothing. drop it all together?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] y1chi commented on a change in pull request #11993: Change time granularity to seconds in ParDoTest TestStream timer test…

2020-06-25 Thread GitBox


y1chi commented on a change in pull request #11993:
URL: https://github.com/apache/beam/pull/11993#discussion_r445836802



##
File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java
##
@@ -3817,24 +3817,15 @@ public void onTimer(
   TestStream> stream =
   TestStream.create(KvCoder.of(StringUtf8Coder.of(), VarIntCoder.of()))
   // See GlobalWindow,
-  // END_OF_GLOBAL_WINDOW is 
TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1)),
-  // minus 775 milliseconds to make the grainularity to second as 
required by dataflow
-  // for TestStream.
-  .advanceWatermarkTo(
-  BoundedWindow.TIMESTAMP_MAX_VALUE
-  .minus(Duration.standardDays(1))
-  .minus(BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis() % 
1000))
+  // END_OF_GLOBAL_WINDOW is 
TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1))
+  
.advanceWatermarkTo(BoundedWindow.TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1)))

Review comment:
   done.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] epicfaace commented on pull request #11824: [BEAM-10101] Add HttpIO / HttpFileSystem (Python)

2020-06-25 Thread GitBox


epicfaace commented on pull request #11824:
URL: https://github.com/apache/beam/pull/11824#issuecomment-649815907


   I'm not sure why exactly those failures are happening, since `HttpIO` _does_ 
take `client` as a constructor.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] epicfaace edited a comment on pull request #11824: [BEAM-10101] Add HttpIO / HttpFileSystem (Python)

2020-06-25 Thread GitBox


epicfaace edited a comment on pull request #11824:
URL: https://github.com/apache/beam/pull/11824#issuecomment-649815907


   I'm not sure why exactly those failures are happening, since `HttpIO` _does_ 
take `client` in its constructor.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] amaliujia commented on pull request #12073: [BEAM-9543] Support Match Recognition in Beam SQL

2020-06-25 Thread GitBox


amaliujia commented on pull request #12073:
URL: https://github.com/apache/beam/pull/12073#issuecomment-649815316


   cc @aaltay 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] epicfaace commented on a change in pull request #11824: [BEAM-10101] Add HttpIO / HttpFileSystem (Python)

2020-06-25 Thread GitBox


epicfaace commented on a change in pull request #11824:
URL: https://github.com/apache/beam/pull/11824#discussion_r445836086



##
File path: sdks/python/apache_beam/io/httpio.py
##
@@ -0,0 +1,172 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""This class implements methods to interact with files at HTTP URLs.
+
+This I/O only implements methods to read with files at HTTP URLs, because
+of the variability in methods by which HTTP content can be written
+to a server. If you need to write your results to an HTTP endpoint,
+you might want to make your own I/O or use another, more specific,
+I/O connector.
+
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import io
+from builtins import object
+
+from apache_beam.io.filesystem import BeamIOError
+from apache_beam.io.filesystemio import Downloader
+from apache_beam.io.filesystemio import DownloaderStream
+from apache_beam.internal.http_client import get_new_http
+import sys
+from httplib2 import HttpLib2Error
+
+REQUEST_FAILED_ERROR_MSG = "HTTP request failed for URL {}: {}"
+UNEXPECTED_STATUS_CODE_ERROR_MSG = "Unexpected status code received for URL 
{}: {} {}"
+
+
+class HttpIO(object):

Review comment:
   ```suggestion
   class HttpIO:
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] Ardagan commented on pull request #12091: dummy change

2020-06-25 Thread GitBox


Ardagan commented on pull request #12091:
URL: https://github.com/apache/beam/pull/12091#issuecomment-649812079


   Run PythonFormatter PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] Ardagan removed a comment on pull request #12091: dummy change

2020-06-25 Thread GitBox


Ardagan removed a comment on pull request #12091:
URL: https://github.com/apache/beam/pull/12091#issuecomment-649810198


   Run PythonFormatter PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] Ardagan commented on pull request #12091: dummy change

2020-06-25 Thread GitBox


Ardagan commented on pull request #12091:
URL: https://github.com/apache/beam/pull/12091#issuecomment-649810198


   Run PythonFormatter PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on a change in pull request #11993: Change time granularity to seconds in ParDoTest TestStream timer test…

2020-06-25 Thread GitBox


lukecwik commented on a change in pull request #11993:
URL: https://github.com/apache/beam/pull/11993#discussion_r445826286



##
File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java
##
@@ -3817,24 +3817,15 @@ public void onTimer(
   TestStream> stream =
   TestStream.create(KvCoder.of(StringUtf8Coder.of(), VarIntCoder.of()))
   // See GlobalWindow,
-  // END_OF_GLOBAL_WINDOW is 
TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1)),
-  // minus 775 milliseconds to make the grainularity to second as 
required by dataflow
-  // for TestStream.
-  .advanceWatermarkTo(
-  BoundedWindow.TIMESTAMP_MAX_VALUE
-  .minus(Duration.standardDays(1))
-  .minus(BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis() % 
1000))
+  // END_OF_GLOBAL_WINDOW is 
TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1))
+  
.advanceWatermarkTo(BoundedWindow.TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1)))

Review comment:
   nit: use GlobalWindow.INSTANCE.maxTimestamp():
   
https://github.com/apache/beam/blob/7269e3f8e2b7fb185347f8fcb80f0cd5afcbaeb6/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/GlobalWindow.java#L42
   
   Here and below instead of calculating it.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] Ardagan opened a new pull request #12091: dummy change

2020-06-25 Thread GitBox


Ardagan opened a new pull request #12091:
URL: https://github.com/apache/beam/pull/12091


   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark
   --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/i

[GitHub] [beam] lukecwik commented on a change in pull request #11993: Change time granularity to seconds in ParDoTest TestStream timer test…

2020-06-25 Thread GitBox


lukecwik commented on a change in pull request #11993:
URL: https://github.com/apache/beam/pull/11993#discussion_r445826286



##
File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java
##
@@ -3817,24 +3817,15 @@ public void onTimer(
   TestStream> stream =
   TestStream.create(KvCoder.of(StringUtf8Coder.of(), VarIntCoder.of()))
   // See GlobalWindow,
-  // END_OF_GLOBAL_WINDOW is 
TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1)),
-  // minus 775 milliseconds to make the grainularity to second as 
required by dataflow
-  // for TestStream.
-  .advanceWatermarkTo(
-  BoundedWindow.TIMESTAMP_MAX_VALUE
-  .minus(Duration.standardDays(1))
-  .minus(BoundedWindow.TIMESTAMP_MAX_VALUE.getMillis() % 
1000))
+  // END_OF_GLOBAL_WINDOW is 
TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1))
+  
.advanceWatermarkTo(BoundedWindow.TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1)))

Review comment:
   nit: use GlobalWindow.INSTANCE.maxTimestamp():
   
https://github.com/apache/beam/blob/7269e3f8e2b7fb185347f8fcb80f0cd5afcbaeb6/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/GlobalWindow.java#L42





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11850: [BEAM-1438] Allow 0 shards on WriteFiles streaming

2020-06-25 Thread GitBox


chamikaramj commented on pull request #11850:
URL: https://github.com/apache/beam/pull/11850#issuecomment-649804337


   Looks like Reuven's last comment in the JIRA is from 2017 :)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #11850: [BEAM-1438] Allow 0 shards on WriteFiles streaming

2020-06-25 Thread GitBox


chamikaramj commented on pull request #11850:
URL: https://github.com/apache/beam/pull/11850#issuecomment-649803784


   I think we should at least change some of the benchmarks to not set this 
flag to determine if there's a performance impact.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] saavan-google-intern commented on pull request #11939: [BEAM-10197] Support typehints for Python's frozenset

2020-06-25 Thread GitBox


saavan-google-intern commented on pull request #11939:
URL: https://github.com/apache/beam/pull/11939#issuecomment-649803985


   It's going good! 
   
   I think there are some internal google3 tests failing because frozenset was 
used incorrectly as a typehint in those tests but @udim is super nice and is 
fixing them since I don't have access, and it should be ready for merge after 
that
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem merged pull request #11929: [BEAM-10201] Add deadletter support to JsonToRow

2020-06-25 Thread GitBox


pabloem merged pull request #11929:
URL: https://github.com/apache/beam/pull/11929


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12087: [BEAM-10321] retain environments in flatten for preventing it from be…

2020-06-25 Thread GitBox


pabloem commented on pull request #12087:
URL: https://github.com/apache/beam/pull/12087#issuecomment-649801292


   Run Python PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #11765: [BEAM-9322] Remove passthrough_pcollection_output_ids and force_generated_pcollection_output_ids flags

2020-06-25 Thread GitBox


pabloem commented on pull request #11765:
URL: https://github.com/apache/beam/pull/11765#issuecomment-649800364


   hm I don't know why this triggered all tests...



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #11765: [BEAM-9322] Remove passthrough_pcollection_output_ids and force_generated_pcollection_output_ids flags

2020-06-25 Thread GitBox


pabloem commented on pull request #11765:
URL: https://github.com/apache/beam/pull/11765#issuecomment-649800068


   Run Python 3.7 PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on a change in pull request #11993: Change time granularity to seconds in ParDoTest TestStream timer test…

2020-06-25 Thread GitBox


lukecwik commented on a change in pull request #11993:
URL: https://github.com/apache/beam/pull/11993#discussion_r445817767



##
File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java
##
@@ -3808,15 +3817,24 @@ public void onTimer(
   TestStream> stream =
   TestStream.create(KvCoder.of(StringUtf8Coder.of(), VarIntCoder.of()))
   // See GlobalWindow,
-  // END_OF_GLOBAL_WINDOW is 
TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1))
-  
.advanceWatermarkTo(BoundedWindow.TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1)))
+  // END_OF_GLOBAL_WINDOW is 
TIMESTAMP_MAX_VALUE.minus(Duration.standardDays(1)),

Review comment:
   Lets leave this one tests as is and remove it from the change. We can 
update the others with the changes since they won't change the semantics of 
what is being tested.
   
   For Dataflow's TestStream implementation, to simulate production, it should 
round all the advance timestampts to the closest `>=` internal timestamp. It 
should also make sure that no two advance timestamps that aren't the same are 
not rounded to the same value. I believe this would make this test work and it 
would simulate what is happening in production.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #11850: [BEAM-1438] Allow 0 shards on WriteFiles streaming

2020-06-25 Thread GitBox


pabloem commented on pull request #11850:
URL: https://github.com/apache/beam/pull/11850#issuecomment-649797891


   I think this was already reviewed and discussed by Reuven in BEAM-1438 - the 
only issue was that the *gate* was not removed from the feature.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12084: [BEAM-10317] Python - Update BigQueryIO to tag BigQuery Jobs with the Dataflow Job ID

2020-06-25 Thread GitBox


pabloem commented on pull request #12084:
URL: https://github.com/apache/beam/pull/12084#issuecomment-649793787


   can you address formatter / lint / rat / test  errors?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #11824: [BEAM-10101] Add HttpIO / HttpFileSystem (Python)

2020-06-25 Thread GitBox


pabloem commented on pull request #11824:
URL: https://github.com/apache/beam/pull/11824#issuecomment-649793262


   sorry about the delay - it looks like many of the tests are failing: 
https://ci-beam.apache.org/job/beam_PreCommit_Python_Phrase/1920/
   Do you htink you could fix those so we can review?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #10187: [BEAM-8376] Initial version of firestore connector JavaSDK

2020-06-25 Thread GitBox


aaltay commented on pull request #10187:
URL: https://github.com/apache/beam/pull/10187#issuecomment-649792398


   What is the state of this PR? Do you need help?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #11967: [BEAM-9992] | use Sets transform in BeamSQL

2020-06-25 Thread GitBox


aaltay commented on pull request #11967:
URL: https://github.com/apache/beam/pull/11967#issuecomment-649791917


   Are the test errors related to this PR, or are they flakes?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #10487: [BEAM-5504] Introduce PubsubAvroTable

2020-06-25 Thread GitBox


aaltay commented on pull request #10487:
URL: https://github.com/apache/beam/pull/10487#issuecomment-649791695


   @milantracy - How is this PR going?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #11939: [BEAM-10197] Support typehints for Python's frozenset

2020-06-25 Thread GitBox


aaltay commented on pull request #11939:
URL: https://github.com/apache/beam/pull/11939#issuecomment-649791420


   How is this PR going?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




  1   2   >