[jira] [Work logged] (BEAM-7437) Integration Test for BQ streaming inserts for streaming pipelines
[ https://issues.apache.org/jira/browse/BEAM-7437?focusedWorklogId=275681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275681 ]

ASF GitHub Bot logged work on BEAM-7437:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:58
Start Date: 12/Jul/19 06:58
Worklog Time Spent: 10m

Work Description: ttanay commented on pull request #9044: [BEAM-7437] Raise RuntimeError for PY2 in BigqueryFullResultStreamingMatcher
URL: https://github.com/apache/beam/pull/9044#discussion_r302850735

## File path: sdks/python/apache_beam/io/gcp/tests/bigquery_matcher.py
## @@ -192,7 +193,10 @@ def _get_query_result(self, bigquery_client):
       if len(response) >= len(self.expected_data):
         return response
       time.sleep(1)
-    raise TimeoutError('Timeout exceeded for matcher.')
+    if six.PY3:

Review comment: Thanks @udim. I'll make the change.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
Worklog Id: (was: 275681)
Time Spent: 6h 50m (was: 6h 40m)

> Integration Test for BQ streaming inserts for streaming pipelines
> -----------------------------------------------------------------
>
>                 Key: BEAM-7437
>                 URL: https://issues.apache.org/jira/browse/BEAM-7437
>             Project: Beam
>          Issue Type: Test
>          Components: io-python-gcp
>    Affects Versions: 2.12.0
>            Reporter: Tanay Tummalapalli
>            Assignee: Tanay Tummalapalli
>            Priority: Minor
>              Labels: test
>          Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Integration Test for BigQuery Sink using Streaming Inserts for streaming
> pipelines.
> Integration tests currently exist for batch pipelines; they can also be added
> for streaming pipelines using TestStream. This will be a precursor to the
> failing integration test to be added for
> [BEAM-6611|https://issues.apache.org/jira/browse/BEAM-6611].

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
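The review above concerns the fact that `TimeoutError` is a builtin only on Python 3, so the matcher cannot raise it unconditionally on Python 2. A minimal sketch of the cross-version pattern the diff hints at, using `sys.version_info` in place of the PR's `six.PY3` check so the example is dependency-free (the fallback exception name here is hypothetical, not the class the PR actually uses):

```python
import sys


class MatcherTimeoutError(Exception):
    """Fallback for Python 2, which has no builtin TimeoutError (hypothetical name)."""


def raise_timeout(message):
    # six.PY3 in the PR is equivalent to this version check.
    if sys.version_info[0] >= 3:
        raise TimeoutError(message)
    raise MatcherTimeoutError(message)
```

On Python 3 the builtin `TimeoutError` is raised directly; on Python 2 the custom class stands in, which is why the PR title instead proposes raising a `RuntimeError` for PY2.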
[jira] [Work logged] (BEAM-7079) Run Chicago Taxi Example on Dataflow
[ https://issues.apache.org/jira/browse/BEAM-7079?focusedWorklogId=275680&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275680 ]

ASF GitHub Bot logged work on BEAM-7079:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:56
Start Date: 12/Jul/19 06:56
Worklog Time Spent: 10m

Work Description: mwalenia commented on issue #8939: [BEAM-7079] Add Chicago Taxi Example running on Dataflow
URL: https://github.com/apache/beam/pull/8939#issuecomment-510771008

@angoenka Hi, could you take a look at this? CC: @pabloem

Issue Time Tracking
-------------------
Worklog Id: (was: 275680)
Time Spent: 19h 40m (was: 19.5h)

> Run Chicago Taxi Example on Dataflow
> ------------------------------------
>
>                 Key: BEAM-7079
>                 URL: https://issues.apache.org/jira/browse/BEAM-7079
>             Project: Beam
>          Issue Type: Test
>          Components: testing
>            Reporter: Michal Walenia
>            Assignee: Michal Walenia
>            Priority: Minor
>          Time Spent: 19h 40m
>  Remaining Estimate: 0h
[jira] [Work logged] (BEAM-6675) The JdbcIO sink should accept schemas
[ https://issues.apache.org/jira/browse/BEAM-6675?focusedWorklogId=275674&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275674 ]

ASF GitHub Bot logged work on BEAM-6675:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:42
Start Date: 12/Jul/19 06:42
Worklog Time Spent: 10m

Work Description: sehrish-vd commented on issue #8962: [BEAM-6675] Generate JDBC statement and preparedStatementSetter automatically when schema is available
URL: https://github.com/apache/beam/pull/8962#issuecomment-510767335

@amaliujia can you please review the comments again?

Issue Time Tracking
-------------------
Worklog Id: (was: 275674)
Time Spent: 5.5h (was: 5h 20m)

> The JdbcIO sink should accept schemas
> -------------------------------------
>
>                 Key: BEAM-6675
>                 URL: https://issues.apache.org/jira/browse/BEAM-6675
>             Project: Beam
>          Issue Type: Sub-task
>          Components: io-java-jdbc
>            Reporter: Reuven Lax
>            Assignee: Shehzaad Nakhoda
>            Priority: Major
>          Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> If the input has a schema, there should be a default mapping to a
> PreparedStatement for writing based on that schema.
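The BEAM-6675 description asks for a default mapping from a schema to a JDBC `PreparedStatement`. The core of that mapping is mechanical: derive a parameterized `INSERT` from the schema's field names, then bind each row value to its positional `?` placeholder. A minimal illustrative sketch of the statement-generation half (not the PR's actual implementation; the helper name is made up):

```python
def generate_insert_statement(table, field_names):
    """Build a parameterized INSERT suitable for a JDBC PreparedStatement
    from a schema's field names (illustrative helper, not Beam API)."""
    columns = ", ".join(field_names)
    placeholders = ", ".join("?" for _ in field_names)
    return "INSERT INTO %s(%s) VALUES(%s)" % (table, columns, placeholders)
```

A preparedStatementSetter would then walk the same field list in order, calling the type-appropriate `setX(i, value)` for each placeholder, which is what makes the schema-driven default possible.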
[jira] [Work logged] (BEAM-7386) Add Utility BiTemporalStreamJoin
[ https://issues.apache.org/jira/browse/BEAM-7386?focusedWorklogId=275673&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275673 ]

ASF GitHub Bot logged work on BEAM-7386:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:40
Start Date: 12/Jul/19 06:40
Worklog Time Spent: 10m

Work Description: amaliujia commented on issue #9032: [BEAM-7386] Bi-Temporal Join
URL: https://github.com/apache/beam/pull/9032#issuecomment-510766791

I am not seeing that the Java precommit tests are triggered. I am less familiar with how that is enabled. Guessing it has to be set here: https://github.com/apache/beam/blob/master/build.gradle#L132

Issue Time Tracking
-------------------
Worklog Id: (was: 275673)
Time Spent: 2h (was: 1h 50m)

> Add Utility BiTemporalStreamJoin
> --------------------------------
>
>                 Key: BEAM-7386
>                 URL: https://issues.apache.org/jira/browse/BEAM-7386
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-ideas
>    Affects Versions: 2.12.0
>            Reporter: Reza ardeshir rokni
>            Assignee: Reza ardeshir rokni
>            Priority: Major
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Add utility class that enables a temporal join between two streams where
> Stream A is matched to Stream B where
> A.timestamp = (max(b.timestamp) where b.timestamp <= a.timestamp)
> This will use the following overall flow:
> KV(key, Timestamped)
> | Window
> | GBK
> | Stateful DoFn
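The join rule in the BEAM-7386 description — match each element of Stream A to the B element with the greatest timestamp not exceeding A's timestamp — can be sketched outside of Beam in a few lines. This is an illustration of the matching semantics only, not the PR's windowed, stateful DoFn implementation:

```python
def bitemporal_match(a_ts, b_timestamps):
    """Return max(b) over all b in b_timestamps with b <= a_ts,
    i.e. the latest B event at or before the A event; None if no B qualifies."""
    candidates = [b for b in b_timestamps if b <= a_ts]
    return max(candidates) if candidates else None
```

In the streaming version, the stateful DoFn buffers B elements per key (after the Window/GBK steps) and applies exactly this rule as A elements arrive, emitting None-matched results when no earlier B exists.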
[jira] [Work logged] (BEAM-7386) Add Utility BiTemporalStreamJoin
[ https://issues.apache.org/jira/browse/BEAM-7386?focusedWorklogId=275663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275663 ]

ASF GitHub Bot logged work on BEAM-7386:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:31
Start Date: 12/Jul/19 06:31
Worklog Time Spent: 10m

Work Description: amaliujia commented on pull request #9032: [BEAM-7386] Bi-Temporal Join
URL: https://github.com/apache/beam/pull/9032#discussion_r302838285

## File path: sdks/java/extensions/timeseries/src/main/java/org/apache/beam/sdk/extensions/timeseries/joins/BiStreamJoinResultCoder.java
## @@ -0,0 +1,121 @@

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.beam.sdk.extensions.timeseries.joins;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.List;
import org.apache.beam.sdk.coders.*;
import org.apache.beam.sdk.values.TimestampedValue;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.apache.beam.sdk.values.TypeParameter;

public class BiStreamJoinResultCoder<K, L, R>
    extends StructuredCoder<BiTemporalJoinResult<K, L, R>> {

  private final Coder<L> leftCoder;
  private final Coder<R> rightCoder;
  private final Coder<K> keyCoder;

  public BiStreamJoinResultCoder(Coder<K> keyCoder, Coder<L> leftCoder, Coder<R> rightCoder) {
    this.leftCoder = NullableCoder.of(leftCoder);
    this.rightCoder = NullableCoder.of(rightCoder);
    this.keyCoder = NullableCoder.of(keyCoder);
  }

  public Coder<L> getLeftCoder() {
    return leftCoder;
  }

  public Coder<R> getRightCoder() {
    return rightCoder;
  }

  public Coder<K> getKeyCoder() {
    return keyCoder;
  }

  public static <K, L, R> BiStreamJoinResultCoder<K, L, R> of(
      Coder<K> keyCoder, Coder<L> leftCoder, Coder<R> rightCoder) {

    return new BiStreamJoinResultCoder<>(keyCoder, leftCoder, rightCoder);
  }

  @Override
  public void encode(BiTemporalJoinResult<K, L, R> value, OutputStream outStream)
      throws IOException {

    if (value == null) {

Review comment: key, left and right coders are already `NullableCoder` which tolerates `NULL`?

Issue Time Tracking
-------------------
Worklog Id: (was: 275663)
Time Spent: 1.5h (was: 1h 20m)

> Add Utility BiTemporalStreamJoin
> --------------------------------
>
>                 Key: BEAM-7386
>                 URL: https://issues.apache.org/jira/browse/BEAM-7386
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-ideas
>    Affects Versions: 2.12.0
>            Reporter: Reza ardeshir rokni
>            Assignee: Reza ardeshir rokni
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Add utility class that enables a temporal join between two streams where
> Stream A is matched to Stream B where
> A.timestamp = (max(b.timestamp) where b.timestamp <= a.timestamp)
> This will use the following overall flow:
> KV(key, Timestamped)
> | Window
> | GBK
> | Stateful DoFn
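The review question above turns on how `NullableCoder` works: it wraps another coder so that a `null` value can be encoded, typically by writing a presence marker before the wrapped value. A language-neutral sketch of that idea in Python (the exact byte layout here is an assumption for illustration, not Beam's actual wire format):

```python
def encode_nullable(value, encode_fn):
    """Encode value with a one-byte presence prefix so None round-trips.
    encode_fn encodes a non-null value to bytes.
    (Illustrative; the byte layout is an assumption, not Beam's NullableCoder format.)"""
    if value is None:
        return b"\x00"          # absent marker, no payload
    return b"\x01" + encode_fn(value)  # present marker + encoded payload


def decode_nullable(data, decode_fn):
    """Inverse of encode_nullable; decode_fn decodes the payload bytes."""
    if data[:1] == b"\x00":
        return None
    return decode_fn(data[1:])
```

Because every field coder in `BiStreamJoinResultCoder` is already wrapped this way, the reviewer is pointing out that an extra whole-value null check in `encode` may be redundant.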
[jira] [Work logged] (BEAM-7386) Add Utility BiTemporalStreamJoin
[ https://issues.apache.org/jira/browse/BEAM-7386?focusedWorklogId=275667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275667 ]

ASF GitHub Bot logged work on BEAM-7386:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:31
Start Date: 12/Jul/19 06:31
Worklog Time Spent: 10m

Work Description: amaliujia commented on issue #9032: [BEAM-7386] Bi-Temporal Join
URL: https://github.com/apache/beam/pull/9032#issuecomment-510764741

Retest this please

Issue Time Tracking
-------------------
Worklog Id: (was: 275667)
Time Spent: 1h 50m (was: 1h 40m)

> Add Utility BiTemporalStreamJoin
> --------------------------------
>
>                 Key: BEAM-7386
>                 URL: https://issues.apache.org/jira/browse/BEAM-7386
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-ideas
>    Affects Versions: 2.12.0
>            Reporter: Reza ardeshir rokni
>            Assignee: Reza ardeshir rokni
>            Priority: Major
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Add utility class that enables a temporal join between two streams where
> Stream A is matched to Stream B where
> A.timestamp = (max(b.timestamp) where b.timestamp <= a.timestamp)
> This will use the following overall flow:
> KV(key, Timestamped)
> | Window
> | GBK
> | Stateful DoFn
[jira] [Work logged] (BEAM-7386) Add Utility BiTemporalStreamJoin
[ https://issues.apache.org/jira/browse/BEAM-7386?focusedWorklogId=275665&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275665 ]

ASF GitHub Bot logged work on BEAM-7386:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:31
Start Date: 12/Jul/19 06:31
Worklog Time Spent: 10m

Work Description: amaliujia commented on pull request #9032: [BEAM-7386] Bi-Temporal Join
URL: https://github.com/apache/beam/pull/9032#discussion_r302843754

## File path: sdks/java/extensions/timeseries/src/test/java/org/apache/beam/sdk/extensions/timeseries/joins/BiTemporalCacheTestIT.java
## @@ -0,0 +1,150 @@

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.beam.sdk.extensions.timeseries.joins;

import java.io.Serializable;
import org.apache.beam.sdk.extensions.timeseries.joins.BiTemporalTestUtils.QuoteData;
import org.apache.beam.sdk.extensions.timeseries.joins.BiTemporalTestUtils.TradeData;
import org.apache.beam.sdk.testing.NeedsRunner;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.Sum;
import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TupleTag;
import org.joda.time.Duration;
import org.joda.time.Instant;
import org.junit.Rule;
import org.junit.Test;
import org.junit.experimental.categories.Category;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BiTemporalCacheTestIT implements Serializable {

Review comment: How does this IT run? Will `./gradlew test` trigger it?

Issue Time Tracking
-------------------
Worklog Id: (was: 275665)

> Add Utility BiTemporalStreamJoin
> --------------------------------
>
>                 Key: BEAM-7386
>                 URL: https://issues.apache.org/jira/browse/BEAM-7386
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-ideas
>    Affects Versions: 2.12.0
>            Reporter: Reza ardeshir rokni
>            Assignee: Reza ardeshir rokni
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Add utility class that enables a temporal join between two streams where
> Stream A is matched to Stream B where
> A.timestamp = (max(b.timestamp) where b.timestamp <= a.timestamp)
> This will use the following overall flow:
> KV(key, Timestamped)
> | Window
> | GBK
> | Stateful DoFn
[jira] [Work logged] (BEAM-7386) Add Utility BiTemporalStreamJoin
[ https://issues.apache.org/jira/browse/BEAM-7386?focusedWorklogId=275666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275666 ]

ASF GitHub Bot logged work on BEAM-7386:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:31
Start Date: 12/Jul/19 06:31
Worklog Time Spent: 10m

Work Description: amaliujia commented on pull request #9032: [BEAM-7386] Bi-Temporal Join
URL: https://github.com/apache/beam/pull/9032#discussion_r302798339

## File path: sdks/java/extensions/timeseries/build.gradle
## @@ -0,0 +1,49 @@

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * License); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an AS IS BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

plugins { id 'org.apache.beam.module' }
applyJavaNature()

description = "Apache Beam :: SDKs :: Java :: Extensions :: Timeseries"
ext.summary = """Beam TIMESERIES provides helper utils to deal with timeseries processing"""

dependencies {
  compile library.java.guava
  compile project(path: ":sdks:java:core", configuration: "shadow")
  // Needed to run the Example.
  compile project(path: ":runners:direct-java")
  testCompile library.java.hamcrest_core
  testCompile library.java.hamcrest_library
  testCompile library.java.junit
  //testRuntimeOnly project(path: ":runners:direct-java")

Review comment: remove this line?

Issue Time Tracking
-------------------
Worklog Id: (was: 275666)
Time Spent: 1h 40m (was: 1.5h)

> Add Utility BiTemporalStreamJoin
> --------------------------------
>
>                 Key: BEAM-7386
>                 URL: https://issues.apache.org/jira/browse/BEAM-7386
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-ideas
>    Affects Versions: 2.12.0
>            Reporter: Reza ardeshir rokni
>            Assignee: Reza ardeshir rokni
>            Priority: Major
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Add utility class that enables a temporal join between two streams where
> Stream A is matched to Stream B where
> A.timestamp = (max(b.timestamp) where b.timestamp <= a.timestamp)
> This will use the following overall flow:
> KV(key, Timestamped)
> | Window
> | GBK
> | Stateful DoFn
[jira] [Work logged] (BEAM-7386) Add Utility BiTemporalStreamJoin
[ https://issues.apache.org/jira/browse/BEAM-7386?focusedWorklogId=275664&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275664 ]

ASF GitHub Bot logged work on BEAM-7386:
Author: ASF GitHub Bot
Created on: 12/Jul/19 06:31
Start Date: 12/Jul/19 06:31
Worklog Time Spent: 10m

Work Description: amaliujia commented on pull request #9032: [BEAM-7386] Bi-Temporal Join
URL: https://github.com/apache/beam/pull/9032#discussion_r302843161

## File path: sdks/java/extensions/timeseries/src/main/java/org/apache/beam/sdk/extensions/timeseries/joins/BiStreamJoinResultCoder.java
## @@ -0,0 +1,121 @@

/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.beam.sdk.extensions.timeseries.joins;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.List;
import org.apache.beam.sdk.coders.*;
import org.apache.beam.sdk.values.TimestampedValue;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.apache.beam.sdk.values.TypeParameter;

public class BiStreamJoinResultCoder

Review comment: `BiStreamJoinResultCoder` and `BiTemporalJoinResultCoder` are duplicates?

Issue Time Tracking
-------------------
Worklog Id: (was: 275664)
Time Spent: 1.5h (was: 1h 20m)

> Add Utility BiTemporalStreamJoin
> --------------------------------
>
>                 Key: BEAM-7386
>                 URL: https://issues.apache.org/jira/browse/BEAM-7386
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-ideas
>    Affects Versions: 2.12.0
>            Reporter: Reza ardeshir rokni
>            Assignee: Reza ardeshir rokni
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Add utility class that enables a temporal join between two streams where
> Stream A is matched to Stream B where
> A.timestamp = (max(b.timestamp) where b.timestamp <= a.timestamp)
> This will use the following overall flow:
> KV(key, Timestamped)
> | Window
> | GBK
> | Stateful DoFn
[jira] [Commented] (BEAM-3595) Normalize URNs across SDKs and runners.
[ https://issues.apache.org/jira/browse/BEAM-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883566#comment-16883566 ]

Chad Dombrova commented on BEAM-3595:

I see that this is marked as fixed, but when using v2.13.0 of the Python SDK I get a bad URN for pardo: "urn:beam:transform:pardo:v1" (also, there are at least a dozen mentions of this ticket in the source code). In order to read my pipeline back into Java I have to remove the "urn:" prefix. The URNs for other transforms don't include the "urn:" prefix.

Here's a bad pardo message:

{noformat}
transforms {
  key: "ref_AppliedPTransform_Sleep_4"
  value {
    spec {
      urn: "urn:beam:transform:pardo:v1"
      payload: "\n\204\002\n\201\002\n beam:dofn:pickled_python_info:v1\032\3Jj0F4"
    }
    inputs {
      key: "0"
      value: "ref_PCollection_PCollection_1"
    }
    outputs {
      key: "None"
      value: "ref_PCollection_PCollection_2"
    }
    unique_name: "Sleep"
  }
}
{noformat}

Additionally, the coders seem to be authored incorrectly, with a double-nested spec:

{noformat}
coders {
  key: "ref_Coder_VarIntCoder_1"
  value {
    spec {
      spec {
        urn: "beam:coder:varint:v1"
      }
    }
  }
}
{noformat}

I have to remove one level of the spec to get it to read properly in Java.

To be clear, the issue that I'm describing occurs when I manually write and read the pipeline via protobufs: everything works fine when I submit using `pipe.run()`. I surmise that there is some function in the Python SDK that fixes up the pipeline message before sending it, otherwise I assume it _would_ be broken on receipt. Is that correct?

> Normalize URNs across SDKs and runners.
> ---------------------------------------
>
>                 Key: BEAM-3595
>                 URL: https://issues.apache.org/jira/browse/BEAM-3595
>             Project: Beam
>          Issue Type: Bug
>          Components: beam-model
>            Reporter: Robert Bradshaw
>            Assignee: Eugene Kirpichov
>            Priority: Major
>             Fix For: 2.5.0

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
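The workaround the comment describes — stripping the legacy "urn:" prefix so the Java SDK recognizes the transform — is a one-line normalization. A hedged sketch of that fix-up (an illustration of the described workaround, not a function from the Beam SDK):

```python
def normalize_urn(urn):
    """Strip the legacy 'urn:' prefix described in the comment,
    e.g. 'urn:beam:transform:pardo:v1' -> 'beam:transform:pardo:v1'.
    URNs without the prefix pass through unchanged."""
    prefix = "urn:"
    return urn[len(prefix):] if urn.startswith(prefix) else urn
```

The double-nested coder `spec` would need the analogous structural fix-up (replacing `spec { spec { urn: ... } }` with a single `spec`), which is presumably what the pre-submit fix-up function the commenter surmises exists is doing.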
[jira] [Commented] (BEAM-7467) Gearpump Quickstart fails, java.lang.NoClassDefFoundError: com/gs/collections/api/block/procedure/Procedure
[ https://issues.apache.org/jira/browse/BEAM-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883496#comment-16883496 ]

Manu Zhang commented on BEAM-7467:

[~lcwik], can we close now?

> Gearpump Quickstart fails, java.lang.NoClassDefFoundError:
> com/gs/collections/api/block/procedure/Procedure
> -----------------------------------------------------------
>
>                 Key: BEAM-7467
>                 URL: https://issues.apache.org/jira/browse/BEAM-7467
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-gearpump
>    Affects Versions: 2.13.0
>            Reporter: Luke Cwik
>            Assignee: Manu Zhang
>            Priority: Major
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> After generating the archetype for the 2.13.0 RC2, the following quick start
> command fails:
> {code:java}
> mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount
> -Dexec.args="--inputFile=pom.xml --output=counts --runner=GearpumpRunner"
> -Pgearpump-runner{code}
> I also tried:
> {code:java}
> mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount
> -Dexec.args="--inputFile=pom.xml --output=counts --runner=GearpumpRunner"
> -Pgearpump-runner{code}
> Log:
> {code:java}
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/local/google/home/lcwik/.m2/repository/org/slf4j/slf4j-jdk14/1.7.25/slf4j-jdk14-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/local/google/home/lcwik/.m2/repository/org/slf4j/slf4j-log4j12/1.7.16/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
> May 31, 2019 9:38:26 AM akka.event.slf4j.Slf4jLogger$$anonfun$receive$1 applyOrElse
> INFO: Slf4jLogger started
> May 31, 2019 9:38:26 AM akka.event.slf4j.Slf4jLogger$$anonfun$receive$1 $anonfun$applyOrElse$3
> INFO: Starting remoting
> May 31, 2019 9:38:26 AM akka.event.slf4j.Slf4jLogger$$anonfun$receive$1 $anonfun$applyOrElse$3
> INFO: Remoting started; listening on addresses :[akka.tcp://client1820912745@127.0.0.1:38849]
> May 31, 2019 9:38:26 AM io.gearpump.metrics.Metrics$ createExtension
> INFO: Metrics is enabled..., false
> May 31, 2019 9:38:26 AM io.gearpump.cluster.master.MasterProxy
> INFO: Master Proxy is started...
> [WARNING]
> java.lang.NoClassDefFoundError: com/gs/collections/api/block/procedure/Procedure
>     at io.gearpump.streaming.dsl.plan.WindowOp.chain (OP.scala:273)
>     at io.gearpump.streaming.dsl.plan.Planner.merge (Planner.scala:86)
>     at io.gearpump.streaming.dsl.plan.Planner.$anonfun$optimize$2 (Planner.scala:71)
>     at io.gearpump.streaming.dsl.plan.Planner.$anonfun$optimize$2$adapted (Planner.scala:70)
>     at scala.collection.mutable.HashSet.foreach (HashSet.scala:77)
>     at io.gearpump.streaming.dsl.plan.Planner.$anonfun$optimize$1 (Planner.scala:70)
>     at io.gearpump.streaming.dsl.plan.Planner.$anonfun$optimize$1$adapted (Planner.scala:68)
>     at scala.collection.immutable.List.foreach (List.scala:388)
>     at io.gearpump.streaming.dsl.plan.Planner.optimize (Planner.scala:68)
>     at io.gearpump.streaming.dsl.plan.Planner.plan (Planner.scala:48)
>     at io.gearpump.streaming.dsl.scalaapi.StreamApp.plan (StreamApp.scala:59)
>     at io.gearpump.streaming.dsl.scalaapi.StreamApp$.streamAppToApplication (StreamApp.scala:82)
>     at io.gearpump.streaming.dsl.javaapi.JavaStreamApp.submit (JavaStreamApp.scala:44)
>     at org.apache.beam.runners.gearpump.GearpumpRunner.run (GearpumpRunner.java:83)
>     at org.apache.beam.runners.gearpump.GearpumpRunner.run (GearpumpRunner.java:44)
>     at org.apache.beam.sdk.Pipeline.run (Pipeline.java:313)
>     at org.apache.beam.sdk.Pipeline.run (Pipeline.java:299)
>     at org.apache.beam.examples.WordCount.runWordCount (WordCount.java:185)
>     at org.apache.beam.examples.WordCount.main (WordCount.java:192)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke (Method.java:498)
>     at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
>     at java.lang.Thread.run (Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: com.gs.collections.api.block.procedure.Procedure
>     at java.net.URLClassLoader.findClass (URLClassLoader.java:381)
>     at java.lang.ClassLoader.loadClass (ClassLoader.java:424)
>     at java.lang.ClassLoader.loadClass (ClassLoader.java:357)
>     at io.gearpump.streaming.dsl.plan.WindowOp.chain (OP.scala:273)
>     at io.gearpump.streaming.
[jira] [Work logged] (BEAM-7719) Ensure that publishing vendored artifacts first validates their contents
[ https://issues.apache.org/jira/browse/BEAM-7719?focusedWorklogId=275627&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275627 ]

ASF GitHub Bot logged work on BEAM-7719:
Author: ASF GitHub Bot
Created on: 12/Jul/19 02:56
Start Date: 12/Jul/19 02:56
Worklog Time Spent: 10m

Work Description: lukecwik commented on pull request #9036: [BEAM-7719] Ensure that publishing vendored artifacts checks the contents of the jar before publishing.
URL: https://github.com/apache/beam/pull/9036

Issue Time Tracking
-------------------
Worklog Id: (was: 275627)
Time Spent: 1h 20m (was: 1h 10m)

> Ensure that publishing vendored artifacts first validates their contents
> ------------------------------------------------------------------------
>
>                 Key: BEAM-7719
>                 URL: https://issues.apache.org/jira/browse/BEAM-7719
>             Project: Beam
>          Issue Type: Improvement
>          Components: build-system
>            Reporter: Luke Cwik
>            Assignee: Luke Cwik
>            Priority: Minor
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> During the release of vendored guava 26.0, it was discovered that we don't
> check the contents of the jars automatically.
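The kind of pre-publish check BEAM-7719 asks for amounts to opening each vendored jar and verifying that only the expected relocated packages are inside. Since a jar is a zip archive, a minimal illustrative sketch (not the Gradle check the PR adds; names and prefixes here are assumptions) is:

```python
import zipfile


def jar_entries(jar_path):
    """List entry names in a jar; a jar is just a zip archive."""
    with zipfile.ZipFile(jar_path) as jar:
        return jar.namelist()


def unexpected_entries(jar_path, allowed_prefixes):
    """Return file entries whose path does not start with any allowed prefix,
    e.g. classes that leaked in from transitive dependencies."""
    return [
        name
        for name in jar_entries(jar_path)
        if not name.endswith("/")  # skip directory entries
        and not any(name.startswith(p) for p in allowed_prefixes)
    ]
```

A release script would fail the publish step if `unexpected_entries(jar, [...])` is non-empty, which is exactly the class-leak scenario the vendored guava 26.0 release ran into.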
[jira] [Work logged] (BEAM-4948) Beam Dependency Update Request: com.google.guava
[ https://issues.apache.org/jira/browse/BEAM-4948?focusedWorklogId=275628&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275628 ]

ASF GitHub Bot logged work on BEAM-4948:
Author: ASF GitHub Bot
Created on: 12/Jul/19 02:56
Start Date: 12/Jul/19 02:56
Worklog Time Spent: 10m

Work Description: lukecwik commented on pull request #9038: [BEAM-4948, BEAM-6267, BEAM-5559, BEAM-7289] Fix shading of vendored guava to exclude classes from transitive dependencies which aren't needed at runtime
URL: https://github.com/apache/beam/pull/9038

Issue Time Tracking
-------------------
Worklog Id: (was: 275628)
Time Spent: 2h (was: 1h 50m)

> Beam Dependency Update Request: com.google.guava
> ------------------------------------------------
>
>                 Key: BEAM-4948
>                 URL: https://issues.apache.org/jira/browse/BEAM-4948
>             Project: Beam
>          Issue Type: Bug
>          Components: dependencies
>            Reporter: Beam JIRA Bot
>            Priority: Major
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> 2018-07-25 20:28:03.628639
> Please review and upgrade the com.google.guava to the latest version
> None
>
> cc:
[jira] [Commented] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
[ https://issues.apache.org/jira/browse/BEAM-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883483#comment-16883483 ]

sunjincheng commented on BEAM-7730:

Hi [~mxm], I have seen that the Flink 1.7 and 1.8 patches were contributions from you, so I think you have a lot of experience in this area. I am interested in contributing to Beam on this ticket, so I would appreciate any suggestions and guidance you can give. :)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -------------------------------------------------------------------------
>
>                 Key: BEAM-7730
>                 URL: https://issues.apache.org/jira/browse/BEAM-7730
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-flink
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>            Priority: Major
>             Fix For: 2.14.0
>
> Apache Flink 1.9 is coming, and it would be good to add a Flink 1.9 build target
> and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after Flink 1.9.0 is released.
> And I would appreciate it if you can leave your suggestions or comments!

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Updated] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
[ https://issues.apache.org/jira/browse/BEAM-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng updated BEAM-7730: -- Description: Apache Flink 1.9 is coming soon, so it would be good to add a Flink 1.9 build target and make the Flink Runner compatible with Flink 1.9. I will add a brief summary of the changes after Flink 1.9.0 is released. I would appreciate any suggestions or comments! was: Apache Flink 1.9 is coming soon, so it would be good to make the Flink Runner compatible with Flink 1.9. I will add a brief summary of the changes after Flink 1.9.0 is released. I would appreciate any suggestions or comments! > Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9 > - > > Key: BEAM-7730 > URL: https://issues.apache.org/jira/browse/BEAM-7730 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.14.0 > > > Apache Flink 1.9 is coming soon, so it would be good to add a Flink 1.9 build target > and make the Flink Runner compatible with Flink 1.9. > I will add a brief summary of the changes after Flink 1.9.0 is released. > I would appreciate any suggestions or comments! -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
[ https://issues.apache.org/jira/browse/BEAM-7730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng updated BEAM-7730: -- Summary: Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9 (was: Make FlinkRunner compatible with Flink 1.9) > Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9 > - > > Key: BEAM-7730 > URL: https://issues.apache.org/jira/browse/BEAM-7730 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.14.0 > > > Apache Flink 1.9 is coming soon, so it would be good to make the Flink Runner compatible > with Flink 1.9. > I will add a brief summary of the changes after Flink 1.9.0 is released. > I would appreciate any suggestions or comments! -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (BEAM-7730) Make FlinkRunner compatible with Flink 1.9
sunjincheng created BEAM-7730: - Summary: Make FlinkRunner compatible with Flink 1.9 Key: BEAM-7730 URL: https://issues.apache.org/jira/browse/BEAM-7730 Project: Beam Issue Type: New Feature Components: runner-flink Reporter: sunjincheng Assignee: sunjincheng Fix For: 2.14.0 Apache Flink 1.9 is coming soon, so it would be good to make the Flink Runner compatible with Flink 1.9. I will add a brief summary of the changes after Flink 1.9.0 is released. I would appreciate any suggestions or comments! -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7689) Temporary directory for WriteOperation may not be unique in FileBasedSink
[ https://issues.apache.org/jira/browse/BEAM-7689?focusedWorklogId=275626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275626 ] ASF GitHub Bot logged work on BEAM-7689: Author: ASF GitHub Bot Created on: 12/Jul/19 02:20 Start Date: 12/Jul/19 02:20 Worklog Time Spent: 10m Work Description: akedin commented on issue #9039: [BEAM-7689] cherry-picking for 2.14.0 URL: https://github.com/apache/beam/pull/9039#issuecomment-510718962 It doesn't look like a flake. It fails consistently, and the failing tests seem to use TextIO, so this change to FileBasedSink seems relevant. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275626) Time Spent: 2h 50m (was: 2h 40m) > Temporary directory for WriteOperation may not be unique in FileBasedSink > > > Key: BEAM-7689 > URL: https://issues.apache.org/jira/browse/BEAM-7689 > Project: Beam > Issue Type: Bug > Components: io-java-files >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Blocker > Fix For: 2.7.1, 2.14.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > The temporary directory for a WriteOperation in FileBasedSink is generated from a > second-granularity timestamp (yyyy-MM-dd_HH-mm-ss) and a unique increasing > index. That granularity is not enough to make temporary directories unique > across different jobs. When two jobs share the same temporary directory, one > job may fail to produce an output file because a temporary file it needs can > be deleted by the other job. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
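The collision described in BEAM-7689 comes from relying on a one-second timestamp alone; adding extra entropy (a UUID, for instance) makes two jobs starting in the same second practically impossible to collide. A minimal sketch of the idea in Python (a hypothetical helper, not FileBasedSink's actual Java naming scheme):

```python
import uuid
from datetime import datetime


def unique_temp_dir(base):
    """Build a temp directory name from a second-granularity timestamp
    plus a random UUID suffix, so that two jobs starting within the
    same second still get distinct directories."""
    stamp = datetime.utcnow().strftime('%Y-%m-%d_%H-%M-%S')
    return '%s/.temp-beam-%s-%s' % (base, stamp, uuid.uuid4().hex)


# Two calls in the same second yield different directories.
a = unique_temp_dir('gs://bucket/output')
b = unique_temp_dir('gs://bucket/output')
assert a != b
```

The timestamp keeps the directory name human-readable for debugging; the UUID carries the actual uniqueness guarantee.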
[jira] [Work logged] (BEAM-7546) Portable WordCount-on-Flink Precommit is flaky - temporary folder not found.
[ https://issues.apache.org/jira/browse/BEAM-7546?focusedWorklogId=275616&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275616 ] ASF GitHub Bot logged work on BEAM-7546: Author: ASF GitHub Bot Created on: 12/Jul/19 01:56 Start Date: 12/Jul/19 01:56 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9046: [BEAM-7546] Increasing environment cache to avoid chances of recreati… URL: https://github.com/apache/beam/pull/9046#issuecomment-510714732 LGTM, thank you This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275616) Time Spent: 0.5h (was: 20m) > Portable WordCount-on-Flink Precommit is flaky - temporary folder not found. > > > Key: BEAM-7546 > URL: https://issues.apache.org/jira/browse/BEAM-7546 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Valentyn Tymofieiev >Assignee: Ankur Goenka >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > On a few occasions I see this test fail due to a temp directory being missing. > Sample scan from > https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/745/: > https://scans.gradle.com/s/3ra4xw4hqvlyw/console-log?task=:sdks:python:portableWordCountBatch > {noformat} > [grpc-default-executor-0] ERROR sdk_worker._execute - Error processing > instruction 8. 
Original traceback is > Traceback (most recent call last): > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 157, in _execute > response = task() > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 190, in > self._execute(lambda: worker.do_instruction(work), work) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 342, in do_instruction > request.instruction_id) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 368, in process_bundle > bundle_processor.process_bundle(instruction_id)) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 589, in process_bundle > ].process_encoded(data.data) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 143, in process_encoded > self.output(decoded_value) > File "apache_beam/runners/worker/operations.py", line 255, in > apache_beam.runners.worker.operations.Operation.output > def output(self, windowed_value, output_index=0): > File "apache_beam/runners/worker/operations.py", line 256, in > apache_beam.runners.worker.operations.Operation.output > cython.cast(Receiver, > self.receivers[output_index]).receive(windowed_value) > File "apache_beam/runners/worker/operations.py", line 143, in > apache_beam.runners.worker.operations.SingletonConsumerSet.receive > self.consumer.process(windowed_value) > File "apache_beam/runners/worker/operations.py", line 593, in > apache_beam.runners.worker.operations.DoOperation.process > with self.scoped_process_state: > File "apache_beam/runners/worker/operations.py", line 594, in > apache_beam.runners.worker.operations.DoOperation.process > delayed_application = self.dofn_receiver.receive(o) > File "apache_beam/runners/common.py", line 778, in > apache_beam.runners.common.DoFnRun > ner.receive > 
self.process(windowed_value) > File "apache_beam/runners/common.py", line 784, in > apache_beam.runners.common.DoFnRunner.process > self._reraise_augmented(exn) > File "apache_beam/runners/common.py", line 851, in > apache_beam.runners.common.DoFnRunner._reraise_augmented > raise_with_traceback(new_exn) > File "apache_beam/runners/common.py", line 782, in > apache_beam.runners.common.DoFnRunner.process > return self.do_fn_invoker.invoke_process(windowed_value) > File "apache_beam/runners/common.py", line 594, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > self._invoke_process_per_window( > File "apache_beam/runners/common.py", line 666, in > apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window > windowed_value, self.process_method(*args_for_process)) > File "/usr/local/lib/python2.7/site-packages/apache_beam/io/iobase.py", > line 1041, in process > self.writer = self.sink.ope
[jira] [Work logged] (BEAM-7546) Portable WordCount-on-Flink Precommit is flaky - temporary folder not found.
[ https://issues.apache.org/jira/browse/BEAM-7546?focusedWorklogId=275617&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275617 ] ASF GitHub Bot logged work on BEAM-7546: Author: ASF GitHub Bot Created on: 12/Jul/19 01:56 Start Date: 12/Jul/19 01:56 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #9046: [BEAM-7546] Increasing environment cache to avoid chances of recreati… URL: https://github.com/apache/beam/pull/9046#issuecomment-510714835 cc: @udim This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275617) Time Spent: 40m (was: 0.5h) > Portable WordCount-on-Flink Precommit is flaky - temporary folder not found. > > > Key: BEAM-7546 > URL: https://issues.apache.org/jira/browse/BEAM-7546 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Valentyn Tymofieiev >Assignee: Ankur Goenka >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > On a few occasions I see this test fail due to a temp directory being missing. > Sample scan from > https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/745/: > https://scans.gradle.com/s/3ra4xw4hqvlyw/console-log?task=:sdks:python:portableWordCountBatch > {noformat} > [grpc-default-executor-0] ERROR sdk_worker._execute - Error processing > instruction 8. 
Original traceback is > Traceback (most recent call last): > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 157, in _execute > response = task() > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 190, in > self._execute(lambda: worker.do_instruction(work), work) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 342, in do_instruction > request.instruction_id) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 368, in process_bundle > bundle_processor.process_bundle(instruction_id)) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 589, in process_bundle > ].process_encoded(data.data) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 143, in process_encoded > self.output(decoded_value) > File "apache_beam/runners/worker/operations.py", line 255, in > apache_beam.runners.worker.operations.Operation.output > def output(self, windowed_value, output_index=0): > File "apache_beam/runners/worker/operations.py", line 256, in > apache_beam.runners.worker.operations.Operation.output > cython.cast(Receiver, > self.receivers[output_index]).receive(windowed_value) > File "apache_beam/runners/worker/operations.py", line 143, in > apache_beam.runners.worker.operations.SingletonConsumerSet.receive > self.consumer.process(windowed_value) > File "apache_beam/runners/worker/operations.py", line 593, in > apache_beam.runners.worker.operations.DoOperation.process > with self.scoped_process_state: > File "apache_beam/runners/worker/operations.py", line 594, in > apache_beam.runners.worker.operations.DoOperation.process > delayed_application = self.dofn_receiver.receive(o) > File "apache_beam/runners/common.py", line 778, in > apache_beam.runners.common.DoFnRun > ner.receive > 
self.process(windowed_value) > File "apache_beam/runners/common.py", line 784, in > apache_beam.runners.common.DoFnRunner.process > self._reraise_augmented(exn) > File "apache_beam/runners/common.py", line 851, in > apache_beam.runners.common.DoFnRunner._reraise_augmented > raise_with_traceback(new_exn) > File "apache_beam/runners/common.py", line 782, in > apache_beam.runners.common.DoFnRunner.process > return self.do_fn_invoker.invoke_process(windowed_value) > File "apache_beam/runners/common.py", line 594, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > self._invoke_process_per_window( > File "apache_beam/runners/common.py", line 666, in > apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window > windowed_value, self.process_method(*args_for_process)) > File "/usr/local/lib/python2.7/site-packages/apache_beam/io/iobase.py", > line 1041, in process > self.writer = self.sink.open_writ
[jira] [Commented] (BEAM-2264) Re-use credential instead of generating a new one on each GCS call
[ https://issues.apache.org/jira/browse/BEAM-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883456#comment-16883456 ] Udi Meiri commented on BEAM-2264: - In https://issues.apache.org/jira/browse/BEAM-3990 we found out that the storage client might not be thread safe. Perhaps, though, we can reuse the credentials instead of recreating them every time. > Re-use credential instead of generating a new one on each GCS call > --- > > Key: BEAM-2264 > URL: https://issues.apache.org/jira/browse/BEAM-2264 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Luke Cwik >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > We should cache the credential used within a Pipeline and re-use it instead > of generating a new one on each GCS call. When executing (against 2.0.0 RC2): > {code} > python -m apache_beam.examples.wordcount --input > "gs://dataflow-samples/shakespeare/*" --output local_counts > {code} > Note that we seemingly generate a new access token each time instead of when > a refresh is required. > {code} > super(GcsIO, cls).__new__(cls, storage_client)) > INFO:root:Starting the size estimation of the input > INFO:oauth2client.transport:Attempting refresh to obtain initial access_token > INFO:oauth2client.client:Refreshing access_token > INFO:root:Finished the size estimation of the input at 1 files. Estimation > took 0.286200046539 seconds > INFO:root:Running pipeline with DirectRunner. > INFO:root:Starting the size estimation of the input > INFO:oauth2client.transport:Attempting refresh to obtain initial access_token > INFO:oauth2client.client:Refreshing access_token > INFO:root:Finished the size estimation of the input at 43 files. 
Estimation > took 0.205624818802 seconds > INFO:oauth2client.transport:Attempting refresh to obtain initial access_token > INFO:oauth2client.client:Refreshing access_token > INFO:oauth2client.transport:Attempting refresh to obtain initial access_token > INFO:oauth2client.client:Refreshing access_token > INFO:oauth2client.transport:Attempting refresh to obtain initial access_token > INFO:oauth2client.client:Refreshing access_token > INFO:oauth2client.transport:Attempting refresh to obtain initial access_token > INFO:oauth2client.client:Refreshing access_token > INFO:oauth2client.transport:Attempting refresh to obtain initial access_token > ... many more times ... > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
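The re-use that BEAM-2264 asks for can be sketched as a process-wide memoized getter. This is illustrative only: `factory` stands in for whatever oauth2client flow actually obtains the credential, and the real Beam fix would live inside `GcsIO`, not in a free function.

```python
import threading

_credential_lock = threading.Lock()
_cached_credential = None


def get_credential(factory):
    """Return a process-wide cached credential, creating it only once.

    `factory` is a placeholder for the callable that actually mints a
    credential; it runs a single time instead of on every GCS call,
    which is what eliminates the repeated access_token refreshes in
    the log above.
    """
    global _cached_credential
    with _credential_lock:
        if _cached_credential is None:
            _cached_credential = factory()
        return _cached_credential
```

The lock makes the one-time initialization safe if several worker threads race to make their first GCS call; credential refresh on expiry would still be handled by the credential object itself.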
[jira] [Work logged] (BEAM-7546) Portable WordCount-on-Flink Precommit is flaky - temporary folder not found.
[ https://issues.apache.org/jira/browse/BEAM-7546?focusedWorklogId=275615&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275615 ] ASF GitHub Bot logged work on BEAM-7546: Author: ASF GitHub Bot Created on: 12/Jul/19 01:39 Start Date: 12/Jul/19 01:39 Worklog Time Spent: 10m Work Description: angoenka commented on issue #9046: [BEAM-7546] Increasing environment cache to avoid chances of recreati… URL: https://github.com/apache/beam/pull/9046#issuecomment-510711883 R: @tvalentyn This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275615) Time Spent: 20m (was: 10m) > Portable WordCount-on-Flink Precommit is flaky - temporary folder not found. > > > Key: BEAM-7546 > URL: https://issues.apache.org/jira/browse/BEAM-7546 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Valentyn Tymofieiev >Assignee: Ankur Goenka >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > On a few occasions I see this test fail due to a temp directory being missing. > Sample scan from > https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/745/: > https://scans.gradle.com/s/3ra4xw4hqvlyw/console-log?task=:sdks:python:portableWordCountBatch > {noformat} > [grpc-default-executor-0] ERROR sdk_worker._execute - Error processing > instruction 8. 
Original traceback is > Traceback (most recent call last): > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 157, in _execute > response = task() > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 190, in > self._execute(lambda: worker.do_instruction(work), work) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 342, in do_instruction > request.instruction_id) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 368, in process_bundle > bundle_processor.process_bundle(instruction_id)) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 589, in process_bundle > ].process_encoded(data.data) > File > "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 143, in process_encoded > self.output(decoded_value) > File "apache_beam/runners/worker/operations.py", line 255, in > apache_beam.runners.worker.operations.Operation.output > def output(self, windowed_value, output_index=0): > File "apache_beam/runners/worker/operations.py", line 256, in > apache_beam.runners.worker.operations.Operation.output > cython.cast(Receiver, > self.receivers[output_index]).receive(windowed_value) > File "apache_beam/runners/worker/operations.py", line 143, in > apache_beam.runners.worker.operations.SingletonConsumerSet.receive > self.consumer.process(windowed_value) > File "apache_beam/runners/worker/operations.py", line 593, in > apache_beam.runners.worker.operations.DoOperation.process > with self.scoped_process_state: > File "apache_beam/runners/worker/operations.py", line 594, in > apache_beam.runners.worker.operations.DoOperation.process > delayed_application = self.dofn_receiver.receive(o) > File "apache_beam/runners/common.py", line 778, in > apache_beam.runners.common.DoFnRun > ner.receive > 
self.process(windowed_value) > File "apache_beam/runners/common.py", line 784, in > apache_beam.runners.common.DoFnRunner.process > self._reraise_augmented(exn) > File "apache_beam/runners/common.py", line 851, in > apache_beam.runners.common.DoFnRunner._reraise_augmented > raise_with_traceback(new_exn) > File "apache_beam/runners/common.py", line 782, in > apache_beam.runners.common.DoFnRunner.process > return self.do_fn_invoker.invoke_process(windowed_value) > File "apache_beam/runners/common.py", line 594, in > apache_beam.runners.common.PerWindowInvoker.invoke_process > self._invoke_process_per_window( > File "apache_beam/runners/common.py", line 666, in > apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window > windowed_value, self.process_method(*args_for_process)) > File "/usr/local/lib/python2.7/site-packages/apache_beam/io/iobase.py", > line 1041, in process > self.writer = self.sink.open_wr
[jira] [Work logged] (BEAM-7546) Portable WordCount-on-Flink Precommit is flaky - temporary folder not found.
[ https://issues.apache.org/jira/browse/BEAM-7546?focusedWorklogId=275614&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275614 ] ASF GitHub Bot logged work on BEAM-7546: Author: ASF GitHub Bot Created on: 12/Jul/19 01:38 Start Date: 12/Jul/19 01:38 Worklog Time Spent: 10m Work Description: angoenka commented on pull request #9046: [BEAM-7546] Increasing environment cache to avoid chances of recreati… URL: https://github.com/apache/beam/pull/9046 …ng the environment **Please** add a meaningful description for your change here Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
[jira] [Closed] (BEAM-3990) Dataflow jobs fail with "KeyError: 'location'" when uploading to GCS
[ https://issues.apache.org/jira/browse/BEAM-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri closed BEAM-3990. --- Resolution: Fixed Fix Version/s: Not applicable > Dataflow jobs fail with "KeyError: 'location'" when uploading to GCS > > > Key: BEAM-3990 > URL: https://issues.apache.org/jira/browse/BEAM-3990 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Chamikara Jayalath >Priority: Major > Fix For: Not applicable > > Time Spent: 50m > Remaining Estimate: 0h > > Some Dataflow jobs are failing due to the following error (in worker logs). > > Error in _start_upload while inserting file > gs://cloud-ml-benchmark-output-us-central/df1-cloudml-benchmark-criteo-small-python-033010274088282-presubmit3/033010274088282/temp/df1-cloudml-benchmark-criteo-small-python-033010274088282-presubmit3.1522430898.446147/dax-tmp-2018-03-30_10_28_40-14595186994726940229-S241-1-dc87ef69274882bf/tmp-dc87ef6927488c5a-shard--try-308ae8b3268d12b2-endshard.avro: > Traceback (most recent call last): File > "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/gcsio.py", line > 559, in _start_upload self._client.objects.Insert(self._insert_request, > upload=self._upload) File > "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", > line 971, in Insert download=download) File > "/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py", line > 706, in _RunMethod http_request, client=self.client) File > "/usr/local/lib/python2.7/dist-packages/apitools/base/py/transfer.py", line > 860, in InitializeUpload url = > http_response.info['location'] > KeyError: 'location' > > This seems to be due to [https://github.com/apache/beam/pull/4891]. Possibly > storage.StorageV1() cannot be shared across multiple requests without > additional fixes. 
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
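Since BEAM-3990 suggests the shared `storage.StorageV1()` client cannot safely serve multiple requests, one common remedy is one client per thread via `threading.local`, while still sharing a single (expensive-to-create) credential. This is a sketch of the pattern only, not Beam's actual gcsio code; `client_factory` is a placeholder for whatever constructs the real storage client:

```python
import threading


class PerThreadClient(object):
    """Hand each thread its own client instance while sharing one
    credential across all of them. `client_factory` stands in for the
    real storage-client constructor."""

    def __init__(self, client_factory, credential):
        self._local = threading.local()
        self._factory = client_factory
        self._credential = credential

    def get(self):
        # threading.local attributes are per-thread, so the first call
        # on each thread builds a fresh client; later calls reuse it.
        if not hasattr(self._local, 'client'):
            self._local.client = self._factory(self._credential)
        return self._local.client
```

This keeps the credential caching win from BEAM-2264 while sidestepping the thread-safety hazard: no client object is ever touched by two threads.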
[jira] [Work logged] (BEAM-7689) Temporary directory for WriteOperation may not be unique in FileBasedSink
[ https://issues.apache.org/jira/browse/BEAM-7689?focusedWorklogId=275606&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275606 ] ASF GitHub Bot logged work on BEAM-7689: Author: ASF GitHub Bot Created on: 12/Jul/19 01:21 Start Date: 12/Jul/19 01:21 Worklog Time Spent: 10m Work Description: ihji commented on issue #9039: [BEAM-7689] cherry-picking for 2.14.0 URL: https://github.com/apache/beam/pull/9039#issuecomment-510708727 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275606) Time Spent: 2h 40m (was: 2.5h) > Temporary directory for WriteOperation may not be unique in FileBasedSink > > > Key: BEAM-7689 > URL: https://issues.apache.org/jira/browse/BEAM-7689 > Project: Beam > Issue Type: Bug > Components: io-java-files >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Blocker > Fix For: 2.7.1, 2.14.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > The temporary directory for a WriteOperation in FileBasedSink is generated from a > second-granularity timestamp (yyyy-MM-dd_HH-mm-ss) and a unique increasing > index. That granularity is not enough to make temporary directories unique > across different jobs. When two jobs share the same temporary directory, one > job may fail to produce an output file because a temporary file it needs can > be deleted by the other job. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (BEAM-7680) synthetic_pipeline_test.py flaky
[ https://issues.apache.org/jira/browse/BEAM-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883424#comment-16883424 ] Chamikara Jayalath commented on BEAM-7680: -- Sorry, this is from code I wrote several years back and I have lost context. I think the comment was about the test being time-based in general (and hence there is a possibility of flakes). > synthetic_pipeline_test.py flaky > > > Key: BEAM-7680 > URL: https://issues.apache.org/jira/browse/BEAM-7680 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Udi Meiri >Assignee: Kasia Kucharczyk >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > {code:java} > 11:51:43 FAIL: testSyntheticSDFStep > (apache_beam.testing.synthetic_pipeline_test.SyntheticPipelineTest) > 11:51:43 > -- > 11:51:43 Traceback (most recent call last): > 11:51:43 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/testing/synthetic_pipeline_test.py", > line 82, in testSyntheticSDFStep > 11:51:43 self.assertTrue(0.5 <= elapsed <= 3, elapsed) > 11:51:43 AssertionError: False is not true : 3.659700632095337{code} > [https://builds.apache.org/job/beam_PreCommit_Python_Cron/1502/consoleFull] > > Two flaky TODOs: > [https://github.com/apache/beam/blob/b79f24ced1c8519c29443ea7109c59ad18be2ebe/sdks/python/apache_beam/testing/synthetic_pipeline_test.py#L69-L82] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
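The failing assertion `0.5 <= elapsed <= 3` couples the test to wall-clock behavior on a shared CI machine. One common mitigation (a sketch of the general idea, not the actual fix adopted for this test) is to assert only the lower bound, which is the deterministic half of a sleep-based timing check:

```python
import time


def run_with_timing(fn):
    """Run `fn` and return the elapsed wall-clock seconds."""
    start = time.time()
    fn()
    return time.time() - start


def assert_at_least(elapsed, lower):
    # The lower bound is deterministic: a sleep-based step cannot
    # finish early. The upper bound is not: a loaded CI machine can
    # stretch it arbitrarily, which is exactly the 3.66s flake seen
    # in the log above.
    assert elapsed >= lower, elapsed


elapsed = run_with_timing(lambda: time.sleep(0.1))
assert_at_least(elapsed, 0.05)
```

Dropping the upper bound trades a small loss of coverage (a pathologically slow step passes) for immunity to machine-load flakes.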
[jira] [Work logged] (BEAM-7437) Integration Test for BQ streaming inserts for streaming pipelines
[ https://issues.apache.org/jira/browse/BEAM-7437?focusedWorklogId=275561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275561 ] ASF GitHub Bot logged work on BEAM-7437: Author: ASF GitHub Bot Created on: 12/Jul/19 00:11 Start Date: 12/Jul/19 00:11 Worklog Time Spent: 10m Work Description: udim commented on pull request #9044: [BEAM-7437] Raise RuntimeError for PY2 in BigqueryFullResultStreamingMatcher URL: https://github.com/apache/beam/pull/9044#discussion_r302787083 ## File path: sdks/python/apache_beam/io/gcp/tests/bigquery_matcher.py ## @@ -192,7 +193,10 @@ def _get_query_result(self, bigquery_client): if len(response) >= len(self.expected_data): return response time.sleep(1) -raise TimeoutError('Timeout exceeded for matcher.') +if six.PY3: Review comment: We've decided against using `six` in the Beam codebase. You can check for Python version using this code instead: ```suggestion if sys.version_info >= (3,): ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275561) Time Spent: 6h 40m (was: 6.5h) > Integration Test for BQ streaming inserts for streaming pipelines > - > > Key: BEAM-7437 > URL: https://issues.apache.org/jira/browse/BEAM-7437 > Project: Beam > Issue Type: Test > Components: io-python-gcp >Affects Versions: 2.12.0 >Reporter: Tanay Tummalapalli >Assignee: Tanay Tummalapalli >Priority: Minor > Labels: test > Time Spent: 6h 40m > Remaining Estimate: 0h > > Integration Test for BigQuery Sink using Streaming Inserts for streaming > pipelines. > Integration tests currently exist for batch pipelines, it can also be added > for streaming pipelines using TestStream. 
This will be a precursor to the failing integration test to be added for [BEAM-6611|https://issues.apache.org/jira/browse/BEAM-6611]. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
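The review thread above suggests replacing the `six.PY3` check with `sys.version_info >= (3,)`, since `TimeoutError` is a built-in only on Python 3. A minimal self-contained sketch of the pattern under discussion (the function and parameter names here are hypothetical, not the actual `BigqueryFullResultStreamingMatcher` code):

```python
import sys
import time


def wait_for_result(poll, expected_len, timeout_s=5):
    """Poll until at least expected_len rows arrive or the deadline passes."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        response = poll()
        if len(response) >= expected_len:
            return response
        time.sleep(0.1)
    if sys.version_info >= (3,):
        # TimeoutError is a built-in exception only on Python 3.
        raise TimeoutError('Timeout exceeded for matcher.')
    # Python 2 has no built-in TimeoutError; fall back to RuntimeError.
    raise RuntimeError('Timeout exceeded for matcher.')
```

Callers can catch `(TimeoutError, RuntimeError)` to stay portable across both interpreters.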
[jira] [Updated] (BEAM-7726) [Go SDK] State Backed Iterables
[ https://issues.apache.org/jira/browse/BEAM-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Burke updated BEAM-7726: --- Status: Open (was: Triage Needed) > [Go SDK] State Backed Iterables > --- > > Key: BEAM-7726 > URL: https://issues.apache.org/jira/browse/BEAM-7726 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Affects Versions: Not applicable >Reporter: Robert Burke >Assignee: Robert Burke >Priority: Major > Fix For: Not applicable > > > The Go SDK should support the State backed iterables protocol per the proto. > [https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L644] > > Primary case is for iterables after CoGBKs. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7729) Converting Nullable null values in BigQuery
[ https://issues.apache.org/jira/browse/BEAM-7729?focusedWorklogId=275554&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275554 ] ASF GitHub Bot logged work on BEAM-7729: Author: ASF GitHub Bot Created on: 11/Jul/19 23:47 Start Date: 11/Jul/19 23:47 Worklog Time Spent: 10m Work Description: riazela commented on issue #9045: [BEAM-7729] Fixes the bug by checking the value first before parsing it. URL: https://github.com/apache/beam/pull/9045#issuecomment-510693206 R: @akedin This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275554) Time Spent: 20m (was: 10m) > Converting Nullable null values in BigQuery > > > Key: BEAM-7729 > URL: https://issues.apache.org/jira/browse/BEAM-7729 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Alireza Samadianzakaria >Assignee: Alireza Samadianzakaria >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > In org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils#convertAvroFormat we do > not check if the value is null and we pass it to the other methods and if > that value is null, we will get a null pointer exception. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7729) Converting Nullable null values in BigQuery
[ https://issues.apache.org/jira/browse/BEAM-7729?focusedWorklogId=275553&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275553 ] ASF GitHub Bot logged work on BEAM-7729: Author: ASF GitHub Bot Created on: 11/Jul/19 23:47 Start Date: 11/Jul/19 23:47 Worklog Time Spent: 10m Work Description: riazela commented on pull request #9045: [BEAM-7729] Fixes the bug by checking the value first before parsing it. URL: https://github.com/apache/beam/pull/9045 [BEAM-7729] Fixes the bug by checking the value first before parsing it. The approach is similar to parsing the text table rows. (org.apache.beam.sdk.extensions.sql.impl.schema.BeamTableUtils#autoCastField) Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://b
[jira] [Commented] (BEAM-7680) synthetic_pipeline_test.py flaky
[ https://issues.apache.org/jira/browse/BEAM-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883404#comment-16883404 ] Pablo Estrada commented on BEAM-7680: - [~chamikara] you're mentioned in the TODOs, WDYT? > synthetic_pipeline_test.py flaky > > > Key: BEAM-7680 > URL: https://issues.apache.org/jira/browse/BEAM-7680 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Udi Meiri >Assignee: Kasia Kucharczyk >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > {code:java} > 11:51:43 FAIL: testSyntheticSDFStep > (apache_beam.testing.synthetic_pipeline_test.SyntheticPipelineTest) > 11:51:43 > -- > 11:51:43 Traceback (most recent call last): > 11:51:43 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/testing/synthetic_pipeline_test.py", > line 82, in testSyntheticSDFStep > 11:51:43 self.assertTrue(0.5 <= elapsed <= 3, elapsed) > 11:51:43 AssertionError: False is not true : 3.659700632095337{code} > [https://builds.apache.org/job/beam_PreCommit_Python_Cron/1502/consoleFull] > > Two flaky TODOs: > [https://github.com/apache/beam/blob/b79f24ced1c8519c29443ea7109c59ad18be2ebe/sdks/python/apache_beam/testing/synthetic_pipeline_test.py#L69-L82] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (BEAM-7729) Converting Nullable null values in BigQuery
[ https://issues.apache.org/jira/browse/BEAM-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alireza Samadianzakaria updated BEAM-7729: -- Description: In org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils#convertAvroFormat we do not check if the value is null and we pass it to the other methods and if that value is null, we will get a null pointer exception. (was: In org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils#convertAvroFormat we do not check if the value is null and we pass it to the other methods and if that value is null, we will get a null pointer exception. The fix will be simple, just check if the value is null and if the field is nullable, then return null in that case.) > Converting Nullable null values in BigQuery > > > Key: BEAM-7729 > URL: https://issues.apache.org/jira/browse/BEAM-7729 > Project: Beam > Issue Type: Bug > Components: io-java-gcp >Reporter: Alireza Samadianzakaria >Assignee: Alireza Samadianzakaria >Priority: Major > > In org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils#convertAvroFormat we do > not check if the value is null and we pass it to the other methods and if > that value is null, we will get a null pointer exception. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (BEAM-7729) Converting Nullable null values in BigQuery
Alireza Samadianzakaria created BEAM-7729: - Summary: Converting Nullable null values in BigQuery Key: BEAM-7729 URL: https://issues.apache.org/jira/browse/BEAM-7729 Project: Beam Issue Type: Bug Components: io-java-gcp Reporter: Alireza Samadianzakaria Assignee: Alireza Samadianzakaria In org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils#convertAvroFormat we do not check if the value is null and we pass it to the other methods and if that value is null, we will get a null pointer exception. The fix will be simple, just check if the value is null and if the field is nullable, then return null in that case. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
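The fix described lives in Java (`BigQueryUtils#convertAvroFormat`), but the guard logic itself is language-independent. A hypothetical Python sketch of the same check (names are illustrative, not the Beam API):

```python
def convert_avro_value(value, field_nullable, convert):
    """Convert an Avro value, guarding against nulls before conversion.

    A null value in a nullable field converts to None instead of being
    passed to the converter (which would otherwise fail, analogous to
    the NullPointerException described in the bug).
    """
    if value is None:
        if field_nullable:
            return None
        raise ValueError('Received null for a non-nullable field')
    return convert(value)
```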
[jira] [Commented] (BEAM-7463) Bigquery IO ITs are flaky: incorrect checksum
[ https://issues.apache.org/jira/browse/BEAM-7463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883398#comment-16883398 ] Pablo Estrada commented on BEAM-7463: - taking a look... > Bigquery IO ITs are flaky: incorrect checksum > - > > Key: BEAM-7463 > URL: https://issues.apache.org/jira/browse/BEAM-7463 > Project: Beam > Issue Type: Bug > Components: io-python-gcp >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Labels: currently-failing > Time Spent: 3h 40m > Remaining Estimate: 0h > > {noformat} > 15:03:38 FAIL: test_big_query_new_types > (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) > 15:03:38 > -- > 15:03:38 Traceback (most recent call last): > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py", > line 211, in test_big_query_new_types > 15:03:38 big_query_query_to_table_pipeline.run_bq_pipeline(options) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py", > line 82, in run_bq_pipeline > 15:03:38 result = p.run() > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/testing/test_pipeline.py", > line 107, in run > 15:03:38 else test_runner_api)) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py", > line 406, in run > 15:03:38 self._options).run(False) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py", > line 419, in run > 15:03:38 return self.runner.run_pipeline(self, self._options) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py", > 
line 51, in run_pipeline > 15:03:38 hc_assert_that(self.result, pickler.loads(on_success_matcher)) > 15:03:38 AssertionError: > 15:03:38 Expected: (Test pipeline expected terminated in state: DONE and > Expected checksum is 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214) > 15:03:38 but: Expected checksum is > 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214 Actual checksum is > da39a3ee5e6b4b0d3255bfef95601890afd80709 > {noformat} > [~Juta] could this be caused by changes to Bigquery matcher? > https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134 > > cc: [~pabloem] [~chamikara] [~apilloud] > A recent postcommit run has BQ failures in other tests as well: > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/1000/consoleFull -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (BEAM-7545) Row Count Estimation for CSV TextTable
[ https://issues.apache.org/jira/browse/BEAM-7545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alireza Samadianzakaria resolved BEAM-7545. --- Resolution: Done Fix Version/s: Not applicable > Row Count Estimation for CSV TextTable > -- > > Key: BEAM-7545 > URL: https://issues.apache.org/jira/browse/BEAM-7545 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Alireza Samadianzakaria >Assignee: Alireza Samadianzakaria >Priority: Major > Fix For: Not applicable > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Implementing Row Count Estimation for CSV Tables by reading the first few > lines of the file and estimating the number of records based on the length of > these lines and the total length of the file. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
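The estimation technique resolved above (read the first few lines, then extrapolate from the average line length and the total file size) can be sketched as follows. This is an illustrative approximation, not the Beam SQL implementation:

```python
import os


def estimate_row_count(path, sample_lines=10):
    """Estimate rows in a CSV by extrapolating from the first few lines."""
    total_bytes = os.path.getsize(path)
    sampled = []
    with open(path, 'rb') as f:
        for _ in range(sample_lines):
            line = f.readline()
            if not line:
                break
            sampled.append(len(line))
    if not sampled:
        return 0
    # Average bytes per sampled line, scaled up to the whole file.
    avg_line_bytes = sum(sampled) / float(len(sampled))
    return int(total_bytes / avg_line_bytes)
```

The estimate is exact for uniform rows and degrades gracefully for variable-length rows; it never reads more than the sampled prefix, which is the point for large files.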
[jira] [Created] (BEAM-7728) Support ParquetTable in SQL
Kai Jiang created BEAM-7728: --- Summary: Support ParquetTable in SQL Key: BEAM-7728 URL: https://issues.apache.org/jira/browse/BEAM-7728 Project: Beam Issue Type: New Feature Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (BEAM-7463) Bigquery IO ITs are flaky: incorrect checksum
[ https://issues.apache.org/jira/browse/BEAM-7463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883391#comment-16883391 ] Valentyn Tymofieiev commented on BEAM-7463: --- [~pabloem] could you please take a look into the BQ suites? Your expertise in the BQ connector may help us track this down faster. [~Juta], do you remember why we removed sorting in the Bigquery matcher? See: https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134. Does BQ guarantee some order of returned results? If not, that may very well cause the checksum errors, but there are also other errors with this test, like "table not found" and failures where the "but" field is empty; perhaps we should investigate those separately. Also, to echo some points from the conversation on https://github.com/apache/beam/pull/8751 for visibility: class attributes assigned in setup/teardown methods are not shared between executions of different test cases defined in the same test class, so the race of table cleanup is not obvious to me. 
> Bigquery IO ITs are flaky: incorrect checksum > - > > Key: BEAM-7463 > URL: https://issues.apache.org/jira/browse/BEAM-7463 > Project: Beam > Issue Type: Bug > Components: io-python-gcp >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Labels: currently-failing > Time Spent: 3h 40m > Remaining Estimate: 0h > > {noformat} > 15:03:38 FAIL: test_big_query_new_types > (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) > 15:03:38 > -- > 15:03:38 Traceback (most recent call last): > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py", > line 211, in test_big_query_new_types > 15:03:38 big_query_query_to_table_pipeline.run_bq_pipeline(options) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py", > line 82, in run_bq_pipeline > 15:03:38 result = p.run() > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/testing/test_pipeline.py", > line 107, in run > 15:03:38 else test_runner_api)) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py", > line 406, in run > 15:03:38 self._options).run(False) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py", > line 419, in run > 15:03:38 return self.runner.run_pipeline(self, self._options) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py", > line 51, in run_pipeline > 15:03:38 hc_assert_that(self.result, pickler.loads(on_success_matcher)) > 15:03:38 AssertionError: > 15:03:38 Expected: (Test pipeline expected terminated in state: DONE and > Expected checksum is 
24de460c4d344a4b77ccc4cc1acb7b7ffc11a214) > 15:03:38 but: Expected checksum is > 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214 Actual checksum is > da39a3ee5e6b4b0d3255bfef95601890afd80709 > {noformat} > [~Juta] could this be caused by changes to Bigquery matcher? > https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134 > > cc: [~pabloem] [~chamikara] [~apilloud] > A recent postcommit run has BQ failures in other tests as well: > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/1000/consoleFull -- This message was sent by Atlassian JIRA (v7.6.14#76016)
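One hypothesis in the thread above is that BigQuery does not guarantee result ordering, so a checksum computed over rows in returned order can flake. A hedged sketch of an order-insensitive checksum (sort a canonical representation of each row before hashing; the function name is illustrative, not the Beam matcher):

```python
import hashlib


def order_insensitive_checksum(rows):
    """SHA-1 over a sorted canonical form, so row order cannot affect it."""
    canonical = sorted(repr(tuple(row)) for row in rows)
    return hashlib.sha1('\n'.join(canonical).encode('utf-8')).hexdigest()
```

With this shape, two result sets containing the same rows in different orders hash identically, which is the property the matcher needs if the backend shuffles results.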
[jira] [Assigned] (BEAM-7463) Bigquery IO ITs are flaky: incorrect checksum
[ https://issues.apache.org/jira/browse/BEAM-7463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valentyn Tymofieiev reassigned BEAM-7463: - Assignee: Pablo Estrada > Bigquery IO ITs are flaky: incorrect checksum > - > > Key: BEAM-7463 > URL: https://issues.apache.org/jira/browse/BEAM-7463 > Project: Beam > Issue Type: Bug > Components: io-python-gcp >Reporter: Valentyn Tymofieiev >Assignee: Pablo Estrada >Priority: Major > Labels: currently-failing > Time Spent: 3h 40m > Remaining Estimate: 0h > > {noformat} > 15:03:38 FAIL: test_big_query_new_types > (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) > 15:03:38 > -- > 15:03:38 Traceback (most recent call last): > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py", > line 211, in test_big_query_new_types > 15:03:38 big_query_query_to_table_pipeline.run_bq_pipeline(options) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py", > line 82, in run_bq_pipeline > 15:03:38 result = p.run() > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/testing/test_pipeline.py", > line 107, in run > 15:03:38 else test_runner_api)) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py", > line 406, in run > 15:03:38 self._options).run(False) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py", > line 419, in run > 15:03:38 return self.runner.run_pipeline(self, self._options) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py", > line 51, in run_pipeline > 15:03:38 
hc_assert_that(self.result, pickler.loads(on_success_matcher)) > 15:03:38 AssertionError: > 15:03:38 Expected: (Test pipeline expected terminated in state: DONE and > Expected checksum is 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214) > 15:03:38 but: Expected checksum is > 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214 Actual checksum is > da39a3ee5e6b4b0d3255bfef95601890afd80709 > {noformat} > [~Juta] could this be caused by changes to Bigquery matcher? > https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134 > > cc: [~pabloem] [~chamikara] [~apilloud] > A recent postcommit run has BQ failures in other tests as well: > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/1000/consoleFull -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (BEAM-7689) Temporary directory for WriteOperation may not be unique in FileBaseSink
[ https://issues.apache.org/jira/browse/BEAM-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883375#comment-16883375 ] Anton Kedin commented on BEAM-7689: --- Fails in Dataflow examples test > Temporary directory for WriteOperation may not be unique in FileBaseSink > > > Key: BEAM-7689 > URL: https://issues.apache.org/jira/browse/BEAM-7689 > Project: Beam > Issue Type: Bug > Components: io-java-files >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Blocker > Fix For: 2.7.1, 2.14.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Temporary directory for WriteOperation in FileBasedSink is generated from a > second-granularity timestamp (-MM-dd_HH-mm-ss) and unique increasing > index. Such granularity is not enough to make temporary directories unique > between different jobs. When two jobs share the same temporary directory, > output file may not be produced in one job since the required temporary file > can be deleted from another job. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
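The bug described above is that a second-granularity timestamp (yyyy-MM-dd_HH-mm-ss) plus an increasing index is not unique across concurrently launched jobs. A common remedy, sketched here in Python with hypothetical names (the actual fix is in Java's `FileBasedSink`), is to append a random component such as a UUID:

```python
import uuid
from datetime import datetime


def unique_temp_dir_name(prefix='.temp-beam'):
    """Build a temp-dir name that is unique even across concurrent jobs.

    The timestamp keeps names human-readable; the UUID suffix makes a
    collision between jobs started in the same second vanishingly unlikely.
    """
    stamp = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
    return '{}-{}-{}'.format(prefix, stamp, uuid.uuid4().hex)
```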
[jira] [Work logged] (BEAM-7437) Integration Test for BQ streaming inserts for streaming pipelines
[ https://issues.apache.org/jira/browse/BEAM-7437?focusedWorklogId=275535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275535 ] ASF GitHub Bot logged work on BEAM-7437: Author: ASF GitHub Bot Created on: 11/Jul/19 22:27 Start Date: 11/Jul/19 22:27 Worklog Time Spent: 10m Work Description: ttanay commented on issue #9044: [BEAM-7437] Raise RuntimeError for PY2 in BigqueryFullResultStreamingMatcher URL: https://github.com/apache/beam/pull/9044#issuecomment-510676748 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275535) Time Spent: 6.5h (was: 6h 20m) > Integration Test for BQ streaming inserts for streaming pipelines > - > > Key: BEAM-7437 > URL: https://issues.apache.org/jira/browse/BEAM-7437 > Project: Beam > Issue Type: Test > Components: io-python-gcp >Affects Versions: 2.12.0 >Reporter: Tanay Tummalapalli >Assignee: Tanay Tummalapalli >Priority: Minor > Labels: test > Time Spent: 6.5h > Remaining Estimate: 0h > > Integration Test for BigQuery Sink using Streaming Inserts for streaming > pipelines. > Integration tests currently exist for batch pipelines, it can also be added > for streaming pipelines using TestStream. This will be a precursor to the > failing integration test to be added for [BEAM-6611| > https://issues.apache.org/jira/browse/BEAM-6611]. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (BEAM-6783) byte[] breaks in BeamSQL codegen
[ https://issues.apache.org/jira/browse/BEAM-6783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang reassigned BEAM-6783: -- Assignee: (was: Andrew Pilloud) > byte[] breaks in BeamSQL codegen > > > Key: BEAM-6783 > URL: https://issues.apache.org/jira/browse/BEAM-6783 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Rui Wang >Priority: Major > > Calcite will call `byte[].toString` because BeamSQL codegen read byte[] from > Row to calcite (see: > https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java#L334). > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7437) Integration Test for BQ streaming inserts for streaming pipelines
[ https://issues.apache.org/jira/browse/BEAM-7437?focusedWorklogId=275534&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275534 ] ASF GitHub Bot logged work on BEAM-7437: Author: ASF GitHub Bot Created on: 11/Jul/19 22:20 Start Date: 11/Jul/19 22:20 Worklog Time Spent: 10m Work Description: ttanay commented on issue #9044: [BEAM-7437] Raise RuntimeError for PY2 in BigqueryFullResultStreamingMatcher URL: https://github.com/apache/beam/pull/9044#issuecomment-510675038 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275534) Time Spent: 6h 20m (was: 6h 10m) > Integration Test for BQ streaming inserts for streaming pipelines > - > > Key: BEAM-7437 > URL: https://issues.apache.org/jira/browse/BEAM-7437 > Project: Beam > Issue Type: Test > Components: io-python-gcp >Affects Versions: 2.12.0 >Reporter: Tanay Tummalapalli >Assignee: Tanay Tummalapalli >Priority: Minor > Labels: test > Time Spent: 6h 20m > Remaining Estimate: 0h > > Integration Test for BigQuery Sink using Streaming Inserts for streaming > pipelines. > Integration tests currently exist for batch pipelines, it can also be added > for streaming pipelines using TestStream. This will be a precursor to the > failing integration test to be added for [BEAM-6611| > https://issues.apache.org/jira/browse/BEAM-6611]. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (BEAM-7548) test_approximate_unique_global_by_error is flaky
[ https://issues.apache.org/jira/browse/BEAM-7548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hannah Jiang resolved BEAM-7548. Resolution: Fixed Fix Version/s: 2.14.0 > test_approximate_unique_global_by_error is flaky > > > Key: BEAM-7548 > URL: https://issues.apache.org/jira/browse/BEAM-7548 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures >Reporter: Valentyn Tymofieiev >Assignee: Hannah Jiang >Priority: Major > Fix For: 2.14.0 > > Time Spent: 9h 10m > Remaining Estimate: 0h > > The error happened on Jenkins in Python 3.5 suite, which currently uses > Python 3.5.2 interpreter: > {noformat} > 11:57:47 > == > 11:57:47 ERROR: test_approximate_unique_global_by_error > (apache_beam.transforms.stats_test.ApproximateUniqueTest) > 11:57:47 > -- > 11:57:47 Traceback (most recent call last): > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/transforms/stats_test.py", > line 236, in test_approximate_unique_global_by_error > 11:57:47 pipeline.run() > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/testing/test_pipeline.py", > line 107, in run > 11:57:47 else test_runner_api)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/pipeline.py", > line 406, in run > 11:57:47 self._options).run(False) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/pipeline.py", > line 419, in run > 11:57:47 return self.runner.run_pipeline(self, self._options) > 11:57:47 File > 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/direct/direct_runner.py", > line 128, in run_pipeline > 11:57:47 return runner.run_pipeline(pipeline, options) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 289, in run_pipeline > 11:57:47 default_environment=self._default_environment)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 293, in run_via_runner_api > 11:57:47 return self.run_stages(*self.create_stages(pipeline_proto)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 369, in run_stages > 11:57:47 stage_context.safe_coders) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 531, in run_stage > 11:57:47 data_input, data_output) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 1235, in process_bundle > 11:57:47 result_future = > self._controller.control_handler.push(process_bundle) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 851, in push > 11:57:47 response = self.worker.do_instruction(request) > 11:57:47 File > 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 342, in do_instruction > 11:57:47 request.instruction_id) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/worker/sdk_worker.py", > line 368, in process_bundle > 11:57:47 bundle_processor.process_bundle(instruction_id)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/worker/bundle_processor.py",
[jira] [Work logged] (BEAM-7437) Integration Test for BQ streaming inserts for streaming pipelines
[ https://issues.apache.org/jira/browse/BEAM-7437?focusedWorklogId=275527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275527 ] ASF GitHub Bot logged work on BEAM-7437: Author: ASF GitHub Bot Created on: 11/Jul/19 21:58 Start Date: 11/Jul/19 21:58 Worklog Time Spent: 10m Work Description: ttanay commented on pull request #9044: [BEAM-7437] Raise RuntimeError for PY2 in BigqueryFullResultStreamingMatcher URL: https://github.com/apache/beam/pull/9044 TimeoutError is not a built-in exception in Python 2. Raise RuntimeError in Python 2. The tests passed in the original PR[1], but the PreCommit started failing today because of this. [1] https://github.com/apache/beam/pull/8934 **Please** add a meaningful description for your change here Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
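The substance of PR #9044 is that `TimeoutError` exists as a builtin only on Python 3, so the matcher's timeout must raise a different exception on Python 2. A minimal sketch of the pattern (function name and polling details are illustrative, not the exact Beam matcher code):

```python
import sys
import time


def wait_for_result(poll, timeout_secs=60):
    """Poll until `poll()` returns a non-None result or the deadline passes.

    TimeoutError is a builtin only on Python 3, so fall back to RuntimeError
    on Python 2 -- the substance of this PR's change.
    """
    deadline = time.time() + timeout_secs
    while time.time() < deadline:
        result = poll()
        if result is not None:
            return result
        time.sleep(1)
    if sys.version_info[0] >= 3:
        raise TimeoutError('Timeout exceeded for matcher.')
    raise RuntimeError('Timeout exceeded for matcher.')
```

The Beam code gates on `six.PY3` instead of `sys.version_info`, but the branch structure is the same.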
[jira] [Work logged] (BEAM-7689) Temporary directory for WriteOperation may not be unique in FileBasedSink
[ https://issues.apache.org/jira/browse/BEAM-7689?focusedWorklogId=275523&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275523 ] ASF GitHub Bot logged work on BEAM-7689: Author: ASF GitHub Bot Created on: 11/Jul/19 21:47 Start Date: 11/Jul/19 21:47 Worklog Time Spent: 10m Work Description: ihji commented on issue #9039: [BEAM-7689] cherry-picking for 2.14.0 URL: https://github.com/apache/beam/pull/9039#issuecomment-510666736 Run Java_Examples_Dataflow PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275523) Time Spent: 2.5h (was: 2h 20m) > Temporary directory for WriteOperation may not be unique in FileBasedSink > > > Key: BEAM-7689 > URL: https://issues.apache.org/jira/browse/BEAM-7689 > Project: Beam > Issue Type: Bug > Components: io-java-files >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Blocker > Fix For: 2.7.1, 2.14.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > The temporary directory for a WriteOperation in FileBasedSink is generated from a > second-granularity timestamp (yyyy-MM-dd_HH-mm-ss) and a unique increasing > index. Such granularity is not enough to make temporary directories unique > between different jobs. When two jobs share the same temporary directory, > output files may not be produced in one job, since the required temporary files > can be deleted by another job. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
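The bug above boils down to two jobs starting within the same wall-clock second and computing the same temporary path. A common remedy, sketched here under assumed names (`temp_dir_name` is illustrative, not the Beam API), is to append a random component so collisions become practically impossible:

```python
import uuid
from datetime import datetime


def temp_dir_name(base_dir):
    """Build a per-job temp directory name.

    A second-granularity timestamp alone collides when two jobs start within
    the same second; appending a random UUID guarantees uniqueness even then.
    """
    stamp = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
    return '%s/.temp-beam-%s-%s' % (base_dir, stamp, uuid.uuid4().hex)
```

Two calls in the same second now yield distinct directories, which is exactly the property the second-granularity scheme lacked.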
[jira] [Resolved] (BEAM-7727) Python RestrictionTracker should have a synchronization wrapper like Java
[ https://issues.apache.org/jira/browse/BEAM-7727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boyuan Zhang resolved BEAM-7727. Resolution: Duplicate Fix Version/s: Not applicable Duplicate of https://issues.apache.org/jira/browse/BEAM-7473. > Python RestrictionTracker should have a synchronization wrapper like Java > - > > Key: BEAM-7727 > URL: https://issues.apache.org/jira/browse/BEAM-7727 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: Major > Fix For: Not applicable > > > Currently, Python RestrictionTracker implementations must manage their own > synchronization, which is not easy for SDK users to implement. The Python SDK > should provide a wrapper like Java's to help manage synchronization: > https://github.com/apache/beam/pull/6467 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (BEAM-7727) Python RestrictionTracker should have a synchronization wrapper like Java
[ https://issues.apache.org/jira/browse/BEAM-7727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boyuan Zhang updated BEAM-7727: --- Status: Open (was: Triage Needed) > Python RestrictionTracker should have a synchronization wrapper like Java > - > > Key: BEAM-7727 > URL: https://issues.apache.org/jira/browse/BEAM-7727 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: Major > > Currently, Python RestrictionTracker implementations must manage their own > synchronization, which is not easy for SDK users to implement. The Python SDK > should provide a wrapper like Java's to help manage synchronization: > https://github.com/apache/beam/pull/6467 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (BEAM-7727) Python RestrictionTracker should have a synchronization wrapper like Java
[ https://issues.apache.org/jira/browse/BEAM-7727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boyuan Zhang reassigned BEAM-7727: -- Assignee: Boyuan Zhang > Python RestrictionTracker should have a synchronization wrapper like Java > - > > Key: BEAM-7727 > URL: https://issues.apache.org/jira/browse/BEAM-7727 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: Major > > Currently, Python RestrictionTracker implementations must manage their own > synchronization, which is not easy for SDK users to implement. The Python SDK > should provide a wrapper like Java's to help manage synchronization: > https://github.com/apache/beam/pull/6467 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (BEAM-7727) Python RestrictionTracker should have a synchronization wrapper like Java
Boyuan Zhang created BEAM-7727: -- Summary: Python RestrictionTracker should have a synchronization wrapper like Java Key: BEAM-7727 URL: https://issues.apache.org/jira/browse/BEAM-7727 Project: Beam Issue Type: Improvement Components: sdk-py-core Reporter: Boyuan Zhang Currently, Python RestrictionTracker implementations must manage their own synchronization, which is not easy for SDK users to implement. The Python SDK should provide a wrapper like Java's to help manage synchronization: https://github.com/apache/beam/pull/6467 -- This message was sent by Atlassian JIRA (v7.6.14#76016)
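The Java precedent linked above wraps a user-supplied tracker so that all access is serialized behind a single lock, relieving tracker authors of synchronization concerns. A minimal Python sketch of that shape (class and method names are illustrative, mirroring the Java design rather than any shipped Beam class):

```python
import threading


class ThreadsafeRestrictionTracker(object):
    """Serialize all access to an underlying restriction tracker.

    The wrapped tracker can be written without any locking; every call is
    funneled through one reentrant lock held by this wrapper.
    """

    def __init__(self, underlying):
        self._underlying = underlying
        self._lock = threading.RLock()

    def try_claim(self, position):
        with self._lock:
            return self._underlying.try_claim(position)

    def current_restriction(self):
        with self._lock:
            return self._underlying.current_restriction()
```

In a real SDK the wrapper would cover every tracker method (`try_split`, `checkpoint`, etc.) the same way; two methods suffice to show the pattern.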
[jira] [Created] (BEAM-7726) [Go SDK] State Backed Iterables
Robert Burke created BEAM-7726: -- Summary: [Go SDK] State Backed Iterables Key: BEAM-7726 URL: https://issues.apache.org/jira/browse/BEAM-7726 Project: Beam Issue Type: Improvement Components: sdk-go Affects Versions: Not applicable Reporter: Robert Burke Assignee: Robert Burke Fix For: Not applicable The Go SDK should support the state-backed iterables protocol, per the proto: [https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L644] The primary use case is iterables after CoGBKs. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
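The point of state-backed iterables is that very large grouped values (e.g. after a CoGBK) are not materialized up front; the SDK harness pages them lazily out of runner state using continuation tokens. A language-neutral sketch of that access pattern (`fetch_page` is an assumed callable standing in for the state API, taking a token and returning `(elements, next_token)` with `next_token` of `None` on the final page):

```python
def state_backed_iterable(fetch_page):
    """Yield elements lazily, one page of runner state at a time.

    Only one page of data is resident at once, which is what makes
    iterables larger than memory workable after a CoGBK.
    """
    token = None
    while True:
        elements, token = fetch_page(token)
        for element in elements:
            yield element
        if token is None:
            return
```

The Go SDK work tracked here implements this paging against the actual Fn API state protocol rather than a plain callable.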
[jira] [Resolved] (BEAM-6825) Improve pipeline construction time error messages in Go SDK.
[ https://issues.apache.org/jira/browse/BEAM-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Oliveira resolved BEAM-6825. --- Resolution: Fixed Fix Version/s: Not applicable Forgot to update this bug, but this effort has been complete for about a month. > Improve pipeline construction time error messages in Go SDK. > > > Key: BEAM-6825 > URL: https://issues.apache.org/jira/browse/BEAM-6825 > Project: Beam > Issue Type: New Feature > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Fix For: Not applicable > > Time Spent: 1.5h > Remaining Estimate: 0h > > Many error messages for common pipeline construction mistakes are unclear and > unhelpful. They need to be improved to provide more context, especially for > newer users. This bug tracks these error message improvements. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (BEAM-7463) Bigquery IO ITs are flaky: incorrect checksum
[ https://issues.apache.org/jira/browse/BEAM-7463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valentyn Tymofieiev reassigned BEAM-7463: - Assignee: (was: Juta Staes) > Bigquery IO ITs are flaky: incorrect checksum > - > > Key: BEAM-7463 > URL: https://issues.apache.org/jira/browse/BEAM-7463 > Project: Beam > Issue Type: Bug > Components: io-python-gcp >Reporter: Valentyn Tymofieiev >Priority: Major > Labels: currently-failing > Time Spent: 3h 40m > Remaining Estimate: 0h > > {noformat} > 15:03:38 FAIL: test_big_query_new_types > (apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) > 15:03:38 > -- > 15:03:38 Traceback (most recent call last): > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py", > line 211, in test_big_query_new_types > 15:03:38 big_query_query_to_table_pipeline.run_bq_pipeline(options) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/io/gcp/big_query_query_to_table_pipeline.py", > line 82, in run_bq_pipeline > 15:03:38 result = p.run() > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/testing/test_pipeline.py", > line 107, in run > 15:03:38 else test_runner_api)) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py", > line 406, in run > 15:03:38 self._options).run(False) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/pipeline.py", > line 419, in run > 15:03:38 return self.runner.run_pipeline(self, self._options) > 15:03:38 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python3_Verify/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py", > line 51, in run_pipeline > 15:03:38 
hc_assert_that(self.result, pickler.loads(on_success_matcher)) > 15:03:38 AssertionError: > 15:03:38 Expected: (Test pipeline expected terminated in state: DONE and > Expected checksum is 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214) > 15:03:38 but: Expected checksum is > 24de460c4d344a4b77ccc4cc1acb7b7ffc11a214 Actual checksum is > da39a3ee5e6b4b0d3255bfef95601890afd80709 > {noformat} > [~Juta] could this be caused by changes to Bigquery matcher? > https://github.com/apache/beam/pull/8621/files#diff-f1ec7e3a3e7e2e5082ddb7043954c108R134 > > cc: [~pabloem] [~chamikara] [~apilloud] > A recent postcommit run has BQ failures in other tests as well: > https://builds.apache.org/job/beam_PostCommit_Python3_Verify/1000/consoleFull -- This message was sent by Atlassian JIRA (v7.6.14#76016)
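One diagnostic detail worth noting in the failure above: the reported actual checksum, `da39a3ee5e6b4b0d3255bfef95601890afd80709`, is the SHA-1 digest of zero bytes. That suggests the verifier hashed an empty result set, i.e. the output table contained no rows when checked, rather than rows with wrong values. This is easy to confirm:

```python
import hashlib

# SHA-1 of empty input -- the well-known "empty" digest.
empty_sha1 = hashlib.sha1(b'').hexdigest()
```

So the flakiness here looks like a timing/visibility problem (results not yet present when verified) rather than a data-correctness one; that framing may help whoever picks this up.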
[jira] [Work logged] (BEAM-7484) Throughput collection in BigQuery performance tests
[ https://issues.apache.org/jira/browse/BEAM-7484?focusedWorklogId=275515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275515 ] ASF GitHub Bot logged work on BEAM-7484: Author: ASF GitHub Bot Created on: 11/Jul/19 21:00 Start Date: 11/Jul/19 21:00 Worklog Time Spent: 10m Work Description: udim commented on pull request #8766: [BEAM-7484] Metrics collection in BigQuery perf tests URL: https://github.com/apache/beam/pull/8766#discussion_r302695476 ## File path: sdks/python/apache_beam/io/gcp/bigquery_read_perf_test.py ## @@ -126,9 +133,21 @@ def format_record(record): p.run().wait_until_finish() def test(self): +def extract_values(row): + """Extracts value from a row.""" + yield base64.b64decode(row.values()[0]) + self.result = (self.pipeline | 'Read from BigQuery' >> Read(BigQuerySource( dataset=self.input_dataset, table=self.input_table)) + | 'Measure bytes' >> ParDo(MeasureBytes( + self.metrics_namespace, extract_values)) + | 'Count messages' >> ParDo(CountMessages( + self.metrics_namespace)) + | 'Measure time: Start' >> ParDo(MeasureTime( + self.metrics_namespace)) + | 'Measure time: End' >> ParDo(MeasureTime( Review comment: I don't understand what the difference is between Start and End versions of MeasureTime. From what I understand, you're measuring throughput, like bytes per second and messages per second. What time values do you extract from these metrics? Total running time? Momentary running time? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275515) Time Spent: 2h 10m (was: 2h) > Throughput collection in BigQuery performance tests > --- > > Key: BEAM-7484 > URL: https://issues.apache.org/jira/browse/BEAM-7484 > Project: Beam > Issue Type: New Feature > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > The goal is to collect bytes/time and messages/time metrics in BQ read and > write tests in Python SDK. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
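The reviewer's question about the 'Start' and 'End' `MeasureTime` steps comes down to how a distributed throughput number is derived: each worker records timestamps, and the runtime used for throughput is the span from the earliest recorded start to the latest recorded end. A sketch of that computation (function and parameter names are illustrative, not the Beam metrics API):

```python
def throughput(total_bytes, start_times, end_times):
    """Compute bytes/second from per-worker timing samples.

    Runtime is the span between the earliest recorded start and the latest
    recorded end across all workers; throughput is total bytes over that span.
    """
    runtime = max(end_times) - min(start_times)
    return float(total_bytes) / runtime
```

Placing one timing step before the measured section and one after bounds the section from both sides, which is presumably why two `MeasureTime` instances appear in the pipeline.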
[jira] [Work logged] (BEAM-7424) Retry HTTP 429 errors from GCS w/ exponential backoff when reading data
[ https://issues.apache.org/jira/browse/BEAM-7424?focusedWorklogId=275505&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275505 ] ASF GitHub Bot logged work on BEAM-7424: Author: ASF GitHub Bot Created on: 11/Jul/19 20:18 Start Date: 11/Jul/19 20:18 Worklog Time Spent: 10m Work Description: akedin commented on pull request #9014: [BEAM-7424] cherry-picking for 2.14.0 URL: https://github.com/apache/beam/pull/9014 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275505) Time Spent: 5h 50m (was: 5h 40m) > Retry HTTP 429 errors from GCS w/ exponential backoff when reading data > --- > > Key: BEAM-7424 > URL: https://issues.apache.org/jira/browse/BEAM-7424 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, io-python-gcp, sdk-py-core >Reporter: Chamikara Jayalath >Assignee: Heejong Lee >Priority: Blocker > Fix For: 2.14.0 > > Time Spent: 5h 50m > Remaining Estimate: 0h > > This has to be done for both Java and Python SDKs. > Seems like Java SDK already retries 429 errors w/o backoff (please verify): > [https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/RetryHttpRequestInitializer.java#L185] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
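The behavior this issue asks for, retrying rate-limit responses with exponential backoff rather than immediately, follows a standard shape. A hedged sketch (the exception class and helper are illustrative stand-ins for an HTTP 429 response, not Beam's `RetryHttpRequestInitializer`):

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for an HTTP 429 (Too Many Requests) response."""


def retry_with_backoff(fn, max_attempts=5, base_delay=0.01):
    """Call fn(), retrying rate-limit errors with exponential backoff.

    The delay doubles on each attempt, with random jitter added so that many
    clients backing off together do not retry in lockstep; the error is
    re-raised once the final attempt fails.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

The Java note in the issue suggests retries already happen there but without the backoff; the sketch shows what adding the backoff changes.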
[jira] [Work logged] (BEAM-6611) A Python Sink for BigQuery with File Loads in Streaming
[ https://issues.apache.org/jira/browse/BEAM-6611?focusedWorklogId=275504&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275504 ] ASF GitHub Bot logged work on BEAM-6611: Author: ASF GitHub Bot Created on: 11/Jul/19 20:16 Start Date: 11/Jul/19 20:16 Worklog Time Spent: 10m Work Description: ttanay commented on issue #8871: [BEAM-6611] BigQuery file loads in Streaming for Python SDK URL: https://github.com/apache/beam/pull/8871#issuecomment-510637699 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275504) Time Spent: 2h 20m (was: 2h 10m) > A Python Sink for BigQuery with File Loads in Streaming > --- > > Key: BEAM-6611 > URL: https://issues.apache.org/jira/browse/BEAM-6611 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Tanay Tummalapalli >Priority: Major > Labels: gsoc, gsoc2019, mentor > Time Spent: 2h 20m > Remaining Estimate: 0h > > The Java SDK supports a number of methods for writing data into BigQuery, > while the Python SDK supports the following: > - Streaming inserts for streaming pipelines [As seen in [bigquery.py and > BigQueryWriteFn|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L649-L813]] > - File loads for batch pipelines [As implemented in [PR > 7655|https://github.com/apache/beam/pull/7655]] > Quick and dirty early design doc: https://s.apache.org/beam-bqfl-py-streaming > The Java SDK also supports File Loads for Streaming pipelines [see BatchLoads > application|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1709-L1776]. 
> File loads have the advantage of being much cheaper than streaming inserts > (although they also are slower for the records to show up in the table). -- This message was sent by Atlassian JIRA (v7.6.14#76016)
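The latency side of the cost/latency trade-off described above can be made concrete: with file loads in streaming mode, buffered rows only become visible at the next periodic load-job trigger, whereas streaming inserts surface each row almost immediately. A small sketch of that effect (`triggering_frequency_secs` here is an assumed illustrative parameter, echoing but not reproducing the real sink's option):

```python
def flush_schedule(arrival_secs, triggering_frequency_secs):
    """Map each row's arrival time to the load-job trigger that publishes it.

    Rows arriving anywhere inside a trigger window all wait until that
    window's end -- the visibility latency file loads trade for lower cost.
    """
    return [
        (int(t) // triggering_frequency_secs + 1) * triggering_frequency_secs
        for t in arrival_secs
    ]
```

A row arriving one second after a trigger fires waits nearly a full period, which is the "slower for the records to show up" behavior the issue mentions.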
[jira] [Resolved] (BEAM-7621) Selecting null row fields causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Pilloud resolved BEAM-7621. -- Resolution: Fixed > Selecting null row field's causes Null pointer Exception with BeamSql > - > > Key: BEAM-7621 > URL: https://issues.apache.org/jira/browse/BEAM-7621 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.14.0 >Reporter: Vishwas >Priority: Major > Fix For: 2.16.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > > I have below schema of the row: > private static final Schema innerRowWithArraySchema = > Schema.builder() > .addStringField("string_field") > .addArrayField("array_field", FieldType.INT64) > .build(); > private static final Schema nullableNestedRowWithArraySchema = > Schema.builder() > > .addNullableField("field1",FieldType.row(innerRowWithArraySchema)) > .addNullableField("field2", > FieldType.array(FieldType.row(innerRowWithArraySchema))) > .build(); > > *// Create a row with null values* > Row nullRow = Row.nullRow(nullableNestedRowWithArraySchema); > > Now when we try to select nested row field's NPE is thrown: > .apply(SqlTransform.query("select PCOLLECTION.field1.string_field as > row_string_field, PCOLLECTION.field2[2].string_field as array_string_field > from PCOLLECTION")); > > Below is the exception thrown: > Caused by: java.lang.RuntimeException: CalcFn failed to evaluate: { > final org.apache.beam.sdk.values.Row current = > (org.apache.beam.sdk.values.Row) c.element(); > *final java.util.List inp1_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getRow(1));* > *final java.util.List inp2_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getArray(2));* > > c.output(org.apache.beam.sdk.values.Row.withSchema(outputSchema).addValue(inp1_ > == null ? 
(String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(inp1_, > 0, > "string_field")).addValue(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2) == null ? (String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2), 0, "string_field")).build()); > } > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$CalcFn.processElement(BeamCalcRel.java:253) > Caused by: java.lang.NullPointerException > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$WrappedList.of(BeamCalcRel.java:459) > at SC.eval0(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > In BeamCalcRel.class *field* method, null values are not handled for > composite types. > public Expression field(BlockBuilder list, int index, Type storageType) { > > } else if (fromType.getTypeName().isCompositeType() > || (fromType.getTypeName().isCollectionType() > && > fromType.getCollectionElementType().getTypeName().isCompositeType())) { > field = Expressions.call(WrappedList.class, "of", field); // > List.of() is passed with null values > } > ... > } > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
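The root cause identified above is that the generated `WrappedList.of(...)` call receives a null for an unset nullable composite field and dereferences it. The shape of the fix is a null guard before wrapping; a Python analogue, purely illustrative of the Java change rather than a copy of it:

```python
def wrapped_list_of(values):
    """Wrap a composite field's values in a list, tolerating null input.

    When a nullable row/array field is unset, the generated accessor yields
    None (null in Java); propagate it instead of raising, mirroring how the
    SQL layer then emits NULL for the selected field.
    """
    if values is None:
        return None
    return list(values)
```

With the guard in place, `SELECT`ing a field out of a null row yields NULL rather than crashing the `CalcFn`.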
[jira] [Updated] (BEAM-7621) Selecting null row fields causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Pilloud updated BEAM-7621: - Fix Version/s: 2.16.0 > Selecting null row field's causes Null pointer Exception with BeamSql > - > > Key: BEAM-7621 > URL: https://issues.apache.org/jira/browse/BEAM-7621 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.14.0 >Reporter: Vishwas >Priority: Major > Fix For: 2.16.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > > I have below schema of the row: > private static final Schema innerRowWithArraySchema = > Schema.builder() > .addStringField("string_field") > .addArrayField("array_field", FieldType.INT64) > .build(); > private static final Schema nullableNestedRowWithArraySchema = > Schema.builder() > > .addNullableField("field1",FieldType.row(innerRowWithArraySchema)) > .addNullableField("field2", > FieldType.array(FieldType.row(innerRowWithArraySchema))) > .build(); > > *// Create a row with null values* > Row nullRow = Row.nullRow(nullableNestedRowWithArraySchema); > > Now when we try to select nested row field's NPE is thrown: > .apply(SqlTransform.query("select PCOLLECTION.field1.string_field as > row_string_field, PCOLLECTION.field2[2].string_field as array_string_field > from PCOLLECTION")); > > Below is the exception thrown: > Caused by: java.lang.RuntimeException: CalcFn failed to evaluate: { > final org.apache.beam.sdk.values.Row current = > (org.apache.beam.sdk.values.Row) c.element(); > *final java.util.List inp1_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getRow(1));* > *final java.util.List inp2_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getArray(2));* > > c.output(org.apache.beam.sdk.values.Row.withSchema(outputSchema).addValue(inp1_ > == null ? 
(String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(inp1_, > 0, > "string_field")).addValue(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2) == null ? (String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2), 0, "string_field")).build()); > } > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$CalcFn.processElement(BeamCalcRel.java:253) > Caused by: java.lang.NullPointerException > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$WrappedList.of(BeamCalcRel.java:459) > at SC.eval0(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > In BeamCalcRel.class *field* method, null values are not handled for > composite types. > public Expression field(BlockBuilder list, int index, Type storageType) { > > } else if (fromType.getTypeName().isCompositeType() > || (fromType.getTypeName().isCollectionType() > && > fromType.getCollectionElementType().getTypeName().isCompositeType())) { > field = Expressions.call(WrappedList.class, "of", field); // > List.of() is passed with null values > } > ... > } > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7621) Selecting null row fields causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275497 ] ASF GitHub Bot logged work on BEAM-7621: Author: ASF GitHub Bot Created on: 11/Jul/19 19:59 Start Date: 11/Jul/19 19:59 Worklog Time Spent: 10m Work Description: apilloud commented on issue #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql URL: https://github.com/apache/beam/pull/8930#issuecomment-510632045 Thanks for fixing this bug! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275497) Time Spent: 3.5h (was: 3h 20m) > Selecting null row field's causes Null pointer Exception with BeamSql > - > > Key: BEAM-7621 > URL: https://issues.apache.org/jira/browse/BEAM-7621 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.14.0 >Reporter: Vishwas >Priority: Major > Fix For: 2.16.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > > I have below schema of the row: > private static final Schema innerRowWithArraySchema = > Schema.builder() > .addStringField("string_field") > .addArrayField("array_field", FieldType.INT64) > .build(); > private static final Schema nullableNestedRowWithArraySchema = > Schema.builder() > > .addNullableField("field1",FieldType.row(innerRowWithArraySchema)) > .addNullableField("field2", > FieldType.array(FieldType.row(innerRowWithArraySchema))) > .build(); > > *// Create a row with null values* > Row nullRow = Row.nullRow(nullableNestedRowWithArraySchema); > > Now when we try to select nested row field's NPE is thrown: > .apply(SqlTransform.query("select PCOLLECTION.field1.string_field as > row_string_field, PCOLLECTION.field2[2].string_field as array_string_field > from PCOLLECTION")); > > 
Below is the exception thrown: > Caused by: java.lang.RuntimeException: CalcFn failed to evaluate: { > final org.apache.beam.sdk.values.Row current = > (org.apache.beam.sdk.values.Row) c.element(); > *final java.util.List inp1_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getRow(1));* > *final java.util.List inp2_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getArray(2));* > > c.output(org.apache.beam.sdk.values.Row.withSchema(outputSchema).addValue(inp1_ > == null ? (String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(inp1_, > 0, > "string_field")).addValue(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2) == null ? (String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2), 0, "string_field")).build()); > } > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$CalcFn.processElement(BeamCalcRel.java:253) > Caused by: java.lang.NullPointerException > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$WrappedList.of(BeamCalcRel.java:459) > at SC.eval0(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > In BeamCalcRel.class *field* method, null values are not handled for > composite types. > public Expression field(BlockBuilder list, int index, Type storageType) { > > } else if (fromType.getTypeName().isCompositeType() > || (fromType.getTypeName().isCollectionType() > && > fromType.getCollectionElementType().getTypeName().isCompositeType())) { > field = Expressions.call(WrappedList.class, "of", field); // > List.of() is passed with null values > } > ... 
> } > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7621) Selecting null row fields causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275494 ]
ASF GitHub Bot logged work on BEAM-7621:
Author: ASF GitHub Bot
Created on: 11/Jul/19 19:54
Start Date: 11/Jul/19 19:54
Worklog Time Spent: 10m
Work Description: apilloud commented on pull request #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql
URL: https://github.com/apache/beam/pull/8930
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Issue Time Tracking
---
Worklog Id: (was: 275494)
Time Spent: 3h 20m (was: 3h 10m)

> Selecting null row field's causes Null pointer Exception with BeamSql
> ---------------------------------------------------------------------
>
> Key: BEAM-7621
> URL: https://issues.apache.org/jira/browse/BEAM-7621
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.14.0
> Reporter: Vishwas
> Priority: Major
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> I have the below row schemas:
>
>     private static final Schema innerRowWithArraySchema =
>         Schema.builder()
>             .addStringField("string_field")
>             .addArrayField("array_field", FieldType.INT64)
>             .build();
>
>     private static final Schema nullableNestedRowWithArraySchema =
>         Schema.builder()
>             .addNullableField("field1", FieldType.row(innerRowWithArraySchema))
>             .addNullableField("field2", FieldType.array(FieldType.row(innerRowWithArraySchema)))
>             .build();
>
>     *// Create a row with null values*
>     Row nullRow = Row.nullRow(nullableNestedRowWithArraySchema);
>
> Now when we try to select nested row fields, an NPE is thrown:
>
>     .apply(SqlTransform.query("select PCOLLECTION.field1.string_field as row_string_field, PCOLLECTION.field2[2].string_field as array_string_field from PCOLLECTION"));
>
> Below is the exception thrown:
>
> Caused by: java.lang.RuntimeException: CalcFn failed to evaluate: {
>   final org.apache.beam.sdk.values.Row current = (org.apache.beam.sdk.values.Row) c.element();
>   *final java.util.List inp1_ = org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getRow(1));*
>   *final java.util.List inp2_ = org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getArray(2));*
>   c.output(org.apache.beam.sdk.values.Row.withSchema(outputSchema).addValue(inp1_ == null ? (String) null : (String) org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(inp1_, 0, "string_field")).addValue(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, 2) == null ? (String) null : (String) org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, 2), 0, "string_field")).build());
> }
>   at org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$CalcFn.processElement(BeamCalcRel.java:253)
> Caused by: java.lang.NullPointerException
>   at org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$WrappedList.of(BeamCalcRel.java:459)
>   at SC.eval0(Unknown Source)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>
> In BeamCalcRel's *field* method, null values are not handled for composite types:
>
>     public Expression field(BlockBuilder list, int index, Type storageType) {
>       ...
>       } else if (fromType.getTypeName().isCompositeType()
>           || (fromType.getTypeName().isCollectionType()
>               && fromType.getCollectionElementType().getTypeName().isCompositeType())) {
>         field = Expressions.call(WrappedList.class, "of", field); // WrappedList.of() is called with null values
>       }
>       ...
>     }

-- This message was sent by Atlassian JIRA (v7.6.14#76016)
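The root cause above is that the generated code passes `current.getRow(1)` to `WrappedList.of(...)` before any null check, so a null nested row reaches `WrappedList.of` and triggers the NPE. A minimal Python sketch of the null-propagating access that composite fields need (the helper name and the dict-based row model are hypothetical, for illustration only; the real fix lives in the Java codegen of `BeamCalcRel`):

```python
def null_safe_get(row, *path):
    """Walk a nested record, returning None as soon as any level is null.

    Rows are modeled here as plain dicts; a null nested row is simply None.
    """
    current = row
    for field in path:
        if current is None:
            return None  # propagate SQL NULL instead of raising
        current = current[field]
    return current

# A row with all-null fields, like Row.nullRow(nullableNestedRowWithArraySchema):
null_row = {"field1": None, "field2": None}

# Naive access (null_row["field1"]["string_field"]) raises TypeError,
# the Python analogue of the reported NullPointerException; the guarded
# access instead yields NULL:
assert null_safe_get(null_row, "field1", "string_field") is None
```

The same guard applied before the `WrappedList.of` call — emit null when the wrapped value is null — mirrors the `nullOr` helper suggested later in this review thread.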
[jira] [Work logged] (BEAM-7548) test_approximate_unique_global_by_error is flaky
[ https://issues.apache.org/jira/browse/BEAM-7548?focusedWorklogId=275483&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275483 ]
ASF GitHub Bot logged work on BEAM-7548:
Author: ASF GitHub Bot
Created on: 11/Jul/19 19:32
Start Date: 11/Jul/19 19:32
Worklog Time Spent: 10m
Work Description: Hannah-Jiang commented on issue #8959: [BEAM-7548] Cherry pick - fix flaky tests for ApproximateUnique
URL: https://github.com/apache/beam/pull/8959#issuecomment-510623257
Thank you Anton!
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Issue Time Tracking
---
Worklog Id: (was: 275483)
Time Spent: 9h 10m (was: 9h)

> test_approximate_unique_global_by_error is flaky
> ------------------------------------------------
>
> Key: BEAM-7548
> URL: https://issues.apache.org/jira/browse/BEAM-7548
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core, test-failures
> Reporter: Valentyn Tymofieiev
> Assignee: Hannah Jiang
> Priority: Major
> Time Spent: 9h 10m
> Remaining Estimate: 0h
>
> The error happened on Jenkins in the Python 3.5 suite, which currently uses a Python 3.5.2 interpreter:
> {noformat}
> 11:57:47 ======================================================================
> 11:57:47 ERROR: test_approximate_unique_global_by_error (apache_beam.transforms.stats_test.ApproximateUniqueTest)
> 11:57:47 ----------------------------------------------------------------------
> 11:57:47 Traceback (most recent call last):
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/transforms/stats_test.py", line 236, in test_approximate_unique_global_by_error
> 11:57:47     pipeline.run()
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/testing/test_pipeline.py", line 107, in run
> 11:57:47     else test_runner_api))
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/pipeline.py", line 406, in run
> 11:57:47     self._options).run(False)
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/pipeline.py", line 419, in run
> 11:57:47     return self.runner.run_pipeline(self, self._options)
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/direct/direct_runner.py", line 128, in run_pipeline
> 11:57:47     return runner.run_pipeline(pipeline, options)
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 289, in run_pipeline
> 11:57:47     default_environment=self._default_environment))
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 293, in run_via_runner_api
> 11:57:47     return self.run_stages(*self.create_stages(pipeline_proto))
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 369, in run_stages
> 11:57:47     stage_context.safe_coders)
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 531, in run_stage
> 11:57:47     data_input, data_output)
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1235, in process_bundle
> 11:57:47     result_future = self._controller.control_handler.push(process_bundle)
> 11:57:47   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 851, in push
> 11:57:47     response = self.w
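The assertion behind this flaky test is probabilistic: ApproximateUnique trades accuracy for space, with an estimation error that shrinks roughly as 2/sqrt(sample_size). A sketch of that relationship (the exact constants in Beam's implementation may differ; the function names are illustrative):

```python
import math

def error_bound(sample_size):
    """Approximate relative estimation error for a given sample size."""
    return 2.0 / math.sqrt(sample_size)

def required_sample_size(target_error):
    """Smallest sample size whose ~2/sqrt(n) bound meets target_error."""
    return math.ceil((2.0 / target_error) ** 2)

# Halving the target error quadruples the required sample size:
assert required_sample_size(0.02) == 4 * required_sample_size(0.04)
```

A test that asserts the observed error stays under such a bound will occasionally fail by chance alone, which is why the cherry-picked fix above reworks the flaky assertions.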
[jira] [Work logged] (BEAM-7621) Selecting null row field's causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275479&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275479 ] ASF GitHub Bot logged work on BEAM-7621: Author: ASF GitHub Bot Created on: 11/Jul/19 19:26 Start Date: 11/Jul/19 19:26 Worklog Time Spent: 10m Work Description: apilloud commented on issue #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql URL: https://github.com/apache/beam/pull/8930#issuecomment-510620634 run precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275479) Time Spent: 3h 10m (was: 3h) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7621) Selecting null row field's causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275477 ] ASF GitHub Bot logged work on BEAM-7621: Author: ASF GitHub Bot Created on: 11/Jul/19 19:25 Start Date: 11/Jul/19 19:25 Worklog Time Spent: 10m Work Description: apilloud commented on issue #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql URL: https://github.com/apache/beam/pull/8930#issuecomment-510621183 run java precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275477) Time Spent: 3h (was: 2h 50m) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7621) Selecting null row field's causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275476 ] ASF GitHub Bot logged work on BEAM-7621: Author: ASF GitHub Bot Created on: 11/Jul/19 19:25 Start Date: 11/Jul/19 19:25 Worklog Time Spent: 10m Work Description: apilloud commented on issue #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql URL: https://github.com/apache/beam/pull/8930#issuecomment-510621013 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275476) Time Spent: 2h 50m (was: 2h 40m) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7621) Selecting null row field's causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275474 ]
ASF GitHub Bot logged work on BEAM-7621:
Author: ASF GitHub Bot
Created on: 11/Jul/19 19:23
Start Date: 11/Jul/19 19:23
Worklog Time Spent: 10m
Work Description: apilloud commented on pull request #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql
URL: https://github.com/apache/beam/pull/8930#discussion_r302705234

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java

## @@ -386,24 +386,44 @@ public Expression field(BlockBuilder list, int index, Type storageType) {
     }
     Expression field = Expressions.call(expression, getter, Expressions.constant(index));
     if (fromType.getTypeName().isLogicalType()) {
-      field = Expressions.call(field, "getMillis");
+      Expression millisField = Expressions.call(field, "getMillis");
       String logicalId = fromType.getLogicalType().getIdentifier();
       if (logicalId.equals(TimeType.IDENTIFIER)) {
-        field = Expressions.convert_(field, int.class);
+        field =
+            Expressions.condition(

Review comment: I'm suggesting you add one to this class.

```
private static Expression nullOr(Expression field, Expression ifNotNull) {
  return Expressions.condition(
      Expressions.equal(field, Expressions.constant(null)),
      Expressions.constant(null),
      Expressions.box(ifNotNull));
}
```

Then this becomes

```
nullOr(field, Expressions.convert_(millisField, int.class));
```

and the same pattern can be applied below.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275474) Time Spent: 2h 40m (was: 2.5h) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7424) Retry HTTP 429 errors from GCS w/ exponential backoff when reading data
[ https://issues.apache.org/jira/browse/BEAM-7424?focusedWorklogId=275472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275472 ] ASF GitHub Bot logged work on BEAM-7424: Author: ASF GitHub Bot Created on: 11/Jul/19 19:14 Start Date: 11/Jul/19 19:14 Worklog Time Spent: 10m Work Description: akedin commented on issue #9014: [BEAM-7424] cherry-picking for 2.14.0 URL: https://github.com/apache/beam/pull/9014#issuecomment-510617428 run python precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275472) Time Spent: 5h 40m (was: 5.5h) > Retry HTTP 429 errors from GCS w/ exponential backoff when reading data > --- > > Key: BEAM-7424 > URL: https://issues.apache.org/jira/browse/BEAM-7424 > Project: Beam > Issue Type: Bug > Components: io-java-gcp, io-python-gcp, sdk-py-core >Reporter: Chamikara Jayalath >Assignee: Heejong Lee >Priority: Blocker > Fix For: 2.14.0 > > Time Spent: 5h 40m > Remaining Estimate: 0h > > This has to be done for both Java and Python SDKs. > Seems like Java SDK already retries 429 errors w/o backoff (please verify): > [https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/RetryHttpRequestInitializer.java#L185] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
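HTTP 429 means the caller is being rate-limited, so each retry should wait longer than the last. A minimal sketch of capped exponential backoff (function names and delay constants are illustrative, not the actual values used by Beam's `RetryHttpRequestInitializer`):

```python
import time

def backoff_delays(initial=1.0, multiplier=2.0, max_delay=60.0, max_retries=6):
    """Yield the sleep interval to use before each retry attempt."""
    delay = initial
    for _ in range(max_retries):
        yield min(delay, max_delay)
        delay *= multiplier

def call_with_retry(request, retryable=(429,), **backoff_kwargs):
    """Call `request` (which returns an HTTP status), retrying on retryable statuses."""
    status = request()
    for delay in backoff_delays(**backoff_kwargs):
        if status not in retryable:
            return status
        time.sleep(delay)  # production code would add random jitter here
        status = request()
    return status

# Delays double up to the cap: 1, 2, 4, 8, 16, 32 seconds.
assert list(backoff_delays()) == [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

Retrying without backoff — the behavior the issue suspects in the Java SDK — would hammer GCS while it is already shedding load; the growing delay gives the service time to recover.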
[jira] [Work logged] (BEAM-7386) Add Utility BiTemporalStreamJoin
[ https://issues.apache.org/jira/browse/BEAM-7386?focusedWorklogId=275469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275469 ] ASF GitHub Bot logged work on BEAM-7386: Author: ASF GitHub Bot Created on: 11/Jul/19 19:08 Start Date: 11/Jul/19 19:08 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #9032: [BEAM-7386] Bi-Temporal Join URL: https://github.com/apache/beam/pull/9032#issuecomment-510615471 Thank Kenn! Will take a look soon. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275469) Time Spent: 1h 20m (was: 1h 10m) > Add Utility BiTemporalStreamJoin > > > Key: BEAM-7386 > URL: https://issues.apache.org/jira/browse/BEAM-7386 > Project: Beam > Issue Type: Improvement > Components: sdk-ideas >Affects Versions: 2.12.0 >Reporter: Reza ardeshir rokni >Assignee: Reza ardeshir rokni >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Add utility class that enables a temporal join between two streams where > Stream A is matched to Stream B where > A.timestamp = (max(b.timestamp) where b.timestamp <= a.timestamp) > This will use the following overall flow: > KV(key, Timestamped) > | Window > | GBK > | Stateful DoFn > > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
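The matching rule above — pair each element of Stream A with the latest element of Stream B at or before A's timestamp — is a floor lookup over B's sorted timestamps. A batch sketch of just that lookup (the proposed utility performs it incrementally, per key, inside the stateful DoFn after the GroupByKey; the function name is illustrative):

```python
import bisect

def bitemporal_match(a_timestamps, b_timestamps):
    """For each a, return (a, max b with b <= a), or (a, None) if no such b exists.

    b_timestamps must be sorted ascending.
    """
    matches = []
    for a in a_timestamps:
        i = bisect.bisect_right(b_timestamps, a)  # first index with b > a
        matches.append((a, b_timestamps[i - 1] if i > 0 else None))
    return matches

# Events at t=5 and t=10 match the B-elements at t=4 and t=9; the event
# at t=1 has no B-element at or before it:
assert bitemporal_match([5, 10, 1], [2, 4, 9]) == [(5, 4), (10, 9), (1, None)]
```

Using `bisect_right` (rather than `bisect_left`) makes the match inclusive, so a B-element with exactly A's timestamp is chosen, matching the `b.timestamp <= a.timestamp` condition.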
[jira] [Work logged] (BEAM-6611) A Python Sink for BigQuery with File Loads in Streaming
[ https://issues.apache.org/jira/browse/BEAM-6611?focusedWorklogId=275458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275458 ] ASF GitHub Bot logged work on BEAM-6611: Author: ASF GitHub Bot Created on: 11/Jul/19 18:34 Start Date: 11/Jul/19 18:34 Worklog Time Spent: 10m Work Description: ttanay commented on issue #8871: [BEAM-6611] BigQuery file loads in Streaming for Python SDK URL: https://github.com/apache/beam/pull/8871#issuecomment-510603552 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275458) Time Spent: 2h 10m (was: 2h) > A Python Sink for BigQuery with File Loads in Streaming > --- > > Key: BEAM-6611 > URL: https://issues.apache.org/jira/browse/BEAM-6611 > Project: Beam > Issue Type: Improvement > Components: sdk-py-core >Reporter: Pablo Estrada >Assignee: Tanay Tummalapalli >Priority: Major > Labels: gsoc, gsoc2019, mentor > Time Spent: 2h 10m > Remaining Estimate: 0h > > The Java SDK supports a bunch of methods for writing data into BigQuery, > while the Python SDK supports the following: > - Streaming inserts for streaming pipelines [As seen in [bigquery.py and > BigQueryWriteFn|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L649-L813]] > - File loads for batch pipelines [As implemented in [PR > 7655|https://github.com/apache/beam/pull/7655]] > Quick and dirty early design doc: https://s.apache.org/beam-bqfl-py-streaming > The Java SDK also supports File Loads for Streaming pipelines [see BatchLoads > application|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1709-L1776].
> File loads have the advantage of being much cheaper than streaming inserts > (although they also are slower for the records to show up in the table). -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7386) Add Utility BiTemporalStreamJoin
[ https://issues.apache.org/jira/browse/BEAM-7386?focusedWorklogId=275455&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275455 ] ASF GitHub Bot logged work on BEAM-7386: Author: ASF GitHub Bot Created on: 11/Jul/19 18:26 Start Date: 11/Jul/19 18:26 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #9032: [BEAM-7386] Bi-Temporal Join URL: https://github.com/apache/beam/pull/9032#issuecomment-510600426 I believe you are looking for @amaliujia This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275455) Time Spent: 1h 10m (was: 1h) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7617) LTS Backport: urlopen calls could get stuck without a timeout
[ https://issues.apache.org/jira/browse/BEAM-7617?focusedWorklogId=275451&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275451 ] ASF GitHub Bot logged work on BEAM-7617: Author: ASF GitHub Bot Created on: 11/Jul/19 18:19 Start Date: 11/Jul/19 18:19 Worklog Time Spent: 10m Work Description: kennknowles commented on issue #8927: [BEAM-7617] Cherry Pick #8925 to 2.7.1 release branch URL: https://github.com/apache/beam/pull/8927#issuecomment-510598027 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275451) Time Spent: 50m (was: 40m) > LTS Backport: urlopen calls could get stuck without a timeout > - > > Key: BEAM-7617 > URL: https://issues.apache.org/jira/browse/BEAM-7617 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Ahmet Altay >Priority: Blocker > Fix For: 2.7.1 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7484) Throughput collection in BigQuery performance tests
[ https://issues.apache.org/jira/browse/BEAM-7484?focusedWorklogId=275450&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275450 ] ASF GitHub Bot logged work on BEAM-7484: Author: ASF GitHub Bot Created on: 11/Jul/19 18:17 Start Date: 11/Jul/19 18:17 Worklog Time Spent: 10m Work Description: udim commented on issue #8766: [BEAM-7484] Metrics collection in BigQuery perf tests URL: https://github.com/apache/beam/pull/8766#issuecomment-510597113 Yes, sorry for the delay This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275450) Time Spent: 2h (was: 1h 50m) > Throughput collection in BigQuery performance tests > --- > > Key: BEAM-7484 > URL: https://issues.apache.org/jira/browse/BEAM-7484 > Project: Beam > Issue Type: New Feature > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > The goal is to collect bytes/time and messages/time metrics in BQ read and > write tests in Python SDK. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Closed] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken (ghprb plugin)
[ https://issues.apache.org/jira/browse/BEAM-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri closed BEAM-7723. --- Resolution: Resolved Fix Version/s: Not applicable > github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) > - > > Key: BEAM-7723 > URL: https://issues.apache.org/jira/browse/BEAM-7723 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Fix For: Not applicable > > > Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7548) test_approximate_unique_global_by_error is flaky
[ https://issues.apache.org/jira/browse/BEAM-7548?focusedWorklogId=275445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275445 ] ASF GitHub Bot logged work on BEAM-7548: Author: ASF GitHub Bot Created on: 11/Jul/19 18:09 Start Date: 11/Jul/19 18:09 Worklog Time Spent: 10m Work Description: akedin commented on issue #8959: [BEAM-7548] Cherry pick - fix flaky tests for ApproximateUnique URL: https://github.com/apache/beam/pull/8959#issuecomment-510593210 Python precommit passed: - https://builds.apache.org/job/beam_PreCommit_Python_Phrase/640/ - https://scans.gradle.com/s/4isyitp7hrkv4 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275445) Time Spent: 9h (was: 8h 50m) > test_approximate_unique_global_by_error is flaky > > > Key: BEAM-7548 > URL: https://issues.apache.org/jira/browse/BEAM-7548 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures >Reporter: Valentyn Tymofieiev >Assignee: Hannah Jiang >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > The error happened on Jenkins in Python 3.5 suite, which currently uses > Python 3.5.2 interpreter: > {noformat} > 11:57:47 > == > 11:57:47 ERROR: test_approximate_unique_global_by_error > (apache_beam.transforms.stats_test.ApproximateUniqueTest) > 11:57:47 > -- > 11:57:47 Traceback (most recent call last): > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/transforms/stats_test.py", > line 236, in test_approximate_unique_global_by_error > 11:57:47 pipeline.run() > 11:57:47 File > 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/testing/test_pipeline.py", > line 107, in run > 11:57:47 else test_runner_api)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/pipeline.py", > line 406, in run > 11:57:47 self._options).run(False) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/pipeline.py", > line 419, in run > 11:57:47 return self.runner.run_pipeline(self, self._options) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/direct/direct_runner.py", > line 128, in run_pipeline > 11:57:47 return runner.run_pipeline(pipeline, options) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 289, in run_pipeline > 11:57:47 default_environment=self._default_environment)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 293, in run_via_runner_api > 11:57:47 return self.run_stages(*self.create_stages(pipeline_proto)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 369, in run_stages > 11:57:47 stage_context.safe_coders) > 11:57:47 File > 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 531, in run_stage > 11:57:47 data_input, data_output) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 1235, in process_bundle > 11:57:47 result_future = > self._controller.control_handler.push(process_bundle) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs
[jira] [Created] (BEAM-7725) migrate off of ghprb
Udi Meiri created BEAM-7725: --- Summary: migrate off of ghprb Key: BEAM-7725 URL: https://issues.apache.org/jira/browse/BEAM-7725 Project: Beam Issue Type: Bug Components: testing Reporter: Udi Meiri ghprb-plugin is going away and Beam needs to migrate off of it. Some background in https://issues.apache.org/jira/browse/BEAM-7723 (see the wiki page linked there) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Comment Edited] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken (ghprb plugin)
[ https://issues.apache.org/jira/browse/BEAM-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883231#comment-16883231 ] Udi Meiri edited comment on BEAM-7723 at 7/11/19 6:08 PM: -- Opened https://issues.apache.org/jira/browse/BEAM-7725 for migration. was (Author: udim): Opened https://issues.apache.org/jira/browse/BEAM-7723 for migration. > github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) > - > > Key: BEAM-7723 > URL: https://issues.apache.org/jira/browse/BEAM-7723 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > > Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7548) test_approximate_unique_global_by_error is flaky
[ https://issues.apache.org/jira/browse/BEAM-7548?focusedWorklogId=275444&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275444 ] ASF GitHub Bot logged work on BEAM-7548: Author: ASF GitHub Bot Created on: 11/Jul/19 18:07 Start Date: 11/Jul/19 18:07 Worklog Time Spent: 10m Work Description: akedin commented on pull request #8959: [BEAM-7548] Cherry pick - fix flaky tests for ApproximateUnique URL: https://github.com/apache/beam/pull/8959 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275444) Time Spent: 8h 50m (was: 8h 40m) > test_approximate_unique_global_by_error is flaky > > > Key: BEAM-7548 > URL: https://issues.apache.org/jira/browse/BEAM-7548 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures >Reporter: Valentyn Tymofieiev >Assignee: Hannah Jiang >Priority: Major > Time Spent: 8h 50m > Remaining Estimate: 0h > > The error happened on Jenkins in Python 3.5 suite, which currently uses > Python 3.5.2 interpreter: > {noformat} > 11:57:47 > == > 11:57:47 ERROR: test_approximate_unique_global_by_error > (apache_beam.transforms.stats_test.ApproximateUniqueTest) > 11:57:47 > -- > 11:57:47 Traceback (most recent call last): > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/transforms/stats_test.py", > line 236, in test_approximate_unique_global_by_error > 11:57:47 pipeline.run() > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/testing/test_pipeline.py", > line 107, in run > 11:57:47 else test_runner_api)) > 11:57:47 File > 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/pipeline.py", > line 406, in run > 11:57:47 self._options).run(False) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/pipeline.py", > line 419, in run > 11:57:47 return self.runner.run_pipeline(self, self._options) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/direct/direct_runner.py", > line 128, in run_pipeline > 11:57:47 return runner.run_pipeline(pipeline, options) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 289, in run_pipeline > 11:57:47 default_environment=self._default_environment)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 293, in run_via_runner_api > 11:57:47 return self.run_stages(*self.create_stages(pipeline_proto)) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 369, in run_stages > 11:57:47 stage_context.safe_coders) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 531, in run_stage > 11:57:47 data_input, data_output) > 11:57:47 File > 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 1235, in process_bundle > 11:57:47 result_future = > self._controller.control_handler.push(process_bundle) > 11:57:47 File > "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py", > line 851, in push > 11:57:47 response = self.worker.do_instruction(request) > 11
[jira] [Commented] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken (ghprb plugin)
[ https://issues.apache.org/jira/browse/BEAM-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883231#comment-16883231 ] Udi Meiri commented on BEAM-7723: - Opened https://issues.apache.org/jira/browse/BEAM-7723 for migration. > github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) > - > > Key: BEAM-7723 > URL: https://issues.apache.org/jira/browse/BEAM-7723 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > > Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (BEAM-7725) Jenkins config: migrate off of ghprb-plugin
[ https://issues.apache.org/jira/browse/BEAM-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri updated BEAM-7725: Summary: Jenkins config: migrate off of ghprb-plugin (was: migrate off of ghprb) > Jenkins config: migrate off of ghprb-plugin > --- > > Key: BEAM-7725 > URL: https://issues.apache.org/jira/browse/BEAM-7725 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Priority: Blocker > > ghprb-plugin is going away and Beam needs to migrate off of it. > Some background in https://issues.apache.org/jira/browse/BEAM-7723 (see the > wiki page linked there) -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Assigned] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken (ghprb plugin)
[ https://issues.apache.org/jira/browse/BEAM-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri reassigned BEAM-7723: --- Assignee: Udi Meiri > github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) > - > > Key: BEAM-7723 > URL: https://issues.apache.org/jira/browse/BEAM-7723 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > > Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Comment Edited] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken (ghprb plugin)
[ https://issues.apache.org/jira/browse/BEAM-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883218#comment-16883218 ] Udi Meiri edited comment on BEAM-7723 at 7/11/19 5:49 PM: -- Chatted on Slack, they did something to fix it. https://the-asf.slack.com/archives/CBX4TSBQ8/p1562867180305200 Also they noted: {code} cool if you have any more issues, change the AdminList from 'asfbot' to 'asf-ci' {code} was (Author: udim): Chatted on Slack, they did something to fix it. https://the-asf.slack.com/archives/CBX4TSBQ8/p1562867180305200 > github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) > - > > Key: BEAM-7723 > URL: https://issues.apache.org/jira/browse/BEAM-7723 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Priority: Major > > Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken (ghprb plugin)
[ https://issues.apache.org/jira/browse/BEAM-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883218#comment-16883218 ] Udi Meiri commented on BEAM-7723: - Chatted on Slack, they did something to fix it. https://the-asf.slack.com/archives/CBX4TSBQ8/p1562867180305200 > github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) > - > > Key: BEAM-7723 > URL: https://issues.apache.org/jira/browse/BEAM-7723 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Priority: Major > > Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7689) Temporary directory for WriteOperation may not be unique in FileBaseSink
[ https://issues.apache.org/jira/browse/BEAM-7689?focusedWorklogId=275434&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275434 ] ASF GitHub Bot logged work on BEAM-7689: Author: ASF GitHub Bot Created on: 11/Jul/19 17:40 Start Date: 11/Jul/19 17:40 Worklog Time Spent: 10m Work Description: akedin commented on issue #9039: [BEAM-7689] cherry-picking for 2.14.0 URL: https://github.com/apache/beam/pull/9039#issuecomment-510583231 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275434) Time Spent: 2h 20m (was: 2h 10m) > Temporary directory for WriteOperation may not be unique in FileBaseSink > > > Key: BEAM-7689 > URL: https://issues.apache.org/jira/browse/BEAM-7689 > Project: Beam > Issue Type: Bug > Components: io-java-files >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Blocker > Fix For: 2.7.1, 2.14.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > Temporary directory for WriteOperation in FileBasedSink is generated from a > second-granularity timestamp (-MM-dd_HH-mm-ss) and unique increasing > index. Such granularity is not enough to make temporary directories unique > between different jobs. When two jobs share the same temporary directory, > output file may not be produced in one job since the required temporary file > can be deleted from another job. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
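The collision described in BEAM-7689 above can be reproduced in a standalone sketch. This is an illustrative comparison, not FileBasedSink's actual naming code; the class and method names here are hypothetical:

```java
// Why a second-granularity timestamp plus a per-process counter can collide
// across jobs, and a UUID-based alternative. Two jobs started within the
// same second both begin their counter at 0, so they produce the same name.
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.UUID;

public class TempDirNames {
    // Collision-prone: formats the start time to the second and appends an
    // index that restarts at 0 in every job.
    static String timestampName(LocalDateTime now, int index) {
        return now.format(DateTimeFormatter.ofPattern("yyyy-MM-dd_HH-mm-ss"))
            + "_" + index;
    }

    // Collision-resistant: a random UUID is unique across jobs with
    // overwhelming probability.
    static String uuidName() {
        return ".temp-beam-" + UUID.randomUUID();
    }

    // Demonstrates the bug: two "jobs" sharing a start second and index
    // get identical temporary directory names.
    static boolean sameSecondCollides() {
        LocalDateTime t = LocalDateTime.of(2019, 7, 11, 18, 0, 0);
        return timestampName(t, 0).equals(timestampName(t, 0));
    }

    public static void main(String[] args) {
        System.out.println(sameSecondCollides());            // true
        System.out.println(uuidName().equals(uuidName()));   // false
    }
}
```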
[jira] [Work logged] (BEAM-7616) urlopen calls could get stuck without a timeout
[ https://issues.apache.org/jira/browse/BEAM-7616?focusedWorklogId=275430&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275430 ] ASF GitHub Bot logged work on BEAM-7616: Author: ASF GitHub Bot Created on: 11/Jul/19 17:28 Start Date: 11/Jul/19 17:28 Worklog Time Spent: 10m Work Description: akedin commented on pull request #8926: [BEAM-7616] Cherry Pick #8925 to release branch URL: https://github.com/apache/beam/pull/8926 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275430) Time Spent: 2h (was: 1h 50m) > urlopen calls could get stuck without a timeout > --- > > Key: BEAM-7616 > URL: https://issues.apache.org/jira/browse/BEAM-7616 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Ahmet Altay >Priority: Blocker > Fix For: 2.14.0 > > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7616) urlopen calls could get stuck without a timeout
[ https://issues.apache.org/jira/browse/BEAM-7616?focusedWorklogId=275429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275429 ] ASF GitHub Bot logged work on BEAM-7616: Author: ASF GitHub Bot Created on: 11/Jul/19 17:28 Start Date: 11/Jul/19 17:28 Worklog Time Spent: 10m Work Description: akedin commented on issue #8926: [BEAM-7616] Cherry Pick #8925 to release branch URL: https://github.com/apache/beam/pull/8926#issuecomment-510579110 Python precommit succeeded: - https://builds.apache.org/job/beam_PreCommit_Python_Commit/7391/ - https://scans.gradle.com/s/7pdbjhd6y4yd2 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275429) Time Spent: 1h 50m (was: 1h 40m) > urlopen calls could get stuck without a timeout > --- > > Key: BEAM-7616 > URL: https://issues.apache.org/jira/browse/BEAM-7616 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Ahmet Altay >Priority: Blocker > Fix For: 2.14.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (BEAM-7724) Codegen should cast(null) to a type to match exact function signature
Rui Wang created BEAM-7724: -- Summary: Codegen should cast(null) to a type to match exact function signature Key: BEAM-7724 URL: https://issues.apache.org/jira/browse/BEAM-7724 Project: Beam Issue Type: Improvement Components: dsl-sql Reporter: Rui Wang If there are two function signatures for the same function name and the input parameter is null, Janino will throw an exception due to ambiguity: A(String) A(Integer) Janino does not know how to match A(null). -- This message was sent by Atlassian JIRA (v7.6.14#76016)
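The ambiguity described in BEAM-7724 can be demonstrated in plain Java. OverloadDemo and its methods are hypothetical; Janino rejects the uncast call at codegen time for essentially the same reason javac rejects it at compile time:

```java
// Why A(null) is ambiguous between A(String) and A(Integer), and how an
// explicit cast — the cast(null) the issue asks codegen to emit — picks
// one overload.
public class OverloadDemo {
    static String a(String s)  { return "A(String)";  }
    static String a(Integer i) { return "A(Integer)"; }

    // a(null) does not compile: both overloads are applicable and neither
    // parameter type (String, Integer) is more specific than the other.
    static String callWithNull() {
        return a((String) null); // the cast resolves the call to a(String)
    }

    public static void main(String[] args) {
        System.out.println(callWithNull()); // prints "A(String)"
    }
}
```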
[jira] [Commented] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken (ghprb plugin)
[ https://issues.apache.org/jira/browse/BEAM-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883183#comment-16883183 ] Udi Meiri commented on BEAM-7723: - Looks like we rely on several environment variables from GHPRB: ghprbPullId ghprbPullAuthorLogin ghprbSourceBranch Wiki documentation that may be useful: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89065396 > github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) > - > > Key: BEAM-7723 > URL: https://issues.apache.org/jira/browse/BEAM-7723 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Priority: Major > > Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken (ghprb plugin)
[ https://issues.apache.org/jira/browse/BEAM-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri updated BEAM-7723: Summary: github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) (was: github PR comment phrase triggering of jenkins jobs broken) > github PR comment phrase triggering of jenkins jobs broken (ghprb plugin) > - > > Key: BEAM-7723 > URL: https://issues.apache.org/jira/browse/BEAM-7723 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Udi Meiri >Priority: Major > > Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (BEAM-7685) Join Reordering
[ https://issues.apache.org/jira/browse/BEAM-7685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-7685: --- Issue Type: New Feature (was: Bug) > Join Reordering > --- > > Key: BEAM-7685 > URL: https://issues.apache.org/jira/browse/BEAM-7685 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Alireza Samadianzakaria >Assignee: Alireza Samadianzakaria >Priority: Major > > The query parser does not reorder joins based on their costs. The fix is simple: > we need to include the rules related to reordering joins, such as > JoinCommuteRule. However, reordering joins may produce plans that have Cross > Join or other unsupported join types. We should either rewrite those > rules to account for that, or return infinite cost for those join types so > that they will not be selected. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (BEAM-7723) github PR comment phrase triggering of jenkins jobs broken
Udi Meiri created BEAM-7723: --- Summary: github PR comment phrase triggering of jenkins jobs broken Key: BEAM-7723 URL: https://issues.apache.org/jira/browse/BEAM-7723 Project: Beam Issue Type: Bug Components: testing Reporter: Udi Meiri Apache Infra team has removed ghprb-plugin from Jenkins. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7666) Pipe memory thrashing signal to Dataflow
[ https://issues.apache.org/jira/browse/BEAM-7666?focusedWorklogId=275406&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275406 ] ASF GitHub Bot logged work on BEAM-7666: Author: ASF GitHub Bot Created on: 11/Jul/19 16:48 Start Date: 11/Jul/19 16:48 Worklog Time Spent: 10m Work Description: dustin12 commented on issue #8971: [BEAM-7666] Throttle piping URL: https://github.com/apache/beam/pull/8971#issuecomment-510565665 @pabloem ping now that everyone is back from the long weekend :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275406) Time Spent: 10m Remaining Estimate: 0h > Pipe memory thrashing signal to Dataflow > > > Key: BEAM-7666 > URL: https://issues.apache.org/jira/browse/BEAM-7666 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Dustin Rhodes >Assignee: Dustin Rhodes >Priority: Critical > Time Spent: 10m > Remaining Estimate: 0h > > For autoscaling we would like to know if the user worker is spending too much > time garbage collecting. Pipe this signal through counters to DF. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
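A raw version of the GC-time signal mentioned in BEAM-7666 can be read from the JVM's management beans. This sketch only shows the measurement side; the counter-piping to Dataflow that the issue actually covers is omitted:

```java
// Total time the JVM has spent in garbage collection, read via JMX — the
// kind of raw signal that could be reported through a counter so the
// autoscaler can detect memory thrashing.
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeSignal {
    // Milliseconds all collectors have spent in GC since JVM start.
    // getCollectionTime() returns -1 when a collector does not report
    // elapsed time, so those entries are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc :
                ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalGcMillis() >= 0); // true
    }
}
```

Sampling this value periodically and dividing the delta by wall-clock time yields a "fraction of time in GC" ratio.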
[jira] [Assigned] (BEAM-7198) Rename ToStringCoder into ToBytesCoder
[ https://issues.apache.org/jira/browse/BEAM-7198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles reassigned BEAM-7198: - Assignee: Francesco Perera > Rename ToStringCoder into ToBytesCoder > -- > > Key: BEAM-7198 > URL: https://issues.apache.org/jira/browse/BEAM-7198 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Francesco Perera >Priority: Minor > > The name of the ToStringCoder class [1] is confusing, since the output of > encode() on Python 3 will be bytes. On Python 2 the output is also bytes, > since bytes and string are synonyms on Py2. > ToBytesCoder would be a better name for this class. > Note that this class is not listed in the coders that constitute Public APIs [2], > so we can treat this as an internal change. As a courtesy to users who happened > to reference a non-public coder in their pipelines we can keep the old class > name as an alias, e.g. ToStringCoder = ToBytesCoder to avoid friction, but > clean up the Beam codebase to use the new name. > [1] > https://github.com/apache/beam/blob/ef4b2ef7e5fa2fb87e1491df82d2797947f51be9/sdks/python/apache_beam/coders/coders.py#L344 > [2] > https://github.com/apache/beam/blob/ef4b2ef7e5fa2fb87e1491df82d2797947f51be9/sdks/python/apache_beam/coders/coders.py#L20 > cc: [~yoshiki.obata] [~chamikara] -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7621) Selecting null row field's causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275391 ] ASF GitHub Bot logged work on BEAM-7621: Author: ASF GitHub Bot Created on: 11/Jul/19 16:11 Start Date: 11/Jul/19 16:11 Worklog Time Spent: 10m Work Description: bmv126 commented on pull request #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql URL: https://github.com/apache/beam/pull/8930#discussion_r302628537 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java ## @@ -386,24 +386,44 @@ public Expression field(BlockBuilder list, int index, Type storageType) { } Expression field = Expressions.call(expression, getter, Expressions.constant(index)); if (fromType.getTypeName().isLogicalType()) { -field = Expressions.call(field, "getMillis"); +Expression millisField = Expressions.call(field, "getMillis"); String logicalId = fromType.getLogicalType().getIdentifier(); if (logicalId.equals(TimeType.IDENTIFIER)) { - field = Expressions.convert_(field, int.class); + field = + Expressions.condition( Review comment: Are you suggesting the helper function can be added in calcite git ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275391) Time Spent: 2.5h (was: 2h 20m) > Selecting null row field's causes Null pointer Exception with BeamSql > - > > Key: BEAM-7621 > URL: https://issues.apache.org/jira/browse/BEAM-7621 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.14.0 >Reporter: Vishwas >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > I have below schema of the row: > private static final Schema innerRowWithArraySchema = > Schema.builder() > .addStringField("string_field") > .addArrayField("array_field", FieldType.INT64) > .build(); > private static final Schema nullableNestedRowWithArraySchema = > Schema.builder() > > .addNullableField("field1",FieldType.row(innerRowWithArraySchema)) > .addNullableField("field2", > FieldType.array(FieldType.row(innerRowWithArraySchema))) > .build(); > > *// Create a row with null values* > Row nullRow = Row.nullRow(nullableNestedRowWithArraySchema); > > Now when we try to select nested row field's NPE is thrown: > .apply(SqlTransform.query("select PCOLLECTION.field1.string_field as > row_string_field, PCOLLECTION.field2[2].string_field as array_string_field > from PCOLLECTION")); > > Below is the exception thrown: > Caused by: java.lang.RuntimeException: CalcFn failed to evaluate: { > final org.apache.beam.sdk.values.Row current = > (org.apache.beam.sdk.values.Row) c.element(); > *final java.util.List inp1_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getRow(1));* > *final java.util.List inp2_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getArray(2));* > > c.output(org.apache.beam.sdk.values.Row.withSchema(outputSchema).addValue(inp1_ > == null ? 
(String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(inp1_, > 0, > "string_field")).addValue(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2) == null ? (String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2), 0, "string_field")).build()); > } > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$CalcFn.processElement(BeamCalcRel.java:253) > Caused by: java.lang.NullPointerException > at > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel$WrappedList.of(BeamCalcRel.java:459) > at SC.eval0(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > In BeamCalcRel.class *field* method, null values are not handled for > composite types. > public Expression field(BlockBuilder list, int index, Type storageType
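The null guard discussed in the BEAM-7621 review above boils down to the following runtime behavior, written here as plain Java rather than generated through Calcite's Expressions API. DateTimeLike is a hypothetical stand-in for the row field's type, not a Beam class:

```java
// What the generated expression evaluates to once the guard is in place:
// return null when the row field is null, otherwise extract the millis.
// Before the fix, the codegen emitted the bare field.getMillis() call,
// which throws NullPointerException on a null field.
public class NullGuardedMillis {
    // Hypothetical stand-in for the datetime value stored in a row field.
    static class DateTimeLike {
        final long millis;
        DateTimeLike(long millis) { this.millis = millis; }
        long getMillis() { return millis; }
    }

    // Runtime equivalent of:
    //   condition(equal(field, constant(null)), constant(null),
    //             call(field, "getMillis"))
    static Long safeGetMillis(DateTimeLike field) {
        return field == null ? null : Long.valueOf(field.getMillis());
    }

    public static void main(String[] args) {
        System.out.println(safeGetMillis(null));                  // null
        System.out.println(safeGetMillis(new DateTimeLike(42L))); // 42
    }
}
```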
[jira] [Work logged] (BEAM-7621) Selecting null row field's causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275386 ] ASF GitHub Bot logged work on BEAM-7621: Author: ASF GitHub Bot Created on: 11/Jul/19 16:07 Start Date: 11/Jul/19 16:07 Worklog Time Spent: 10m Work Description: bmv126 commented on pull request #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql URL: https://github.com/apache/beam/pull/8930#discussion_r302626701 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java ## @@ -386,24 +386,44 @@ public Expression field(BlockBuilder list, int index, Type storageType) { } Expression field = Expressions.call(expression, getter, Expressions.constant(index)); if (fromType.getTypeName().isLogicalType()) { -field = Expressions.call(field, "getMillis"); +Expression millisField = Expressions.call(field, "getMillis"); String logicalId = fromType.getLogicalType().getIdentifier(); if (logicalId.equals(TimeType.IDENTIFIER)) { - field = Expressions.convert_(field, int.class); + field = + Expressions.condition( + Expressions.equal(field, Expressions.constant(null)), + Expressions.constant(null), + Expressions.call( Review comment: Without adding the autoboxing part, the expression was getting generated as below: current.getDateTime(1) == null ? (Long) null : current.getDateTime(1).getMillis() If "current.getDateTime(1) == null" evaluates to true, above expression was causing Null pointer exception as "null" is getting autounboxed to long. So I had to add the autoboxing part. Now I have changed it to Expressions.box as per your suggestion. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275386) Time Spent: 2h 10m (was: 2h) > Selecting null row field's causes Null pointer Exception with BeamSql > - > > Key: BEAM-7621 > URL: https://issues.apache.org/jira/browse/BEAM-7621 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.14.0 >Reporter: Vishwas >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > I have below schema of the row: > private static final Schema innerRowWithArraySchema = > Schema.builder() > .addStringField("string_field") > .addArrayField("array_field", FieldType.INT64) > .build(); > private static final Schema nullableNestedRowWithArraySchema = > Schema.builder() > > .addNullableField("field1",FieldType.row(innerRowWithArraySchema)) > .addNullableField("field2", > FieldType.array(FieldType.row(innerRowWithArraySchema))) > .build(); > > *// Create a row with null values* > Row nullRow = Row.nullRow(nullableNestedRowWithArraySchema); > > Now when we try to select nested row field's NPE is thrown: > .apply(SqlTransform.query("select PCOLLECTION.field1.string_field as > row_string_field, PCOLLECTION.field2[2].string_field as array_string_field > from PCOLLECTION")); > > Below is the exception thrown: > Caused by: java.lang.RuntimeException: CalcFn failed to evaluate: { > final org.apache.beam.sdk.values.Row current = > (org.apache.beam.sdk.values.Row) c.element(); > *final java.util.List inp1_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getRow(1));* > *final java.util.List inp2_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getArray(2));* > > c.output(org.apache.beam.sdk.values.Row.withSchema(outputSchema).addValue(inp1_ > == null ? 
(String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(inp1_, > 0, > "string_field")).addValue(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2) == null ? (String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(inp2_, > 2), 0, "string_field")).build()); > } >
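The auto-unboxing pitfall discussed in the review comment above can be reproduced in plain Java. This is a minimal standalone sketch, not Beam or Calcite code, and the method names are illustrative: a conditional expression with one `Long` branch and one `long` branch is promoted to the primitive `long`, so a `(Long) null` branch is unboxed and throws.

```java
// Minimal sketch of the conditional-expression auto-unboxing NPE
// described above. Plain Java, not Beam code; names are illustrative.
import java.util.Date;

public class UnboxNpeDemo {

    // Boxing the non-null branch (what Expressions.box achieves in the
    // generated code) keeps the whole conditional at type Long, so the
    // null branch is returned as-is instead of being unboxed.
    static Long safeMillis(Date d) {
        return d == null ? null : Long.valueOf(d.getTime());
    }

    // Mirrors the broken generated expression:
    //   d == null ? (Long) null : d.getTime()
    // One branch is Long and the other is long, so the result type of the
    // conditional is the primitive long; (Long) null is therefore unboxed
    // and throws NullPointerException when d is null.
    static long brokenMillis(Date d) {
        return d == null ? (Long) null : d.getTime();
    }

    public static void main(String[] args) {
        System.out.println(safeMillis(null)); // null
        try {
            brokenMillis(null);
            System.out.println("no exception");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```

The same rule applies to any generated ternary whose branches mix a boxed wrapper with a primitive, which is why boxing the non-null branch fixes the generated code.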
[jira] [Work logged] (BEAM-7621) Selecting null row field's causes Null pointer Exception with BeamSql
[ https://issues.apache.org/jira/browse/BEAM-7621?focusedWorklogId=275387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275387 ] ASF GitHub Bot logged work on BEAM-7621: Author: ASF GitHub Bot Created on: 11/Jul/19 16:07 Start Date: 11/Jul/19 16:07 Worklog Time Spent: 10m Work Description: bmv126 commented on pull request #8930: [BEAM-7621] Null pointer exception when accessing null row fields in BeamSql URL: https://github.com/apache/beam/pull/8930#discussion_r302626780 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamCalcRel.java ## @@ -386,24 +386,44 @@ public Expression field(BlockBuilder list, int index, Type storageType) { } Expression field = Expressions.call(expression, getter, Expressions.constant(index)); if (fromType.getTypeName().isLogicalType()) { -field = Expressions.call(field, "getMillis"); +Expression millisField = Expressions.call(field, "getMillis"); String logicalId = fromType.getLogicalType().getIdentifier(); if (logicalId.equals(TimeType.IDENTIFIER)) { - field = Expressions.convert_(field, int.class); + field = + Expressions.condition( + Expressions.equal(field, Expressions.constant(null)), + Expressions.constant(null), + Expressions.call( + Integer.class, "valueOf", Expressions.convert_(millisField, int.class))); } else if (logicalId.equals(DateType.IDENTIFIER)) { field = - Expressions.convert_( - Expressions.modulo(field, Expressions.constant(MILLIS_PER_DAY)), int.class); + Expressions.condition( + Expressions.equal(field, Expressions.constant(null)), + Expressions.constant(null), + Expressions.call( + Integer.class, + "valueOf", + Expressions.convert_( + Expressions.modulo(millisField, Expressions.constant(MILLIS_PER_DAY)), Review comment: I have fixed it now, by making it a division operation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275387) Time Spent: 2h 20m (was: 2h 10m) > Selecting null row field's causes Null pointer Exception with BeamSql > - > > Key: BEAM-7621 > URL: https://issues.apache.org/jira/browse/BEAM-7621 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.14.0 >Reporter: Vishwas >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > I have below schema of the row: > private static final Schema innerRowWithArraySchema = > Schema.builder() > .addStringField("string_field") > .addArrayField("array_field", FieldType.INT64) > .build(); > private static final Schema nullableNestedRowWithArraySchema = > Schema.builder() > > .addNullableField("field1",FieldType.row(innerRowWithArraySchema)) > .addNullableField("field2", > FieldType.array(FieldType.row(innerRowWithArraySchema))) > .build(); > > *// Create a row with null values* > Row nullRow = Row.nullRow(nullableNestedRowWithArraySchema); > > Now when we try to select nested row field's NPE is thrown: > .apply(SqlTransform.query("select PCOLLECTION.field1.string_field as > row_string_field, PCOLLECTION.field2[2].string_field as array_string_field > from PCOLLECTION")); > > Below is the exception thrown: > Caused by: java.lang.RuntimeException: CalcFn failed to evaluate: { > final org.apache.beam.sdk.values.Row current = > (org.apache.beam.sdk.values.Row) c.element(); > *final java.util.List inp1_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getRow(1));* > *final java.util.List inp2_ = > org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel.WrappedList.of(current.getArray(2));* > > c.output(org.apache.beam.sdk.values.Row.withSchema(outputSchema).addValue(inp1_ > == null ? 
(String) null : (String) > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.structAccess(inp1_, > 0, > "string_field")).addValue(org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.runtime.SqlFunctions.arrayI
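The modulo-to-division fix mentioned in the comment above comes down to how an epoch-millis timestamp decomposes: dividing by the number of millis per day yields the DATE component (days since the epoch), while the remainder yields the TIME component (millis since midnight). A standalone sketch with illustrative method names, valid for non-negative timestamps (negative ones would need floor semantics):

```java
// Sketch of the division-vs-modulo distinction behind the fix above.
// Plain Java; MILLIS_PER_DAY has the same value as the constant
// referenced in the BeamCalcRel diff.
public class EpochDayDemo {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000; // 86_400_000

    // DATE component: division discards the time of day.
    static int toEpochDay(long epochMillis) {
        return (int) (epochMillis / MILLIS_PER_DAY);
    }

    // TIME component: modulo keeps only the time of day.
    static int toMillisOfDay(long epochMillis) {
        return (int) (epochMillis % MILLIS_PER_DAY);
    }

    public static void main(String[] args) {
        // 1970-01-03 01:00:00 UTC
        long ts = 2 * MILLIS_PER_DAY + 3_600_000L;
        System.out.println(toEpochDay(ts));    // 2  (the DATE value)
        System.out.println(toMillisOfDay(ts)); // 3600000  (the TIME value)
    }
}
```

Using modulo where division was needed would have produced a time-of-day value where a day count was expected, which is the bug the division change addresses.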
[jira] [Work logged] (BEAM-7616) urlopen calls could get stuck without a timeout
[ https://issues.apache.org/jira/browse/BEAM-7616?focusedWorklogId=275381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275381 ] ASF GitHub Bot logged work on BEAM-7616: Author: ASF GitHub Bot Created on: 11/Jul/19 15:55 Start Date: 11/Jul/19 15:55 Worklog Time Spent: 10m Work Description: akedin commented on issue #8926: [BEAM-7616] Cherry Pick #8925 to release branch URL: https://github.com/apache/beam/pull/8926#issuecomment-510545445 running the precommit manually: https://builds.apache.org/job/beam_PreCommit_Python_Commit/7389/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275381) Time Spent: 1h 40m (was: 1.5h) > urlopen calls could get stuck without a timeout > --- > > Key: BEAM-7616 > URL: https://issues.apache.org/jira/browse/BEAM-7616 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Ahmet Altay >Priority: Blocker > Fix For: 2.14.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7603) Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads
[ https://issues.apache.org/jira/browse/BEAM-7603?focusedWorklogId=275379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275379 ] ASF GitHub Bot logged work on BEAM-7603: Author: ASF GitHub Bot Created on: 11/Jul/19 15:49 Start Date: 11/Jul/19 15:49 Worklog Time Spent: 10m Work Description: pabloem commented on issue #8908: [BEAM-7603] Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads URL: https://github.com/apache/beam/pull/8908#issuecomment-510542915 Thank you Anton! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275379) Time Spent: 4h 50m (was: 4h 40m) > Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads > - > > Key: BEAM-7603 > URL: https://issues.apache.org/jira/browse/BEAM-7603 > Project: Beam > Issue Type: Improvement > Components: io-python-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Blocker > Fix For: 2.14.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7603) Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads
[ https://issues.apache.org/jira/browse/BEAM-7603?focusedWorklogId=275378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275378 ] ASF GitHub Bot logged work on BEAM-7603: Author: ASF GitHub Bot Created on: 11/Jul/19 15:47 Start Date: 11/Jul/19 15:47 Worklog Time Spent: 10m Work Description: akedin commented on pull request #8908: [BEAM-7603] Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads URL: https://github.com/apache/beam/pull/8908 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275378) Time Spent: 4h 40m (was: 4.5h) > Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads > - > > Key: BEAM-7603 > URL: https://issues.apache.org/jira/browse/BEAM-7603 > Project: Beam > Issue Type: Improvement > Components: io-python-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Blocker > Fix For: 2.14.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7603) Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads
[ https://issues.apache.org/jira/browse/BEAM-7603?focusedWorklogId=275376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275376 ] ASF GitHub Bot logged work on BEAM-7603: Author: ASF GitHub Bot Created on: 11/Jul/19 15:45 Start Date: 11/Jul/19 15:45 Worklog Time Spent: 10m Work Description: akedin commented on issue #8908: [BEAM-7603] Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads URL: https://github.com/apache/beam/pull/8908#issuecomment-510541222 Re-ran the python precommit manually, all seems green: https://scans.gradle.com/s/3s7bzzralmq6w , merging This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 275376) Time Spent: 4.5h (was: 4h 20m) > Support for ValueProvider-given GCS Location for WriteToBigQuery w File Loads > - > > Key: BEAM-7603 > URL: https://issues.apache.org/jira/browse/BEAM-7603 > Project: Beam > Issue Type: Improvement > Components: io-python-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Blocker > Fix For: 2.14.0 > > Time Spent: 4.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-7722) Simplify running of Beam Python on Flink
[ https://issues.apache.org/jira/browse/BEAM-7722?focusedWorklogId=275370&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-275370 ] ASF GitHub Bot logged work on BEAM-7722: Author: ASF GitHub Bot Created on: 11/Jul/19 15:22 Start Date: 11/Jul/19 15:22 Worklog Time Spent: 10m Work Description: robertwb commented on pull request #9043: [BEAM-7722] Add a Python FlinkRunner that fetches and uses released artifacts. URL: https://github.com/apache/beam/pull/9043 Also refactored the job_server module to be more re-usable and cleaned up the logic in PortableRunner. 
[jira] [Created] (BEAM-7722) Simplify running of Beam Python on Flink
Robert Bradshaw created BEAM-7722: - Summary: Simplify running of Beam Python on Flink Key: BEAM-7722 URL: https://issues.apache.org/jira/browse/BEAM-7722 Project: Beam Issue Type: Test Components: sdk-py-core Reporter: Robert Bradshaw Currently this requires building and running several processes. We should be able to automate most of this away. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (BEAM-7718) PubsubIO to use gRPC API instead of JSON REST API
[ https://issues.apache.org/jira/browse/BEAM-7718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883084#comment-16883084 ] Steve Niemitz commented on BEAM-7718: - Additionally, the gRPC implementation seems efficient enough that it will very quickly hit the Pub/Sub subscriber quota with any non-trivial number of workers. I'm not really sure why; using the windmill-native version, I can easily read from this subscription at a reasonable rate with plenty of quota left. I wasn't able to do the same with the JSON version, but I think that's just because it isn't efficient enough to get enough throughput on a worker. > PubsubIO to use gRPC API instead of JSON REST API > - > > Key: BEAM-7718 > URL: https://issues.apache.org/jira/browse/BEAM-7718 > Project: Beam > Issue Type: New Feature > Components: io-java-gcp >Reporter: Steve Niemitz >Priority: Major > > {quote}The default implementation uses the HTTP REST API, which seems to be > much less performant than the gRPC implementation. Is there a reason that > the gRPC implementation is essentially unavailable from the public API? > PubsubIO.Read.withClientFactory is package private. I worked around this by > making it public and rebuilding{quote} -- This message was sent by Atlassian JIRA (v7.6.14#76016)