[jira] [Assigned] (BEAM-5317) Finish Python 3 porting for options module

2018-09-13 Thread Manu Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manu Zhang reassigned BEAM-5317:


Assignee: Manu Zhang

> Finish Python 3 porting for options module
> --
>
> Key: BEAM-5317
> URL: https://issues.apache.org/jira/browse/BEAM-5317
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Manu Zhang
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
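For context, here is a hypothetical sketch of the kind of change a Python 3 port of an options-style module typically involves. The class and method names below are invented for illustration and are not taken from PR #6397:

```python
# Illustrative only: a typical Python 2 -> 3 compatible cleanup in an
# options-style module. All names here are invented.

class PipelineOptionsSketch(object):
    def __init__(self, **kwargs):
        self._options = dict(kwargs)

    def display_data(self):
        # Python 2-only code often called self._options.iteritems();
        # .items() works identically on Python 2 and 3.
        return {k: v for k, v in self._options.items()}

opts = PipelineOptionsSketch(runner="DirectRunner", streaming=False)
print(opts.display_data()["runner"])
```

Porting work of this kind is usually mechanical (iterator methods, integer division, string/bytes handling) but has to be verified module by module.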




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5317) Finish Python 3 porting for options module

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5317?focusedWorklogId=144180&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144180
 ]

ASF GitHub Bot logged work on BEAM-5317:


Author: ASF GitHub Bot
Created on: 14/Sep/18 03:44
Start Date: 14/Sep/18 03:44
Worklog Time Spent: 10m 
  Work Description: manuzhang commented on issue #6397: [BEAM-5317] Finish 
Python3 porting for options module
URL: https://github.com/apache/beam/pull/6397#issuecomment-421222033
 
 
   R: @RobbeSneyders @aaltay 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144180)
Time Spent: 20m  (was: 10m)

> Finish Python 3 porting for options module
> --
>
> Key: BEAM-5317
> URL: https://issues.apache.org/jira/browse/BEAM-5317
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5317) Finish Python 3 porting for options module

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5317?focusedWorklogId=144178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144178
 ]

ASF GitHub Bot logged work on BEAM-5317:


Author: ASF GitHub Bot
Created on: 14/Sep/18 03:40
Start Date: 14/Sep/18 03:40
Worklog Time Spent: 10m 
  Work Description: manuzhang opened a new pull request #6397: [BEAM-5317] 
Finish Python3 porting for options module
URL: https://github.com/apache/beam/pull/6397
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144178)
Time Spent: 10m
Remaining Estimate: 0h

> Finish Python 3 porting for options module
> --
>
> Key: BEAM-5317
> URL: https://issues.apache.org/jira/browse/BEAM-5317
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5288?focusedWorklogId=144166&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144166
 ]

ASF GitHub Bot logged work on BEAM-5288:


Author: ASF GitHub Bot
Created on: 14/Sep/18 02:16
Start Date: 14/Sep/18 02:16
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6373: [BEAM-5288] [NOT READY 
FOR MERGE]Enhance Environment proto to support different types of environments
URL: https://github.com/apache/beam/pull/6373#issuecomment-421209186
 
 
   Once the PR is ready, please squash (except for the first commit authored by 
@mxm).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144166)
Time Spent: 2h  (was: 1h 50m)

> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment, 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>    // (Required) The URN of the payload
>    string urn = 1;
>    // (Optional) The data specifying any parameters to the URN. If
>    // the URN does not require any arguments, this may be omitted.
>    bytes payload = 2;
>  }
>  message StandardEnvironments {
>    enum Environments {
>      DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>      PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>      EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>    }
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>    string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>    string os = 1;  // "linux", "darwin", ..
>    string arch = 2;  // "amd64", ..
>    string command = 3; // process to execute
>    map<string, string> env = 4; // environment variables
>  }
> {noformat}
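To make the urn + payload indirection above concrete, here is a minimal Python sketch of how a runner could dispatch on an environment's URN. Plain dicts and JSON stand in for the real protobuf messages, and all function names are illustrative:

```python
import json

# URNs copied from the proto sketch above.
PROCESS_URN = "beam:env:process:v1"

def make_process_env(command, os="linux", arch="amd64", env=None):
    """Build an environment record for a process-based SDK harness."""
    payload = {"os": os, "arch": arch, "command": command, "env": env or {}}
    # In the real proto the payload is serialized bytes; JSON stands in here.
    return {"urn": PROCESS_URN, "payload": json.dumps(payload).encode()}

def harness_command(environment):
    """Dispatch on the URN, as a runner would, and extract the command."""
    if environment["urn"] == PROCESS_URN:
        return json.loads(environment["payload"])["command"]
    raise NotImplementedError(environment["urn"])

env = make_process_env("/opt/beam/harness")
print(harness_command(env))
```

The point of the pattern is that runners only need to understand the URNs they support; unknown environment types fail loudly instead of being misinterpreted.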



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5288?focusedWorklogId=144165&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144165
 ]

ASF GitHub Bot logged work on BEAM-5288:


Author: ASF GitHub Bot
Created on: 14/Sep/18 02:13
Start Date: 14/Sep/18 02:13
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6373: 
[BEAM-5288] [NOT READY FOR MERGE]Enhance Environment proto to support different 
types of environments
URL: https://github.com/apache/beam/pull/6373#discussion_r217584721
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -963,16 +963,39 @@ message SideInput {
   SdkFunctionSpec window_mapping_fn = 3;
 }
 
-// An environment for executing UDFs. Generally an SDK container URL, but
-// there can be many for a single SDK, for example to provide dependency
-// isolation.
+// An environment for executing UDFs. By default, an SDK container URL, but
+// can also be a process forked by a command, or an externally managed process.
 message Environment {
 
-  // (Required) The URL of a container
-  //
-  // TODO: reconcile with Fn API's DockerContainer structure by
-  // adding adequate metadata to know how to interpret the container
-  string url = 1;
+  // (Required) The URN of the payload
+  string urn = 1;
+
+  // (Optional) The data specifying any parameters to the URN. If
+  // the URN does not require any arguments, this may be omitted.
+  bytes payload = 2;
+}
+
+message StandardEnvironments {
+  enum Environments {
+    DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
+
+    PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
+
+    EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
+  }
 
 Review comment:
   @angoenka this remains to be addressed


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144165)
Time Spent: 1h 50m  (was: 1h 40m)

> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment, 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>    // (Required) The URN of the payload
>    string urn = 1;
>    // (Optional) The data specifying any parameters to the URN. If
>    // the URN does not require any arguments, this may be omitted.
>    bytes payload = 2;
>  }
>  message StandardEnvironments {
>    enum Environments {
>      DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>      PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>      EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>    }
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>    string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>    string os = 1;  // "linux", "darwin", ..
>    string arch = 2;  // "amd64", ..
>    string command = 3; // process to execute
>    map<string, string> env = 4; // environment variables
>  }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5288?focusedWorklogId=144164&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144164
 ]

ASF GitHub Bot logged work on BEAM-5288:


Author: ASF GitHub Bot
Created on: 14/Sep/18 02:11
Start Date: 14/Sep/18 02:11
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6373: [BEAM-5288] [NOT READY 
FOR MERGE]Enhance Environment proto to support different types of environments
URL: https://github.com/apache/beam/pull/6373#issuecomment-421208473
 
 
   @angoenka backward compatibility with what? (We don't need to be concerned 
about external dependencies (yet), since we are just about to reach MVP with 
next release.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144164)
Time Spent: 1h 40m  (was: 1.5h)

> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment, 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>    // (Required) The URN of the payload
>    string urn = 1;
>    // (Optional) The data specifying any parameters to the URN. If
>    // the URN does not require any arguments, this may be omitted.
>    bytes payload = 2;
>  }
>  message StandardEnvironments {
>    enum Environments {
>      DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>      PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>      EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>    }
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>    string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>    string os = 1;  // "linux", "darwin", ..
>    string arch = 2;  // "amd64", ..
>    string command = 3; // process to execute
>    map<string, string> env = 4; // environment variables
>  }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5288?focusedWorklogId=144163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144163
 ]

ASF GitHub Bot logged work on BEAM-5288:


Author: ASF GitHub Bot
Created on: 14/Sep/18 01:55
Start Date: 14/Sep/18 01:55
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6373: [BEAM-5288] [NOT 
READY FOR MERGE]Enhance Environment proto to support different types of 
environments
URL: https://github.com/apache/beam/pull/6373#issuecomment-421205967
 
 
   @mxm I am planning to keep the URL support in this PR for backward 
compatibility. We can revisit and deprecate it in subsequent PRs


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144163)
Time Spent: 1.5h  (was: 1h 20m)

> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment, 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>    // (Required) The URN of the payload
>    string urn = 1;
>    // (Optional) The data specifying any parameters to the URN. If
>    // the URN does not require any arguments, this may be omitted.
>    bytes payload = 2;
>  }
>  message StandardEnvironments {
>    enum Environments {
>      DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>      PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>      EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>    }
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>    string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>    string os = 1;  // "linux", "darwin", ..
>    string arch = 2;  // "amd64", ..
>    string command = 3; // process to execute
>    map<string, string> env = 4; // environment variables
>  }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-13 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614240#comment-16614240
 ] 

Ankur Goenka commented on BEAM-5288:


We are thinking of removing the args for now to avoid any confusion and 
potential name collisions. All the relevant information can be conveyed to the 
SDKHarness using the Environment.

> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment, 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>    // (Required) The URN of the payload
>    string urn = 1;
>    // (Optional) The data specifying any parameters to the URN. If
>    // the URN does not require any arguments, this may be omitted.
>    bytes payload = 2;
>  }
>  message StandardEnvironments {
>    enum Environments {
>      DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>      PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>      EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>    }
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>    string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>    string os = 1;  // "linux", "darwin", ..
>    string arch = 2;  // "amd64", ..
>    string command = 3; // process to execute
>    map<string, string> env = 4; // environment variables
>  }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-13 Thread Henning Rohde (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614237#comment-16614237
 ] 

Henning Rohde commented on BEAM-5288:
-

If we're reusing the container contract (which I think is reasonable), then 
these extra proto args would have to come either before or after the 
runner-provided args. Having just the environment avoids that potential 
semantic confusion. I was originally thinking that the program would be 
executed in an (OS-dependent) '/bin/bash -c "..."' fashion, deferring any 
escaping to the SDK, but I have no problem with more explicit args as long as 
the semantics are clear.
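The two invocation styles under discussion can be sketched as follows. This is an assumed illustration of the trade-off, not Beam's actual contract, and the flag names are made up:

```python
import subprocess

# Style 1: one shell string; the shell does word-splitting and escaping,
# so the SDK would have to pre-escape the command it hands over.
out = subprocess.run(["/bin/sh", "-c", "echo harness"],
                     capture_output=True, text=True)
print(out.stdout.strip())

# Style 2: explicit argv; appending runner-provided args after any
# environment-provided args removes the ordering ambiguity.
env_args = ["--worker_id=1"]                       # from the environment
runner_args = ["--logging_endpoint=localhost:0"]   # from the runner
argv = ["echo"] + env_args + runner_args
out = subprocess.run(argv, capture_output=True, text=True)
print(out.stdout.strip())
```

With explicit args there is no shell in the loop, so no escaping question; the only semantic decision left is the ordering convention.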

> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment, 2) an externally managed environment.
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>    // (Required) The URN of the payload
>    string urn = 1;
>    // (Optional) The data specifying any parameters to the URN. If
>    // the URN does not require any arguments, this may be omitted.
>    bytes payload = 2;
>  }
>  message StandardEnvironments {
>    enum Environments {
>      DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>      PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>      EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>    }
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>    string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>    string os = 1;  // "linux", "darwin", ..
>    string arch = 2;  // "amd64", ..
>    string command = 3; // process to execute
>    map<string, string> env = 4; // environment variables
>  }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-5378) Ensure all Go SDK examples run successfully

2018-09-13 Thread Nathan Fisher (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614225#comment-16614225
 ] 

Nathan Fisher edited comment on BEAM-5378 at 9/14/18 1:28 AM:
--

We use GCP extensively at work, and I would love for the direct runner to work 
with GCP resources where the appropriate configuration has been supplied 
through env vars such as GOOGLE_APPLICATION_CREDENTIALS, which the GCP 
libraries auto-detect. I've only taken a glance at the differences between the 
Go SDK's dataflow, direct, and universal runners.

Am I correct in the following understanding?

1. The universal local runner is a reference implementation and is limited in 
its functionality.
2. The dataflow and universal runners both use graphx.Marshal, which is 
related to the "portability framework".

If the universal local runner has enough capability, could it be used as a 
"direct" runner, allowing local development with GCP interactions? It seems 
that if it were possible to get the above examples running on the ULR via Go, 
the direct runner could effectively be deprecated?
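The auto-detection mentioned above can be sketched like this: the GCP client libraries look up GOOGLE_APPLICATION_CREDENTIALS from the environment, so a local run only needs the variable to point at a key file (simplified; google-auth does the real lookup internally, and the file path here is a placeholder):

```python
import json
import os
import tempfile

def credentials_path():
    # Simplified stand-in for google-auth's default-credentials lookup.
    return os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")

# A local "direct" run would only need the variable to point at a key file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"type": "service_account"}, f)  # placeholder key content
    key_path = f.name

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
print(credentials_path())
```

Since the lookup is purely environment-driven, any runner process that inherits the variable can reach GCP resources without runner-specific configuration.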


was (Author: nfis...@junctionbox.ca):
We use GCP extensively at work I would love for the direct runner to work with 
GCP resources where appropriate configuration has been supplied through the use 
of env vars such as GOOGLE_APPLICATION_CREDENTIALS which the GCP libraries 
auto-detect. I've only taken a glance at the differences between the Go SDK's 
dataflow, direct, and universal runner.

Am I correct in the following understanding?

1. The universal local runner is a reference implementation and is limited in 
it's functionality.
2. The dataflow and universal runner both use graphx.Marshal which is related 
to the "portability framework".

If the universal local runner has enough capability could it be used as a 
"direct" runner allowing local development with GCP interactions?

> Ensure all Go SDK examples run successfully
> ---
>
> Key: BEAM-5378
> URL: https://issues.apache.org/jira/browse/BEAM-5378
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: Not applicable
>Reporter: Tomas Roos
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I've been spending a day or so running through the examples available for the 
> Go SDK in order to see what works on which runner (direct, dataflow) and what 
> doesn't, and here are the results.
> Below are all available examples for the Go SDK. For me, as a new developer on 
> Apache Beam and Dataflow, it would be of tremendous value to have all examples 
> running, because many of them have legitimate use cases behind them.
> {code:java}
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> ├── contains
> │   └── contains.go
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> │   ├── filter
> │   │   └── filter.go
> │   ├── join
> │   │   └── join.go
> │   ├── max
> │   │   └── max.go
> │   └── tornadoes
> │   └── tornadoes.go
> ├── debugging_wordcount
> │   └── debugging_wordcount.go
> ├── forest
> │   └── forest.go
> ├── grades
> │   └── grades.go
> ├── minimal_wordcount
> │   └── minimal_wordcount.go
> ├── multiout
> │   └── multiout.go
> ├── pingpong
> │   └── pingpong.go
> ├── streaming_wordcap
> │   └── wordcap.go
> ├── windowed_wordcount
> │   └── windowed_wordcount.go
> ├── wordcap
> │   └── wordcap.go
> ├── wordcount
> │   └── wordcount.go
> └── yatzy
> └── yatzy.go
> {code}
> All examples that are supposed to be runnable by the direct runner (not 
> depending on GCP platform services) are runnable.
> On the other hand, the tests below need to be updated because they are not 
> runnable on the Dataflow platform for various reasons.
> I tried to figure them out, but all I can do is pinpoint at least where each 
> fails, since my knowledge of the Beam / Dataflow internals is so far limited.
> .
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> Runs successfully if the input is swapped to one of the Shakespeare data files 
> from gs://.
> But running this yields an error from the top.Largest func (discussed in 
> another issue: top.Largest needs to have a serializable combinator / 
> accumulator).
> ➜  autocomplete git:(master) ✗ ./autocomplete --project fair-app-213019 
> --runner dataflow --staging_location=gs://fair-app-213019/staging-test2 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:35:26 Running autocomplete
> Unable to encode combiner for lifting: failed to encode custom coder: bad 
> underlying type: bad field type: bad element: unencodable type: interface 
> {}2018/09/11 15:35:26 Using running binary as worker binary: './autocomplete'
> 2018/09/11 15:35:26 Staging worker binary: ./autocomplete
> ├── 

[jira] [Commented] (BEAM-3106) Consider not pinning all python dependencies, or moving them to requirements.txt

2018-09-13 Thread Scott Jungwirth (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614227#comment-16614227
 ] 

Scott Jungwirth commented on BEAM-3106:
---

It looks like this particular issue (bigquery) will be fixed in v2.7.0 
https://github.com/apache/beam/commit/fba5e89820b9cab3fa63502030fd465aecf60556#diff-e9d0ab71f74dc10309a29b697ee99330

> Consider not pinning all python dependencies, or moving them to 
> requirements.txt
> 
>
> Key: BEAM-3106
> URL: https://issues.apache.org/jira/browse/BEAM-3106
> Project: Beam
>  Issue Type: Wish
>  Components: build-system
>Affects Versions: 2.1.0
> Environment: python
>Reporter: Maximilian Roos
>Priority: Major
>
> Currently all python dependencies are [pinned or 
> capped|https://github.com/apache/beam/blob/master/sdks/python/setup.py#L97]
> While there's a good argument for supplying a `requirements.txt` with 
> well-tested dependencies, having them specified in `setup.py` forces them to an 
> exact state on each install of Beam. This makes using Beam in any environment 
> with other libraries nigh on impossible. 
> This is particularly severe for the `gcp` dependencies, where we have 
> libraries that won't work with an older version (but Beam _does_ work with a 
> newer version). We have to do a bunch of gymnastics to get the correct 
> versions installed because of this. Unfortunately, airflow repeats this 
> practice and conflicts on a number of dependencies, adding further 
> complication (but, again, there is no real conflict).
> I haven't seen this practice outside of the Apache & Google ecosystem - for 
> example no libraries in numerical python do this. Here's a [discussion on 
> SO|https://stackoverflow.com/questions/28509481/should-i-pin-my-python-dependencies-versions]
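The pin-vs-cap distinction above can be illustrated with a naive version comparator (real tools use `packaging.specifiers`; this toy handles only dotted-integer versions):

```python
def parse(v):
    """'0.9.2' -> (0, 9, 2), so tuple comparison orders versions."""
    return tuple(int(x) for x in v.split("."))

def in_range(version, lower, upper):
    """True if lower <= version < upper, i.e. a capped spec like '>=0.8,<0.10'."""
    return parse(lower) <= parse(version) < parse(upper)

# A cap leaves room for compatible upgrades...
assert in_range("0.9.2", "0.8.0", "0.10.0")
# ...while an exact pin ("==0.9.2") behaves like a width-one range and
# rejects every other version:
assert not in_range("0.9.3", "0.9.2", "0.9.3")
```

This is why pinning in `setup.py` conflicts with co-installed libraries: the pin rejects every version but one, while a cap (or a pin moved to `requirements.txt`) lets the resolver find a mutually compatible set.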



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5378) Ensure all Go SDK examples run successfully

2018-09-13 Thread Nathan Fisher (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614225#comment-16614225
 ] 

Nathan Fisher commented on BEAM-5378:
-

We use GCP extensively at work, and I would love for the direct runner to work 
with GCP resources where the appropriate configuration has been supplied 
through env vars such as GOOGLE_APPLICATION_CREDENTIALS, which the GCP 
libraries auto-detect. I've only taken a glance at the differences between the 
Go SDK's dataflow, direct, and universal runners.

Am I correct in the following understanding?

1. The universal local runner is a reference implementation and is limited in 
its functionality.
2. The dataflow and universal runners both use graphx.Marshal, which is 
related to the "portability framework".

If the universal local runner has enough capability, could it be used as a 
"direct" runner, allowing local development with GCP interactions?

> Ensure all Go SDK examples run successfully
> ---
>
> Key: BEAM-5378
> URL: https://issues.apache.org/jira/browse/BEAM-5378
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: Not applicable
>Reporter: Tomas Roos
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I've been spending a day or so running through the examples available for the 
> Go SDK in order to see what works on which runner (direct, dataflow) and what 
> doesn't, and here are the results.
> Below are all available examples for the Go SDK. For me, as a new developer on 
> Apache Beam and Dataflow, it would be of tremendous value to have all examples 
> running, because many of them have legitimate use cases behind them.
> {code:java}
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> ├── contains
> │   └── contains.go
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> │   ├── filter
> │   │   └── filter.go
> │   ├── join
> │   │   └── join.go
> │   ├── max
> │   │   └── max.go
> │   └── tornadoes
> │   └── tornadoes.go
> ├── debugging_wordcount
> │   └── debugging_wordcount.go
> ├── forest
> │   └── forest.go
> ├── grades
> │   └── grades.go
> ├── minimal_wordcount
> │   └── minimal_wordcount.go
> ├── multiout
> │   └── multiout.go
> ├── pingpong
> │   └── pingpong.go
> ├── streaming_wordcap
> │   └── wordcap.go
> ├── windowed_wordcount
> │   └── windowed_wordcount.go
> ├── wordcap
> │   └── wordcap.go
> ├── wordcount
> │   └── wordcount.go
> └── yatzy
> └── yatzy.go
> {code}
> All examples that are supposed to be runnable by the direct runner (not 
> depending on GCP platform services) are runnable.
> On the other hand, these are the examples that need to be updated because 
> they are not runnable on the Dataflow platform for various reasons.
> I tried to figure them out, but all I can do is pinpoint at least where 
> each fails, since my knowledge of the Beam / Dataflow internals is so far 
> limited.
> .
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> Runs successfully if swapping the input to one of the Shakespeare data 
> files from gs://
> But running it yields an error from the top.Largest func (discussed in 
> another issue: top.Largest needs to have a serializable combiner / 
> accumulator)
> ➜  autocomplete git:(master) ✗ ./autocomplete --project fair-app-213019 
> --runner dataflow --staging_location=gs://fair-app-213019/staging-test2 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:35:26 Running autocomplete
> Unable to encode combiner for lifting: failed to encode custom coder: bad 
> underlying type: bad field type: bad element: unencodable type: interface 
> {}2018/09/11 15:35:26 Using running binary as worker binary: './autocomplete'
> 2018/09/11 15:35:26 Staging worker binary: ./autocomplete
> ├── contains
> │   └── contains.go
> Fails when running debug.Head for some mysterious reason; it might have to 
> do with the parameter passing into the x, y iterator. Frankly, I don't 
> know and could not figure it out.
> But after removing the debug.Head call, everything works as expected and 
> succeeds.
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> Fails because extractFn, which is a struct, is not registered through 
> beam.RegisterType (is this a must or not?)
> Registering it works as a workaround, at least
> ➜  combine git:(master) ✗ ./combine 
> --output=fair-app-213019:combineoutput.test --project=fair-app-213019 
> --runner=dataflow --staging_location=gs://203019-staging/ 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:40:50 Running combine
> panic: Failed to serialize 3: ParDo [In(Main): main.WordRow <- {2: 
> main.WordRow/main.WordRow[json] GLO}] -> [Out: KV -> {3: 
> KV/KV GLO}]: encode: bad userfn: 

[jira] [Comment Edited] (BEAM-3106) Consider not pinning all python dependencies, or moving them to requirements.txt

2018-09-13 Thread Scott Jungwirth (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614227#comment-16614227
 ] 

Scott Jungwirth edited comment on BEAM-3106 at 9/14/18 1:33 AM:


It looks like this particular issue (bigquery) will be fixed in v2.7.0 
[https://github.com/apache/beam/commit/fba5e89820b9cab3fa63502030fd465aecf60556#diff-e9d0ab71f74dc10309a29b697ee99330]

 

[edit] never mind, that commit was reverted: 
https://github.com/apache/beam/commit/a638e1dcdc2b58ed0da1bd467064068e9d892e13


was (Author: sjungwirth):
It looks like this particular issue (bigquery) will be fixed in v2.7.0 
https://github.com/apache/beam/commit/fba5e89820b9cab3fa63502030fd465aecf60556#diff-e9d0ab71f74dc10309a29b697ee99330

> Consider not pinning all python dependencies, or moving them to 
> requirements.txt
> 
>
> Key: BEAM-3106
> URL: https://issues.apache.org/jira/browse/BEAM-3106
> Project: Beam
>  Issue Type: Wish
>  Components: build-system
>Affects Versions: 2.1.0
> Environment: python
>Reporter: Maximilian Roos
>Priority: Major
>
> Currently all python dependencies are [pinned or 
> capped|https://github.com/apache/beam/blob/master/sdks/python/setup.py#L97]
> While there's a good argument for supplying a `requirements.txt` with well 
> tested dependencies, having them specified in `setup.py` forces them to an 
> exact state on each install of Beam. This makes using Beam in any environment 
> with other libraries nigh on impossible. 
> This is particularly severe for the `gcp` dependencies, where we have 
> libraries that won't work with an older version (but Beam _does_ work with 
> a newer version). We have to do a bunch of gymnastics to get the correct 
> versions installed because of this. Unfortunately, airflow repeats this 
> practice and conflicts on a number of dependencies, adding further 
> complication (but, again, there is no real conflict).
> I haven't seen this practice outside of the Apache & Google ecosystem - for 
> example no libraries in numerical python do this. Here's a [discussion on 
> SO|https://stackoverflow.com/questions/28509481/should-i-pin-my-python-dependencies-versions]
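To make the requested distinction concrete, here is a hypothetical sketch (package names and versions are illustrative, not Beam's actual pins): a library declares compatible ranges in `setup.py`, while an application pins exact, tested versions in `requirements.txt`:

```python
# Library style (setup.py install_requires): state compatible ranges
# so the package can coexist with other libraries in one environment.
install_requires = [
    "google-cloud-bigquery>=0.25.0,<2.0.0",  # bounded range, not ==
    "httplib2>=0.8",
]

# Application style (requirements.txt): pin the exact versions that
# were tested together, for reproducible deployments.
requirements_txt = "\n".join([
    "google-cloud-bigquery==0.25.0",
    "httplib2==0.9.2",
])
```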



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1459

2018-09-13 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-4684] Add integration test for support of @RequiresStableInput

--
[...truncated 23.32 MB...]
... 21 more
Caused by: io.grpc.StatusRuntimeException: FAILED_PRECONDITION: Value must 
not be NULL in table users.
at io.grpc.Status.asRuntimeException(Status.java:526)
at 
io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:468)
at 
io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at 
io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at 
io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at 
com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100)
at 
io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at 
io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at 
io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at 
com.google.cloud.spanner.spi.v1.WatchdogInterceptor$MonitoredCall$1.onClose(WatchdogInterceptor.java:190)
at 
io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at 
io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at 
io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at 
io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:684)
at 
io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at 
io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at 
io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at 
io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:403)
at 
io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
at 
io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546)
at 
io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467)
at 
io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
... 3 more

Sep 14, 2018 1:11:16 AM 
org.apache.beam.sdk.io.gcp.spanner.SpannerIO$WriteToSpannerFn processElement
WARNING: Failed to submit the mutation group
com.google.cloud.spanner.SpannerException: FAILED_PRECONDITION: 
io.grpc.StatusRuntimeException: FAILED_PRECONDITION: Value must not be NULL in 
table users.
at 
com.google.cloud.spanner.SpannerExceptionFactory.newSpannerExceptionPreformatted(SpannerExceptionFactory.java:119)
at 
com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:43)
at 
com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:80)
at 
com.google.cloud.spanner.spi.v1.GrpcSpannerRpc.get(GrpcSpannerRpc.java:456)
at 
com.google.cloud.spanner.spi.v1.GrpcSpannerRpc.commit(GrpcSpannerRpc.java:404)
at 
com.google.cloud.spanner.SpannerImpl$SessionImpl$2.call(SpannerImpl.java:797)
at 
com.google.cloud.spanner.SpannerImpl$SessionImpl$2.call(SpannerImpl.java:794)
at 
com.google.cloud.spanner.SpannerImpl.runWithRetries(SpannerImpl.java:227)
at 
com.google.cloud.spanner.SpannerImpl$SessionImpl.writeAtLeastOnce(SpannerImpl.java:793)
at 
com.google.cloud.spanner.SessionPool$PooledSession.writeAtLeastOnce(SessionPool.java:319)
at 
com.google.cloud.spanner.DatabaseClientImpl.writeAtLeastOnce(DatabaseClientImpl.java:60)
at 
org.apache.beam.sdk.io.gcp.spanner.SpannerIO$WriteToSpannerFn.processElement(SpannerIO.java:1103)
at 
org.apache.beam.sdk.io.gcp.spanner.SpannerIO$WriteToSpannerFn$DoFnInvoker.invokeProcessElement(Unknown
 Source)
at 
org.apache.beam.repackaged.beam_runners_direct_java.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:275)
at 
org.apache.beam.repackaged.beam_runners_direct_java.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:240)
at 

Build failed in Jenkins: beam_PostCommit_Python_PVR_Flink_Gradle #7

2018-09-13 Thread Apache Jenkins Server
See 


--
[...truncated 6.28 MB...]
[grpc-default-executor-0] INFO sdk_worker.__init__ - Creating insecure control 
channel.
[grpc-default-executor-0] INFO sdk_worker.__init__ - Control channel 
established.
[grpc-default-executor-0] INFO sdk_worker.__init__ - Initializing SDKHarness 
with 12 workers.
[grpc-default-executor-0] INFO 
org.apache.beam.runners.fnexecution.control.FnApiControlClientPoolService - 
Beam Fn Control client connected with id 1
[grpc-default-executor-0] INFO sdk_worker.run - Got work 1
[grpc-default-executor-1] INFO sdk_worker.run - Got work 6
[grpc-default-executor-1] INFO sdk_worker.run - Got work 5
[grpc-default-executor-1] INFO sdk_worker.run - Got work 3
[grpc-default-executor-1] INFO sdk_worker.run - Got work 2
[grpc-default-executor-1] INFO sdk_worker.run - Got work 4
[grpc-default-executor-1] INFO sdk_worker.run - Got work 7
[grpc-default-executor-1] INFO sdk_worker.create_state_handler - Creating 
channel for localhost:33437
[grpc-default-executor-1] INFO sdk_worker.run - Got work 8
[grpc-default-executor-1] INFO 
org.apache.beam.runners.fnexecution.data.GrpcDataService - Beam Fn Data client 
connected.
[grpc-default-executor-1] INFO data_plane.create_data_channel - Creating 
channel for localhost:43055
[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (4675898fd7374f47fe75dbfcbcdee6fe) switched from 
RUNNING to FINISHED.
[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Freeing task resources for Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (4675898fd7374f47fe75dbfcbcdee6fe).
[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Ensuring all FileSystem streams are closed for task Source: Collection Source 
-> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (4675898fd7374f47fe75dbfcbcdee6fe) [FINISHED]
[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[flink-akka.actor.default-dispatcher-5] INFO 
org.apache.flink.runtime.taskexecutor.TaskExecutor - Un-registering task and 
sending final execution state FINISHED to JobManager for task Source: 
Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem 4675898fd7374f47fe75dbfcbcdee6fe.
[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-1] INFO bundle_processor.process_bundle - finish 

[flink-akka.actor.default-dispatcher-2] INFO 
org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Collection 
Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (4675898fd7374f47fe75dbfcbcdee6fe) switched from 
RUNNING to FINISHED.
[Source: Collection Source -> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Source: Collection Source -> 

Build failed in Jenkins: beam_PerformanceTests_Python #1434

2018-09-13 Thread Apache Jenkins Server
See 


Changes:

[apilloud] [BEAM-4704] Disable enumerable rules, use direct runner

[migryz] Disable build parallelization for Go due to flakiness

[lcwik] [BEAM-4684] Add integration test for support of @RequiresStableInput

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam15 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 6ed194200df892167094f4ec5967b0ff21e9f669 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 6ed194200df892167094f4ec5967b0ff21e9f669
Commit message: "Merge pull request #6389 from apilloud/noenum"
 > git rev-list --no-walk 14cef9c55a9daaac74d560ee1b2ca6d384cfcc18 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4926602678786260718.sh
+ rm -rf 

[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8478309949710790872.sh
+ rm -rf 

[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins357524839501058.sh
+ virtualenv 

New python executable in 

Also creating executable in 

Installing setuptools, pkg_resources, pip, wheel...done.
Running virtualenv with interpreter /usr/bin/python2
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4515826070275651236.sh
+ 

 install --upgrade setuptools pip
Requirement already up-to-date: setuptools in 
./env/.perfkit_env/lib/python2.7/site-packages (40.2.0)
Requirement already up-to-date: pip in 
./env/.perfkit_env/lib/python2.7/site-packages (18.0)
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2997120822204807885.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git 

Cloning into 
'
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins707153573019088.sh
+ 

 install -r 

Collecting absl-py (from -r 

 (line 14))
Collecting jinja2>=2.7 (from -r 

 (line 15))
  Using cached 
https://files.pythonhosted.org/packages/7f/ff/ae64bacdfc95f27a016a7bed8e8686763ba4d277a78ca76f32659220a731/Jinja2-2.10-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./env/.perfkit_env/lib/python2.7/site-packages (from -r 

 (line 16)) (40.2.0)
Collecting colorlog[windows]==2.6.0 (from -r 

 (line 17))
  Using cached 
https://files.pythonhosted.org/packages/59/1a/46a1bf2044ad8b30b52fed0f389338c85747e093fe7f51a567f4cb525892/colorlog-2.6.0-py2.py3-none-any.whl
Collecting blinker>=1.3 (from -r 

 (line 18))
Collecting futures>=3.0.3 (from -r 

 (line 19))
  Using cached 

Build failed in Jenkins: beam_PreCommit_Website_Cron #56

2018-09-13 Thread Apache Jenkins Server
See 


Changes:

[apilloud] [BEAM-4704] Disable enumerable rules, use direct runner

[migryz] Disable build parallelization for Go due to flakiness

[lcwik] [BEAM-4684] Add integration test for support of @RequiresStableInput

--
[...truncated 9.01 KB...]
file or directory 
'
 not found
file or directory 
'
 not found
file or directory 
'
 not found
Caching disabled for task ':buildSrc:spotlessGroovy': Caching has not been 
enabled for the task
Task ':buildSrc:spotlessGroovy' is not up-to-date because:
  No history is available.
All input files are considered out-of-date for incremental task 
':buildSrc:spotlessGroovy'.
file or directory 
'
 not found
:spotlessGroovy (Thread[Task worker for ':buildSrc',5,main]) completed. Took 
1.637 secs.
:spotlessGroovyCheck (Thread[Task worker for ':buildSrc' Thread 10,5,main]) 
started.

> Task :buildSrc:spotlessGroovyCheck
Skipping task ':buildSrc:spotlessGroovyCheck' as it has no actions.
:spotlessGroovyCheck (Thread[Task worker for ':buildSrc' Thread 10,5,main]) 
completed. Took 0.001 secs.
:spotlessGroovyGradle (Thread[Task worker for ':buildSrc' Thread 10,5,main]) 
started.

> Task :buildSrc:spotlessGroovyGradle
Caching disabled for task ':buildSrc:spotlessGroovyGradle': Caching has not 
been enabled for the task
Task ':buildSrc:spotlessGroovyGradle' is not up-to-date because:
  No history is available.
All input files are considered out-of-date for incremental task 
':buildSrc:spotlessGroovyGradle'.
:spotlessGroovyGradle (Thread[Task worker for ':buildSrc' Thread 10,5,main]) 
completed. Took 0.038 secs.
:spotlessGroovyGradleCheck (Thread[Task worker for ':buildSrc' Thread 
11,5,main]) started.

> Task :buildSrc:spotlessGroovyGradleCheck
Skipping task ':buildSrc:spotlessGroovyGradleCheck' as it has no actions.
:spotlessGroovyGradleCheck (Thread[Task worker for ':buildSrc' Thread 
11,5,main]) completed. Took 0.0 secs.
:spotlessCheck (Thread[Task worker for ':buildSrc' Thread 11,5,main]) started.

> Task :buildSrc:spotlessCheck
Skipping task ':buildSrc:spotlessCheck' as it has no actions.
:spotlessCheck (Thread[Task worker for ':buildSrc' Thread 11,5,main]) 
completed. Took 0.0 secs.
:compileTestJava (Thread[Task worker for ':buildSrc' Thread 11,5,main]) started.

> Task :buildSrc:compileTestJava NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileTestJava' as it has no source files and no 
previous output files.
:compileTestJava (Thread[Task worker for ':buildSrc' Thread 11,5,main]) 
completed. Took 0.003 secs.
:compileTestGroovy (Thread[Task worker for ':buildSrc' Thread 11,5,main]) 
started.

> Task :buildSrc:compileTestGroovy NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileTestGroovy' as it has no source files and no 
previous output files.
:compileTestGroovy (Thread[Task worker for ':buildSrc' Thread 11,5,main]) 
completed. Took 0.003 secs.
:processTestResources (Thread[Task worker for ':buildSrc' Thread 11,5,main]) 
started.

> Task :buildSrc:processTestResources NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:processTestResources' as it has no source files and no 
previous output files.
:processTestResources (Thread[Task worker for ':buildSrc' Thread 11,5,main]) 
completed. Took 0.001 secs.
:testClasses (Thread[Task worker for ':buildSrc' Thread 11,5,main]) started.

> Task :buildSrc:testClasses UP-TO-DATE
Skipping task ':buildSrc:testClasses' as it has no actions.
:testClasses (Thread[Task worker for ':buildSrc' Thread 11,5,main]) completed. 
Took 0.0 secs.
:test (Thread[Task worker for ':buildSrc' Thread 11,5,main]) started.

> Task :buildSrc:test NO-SOURCE
Skipping task ':buildSrc:test' as it has no source files and no previous output 
files.
:test (Thread[Task worker for ':buildSrc' Thread 11,5,main]) completed. Took 
0.003 secs.
:check (Thread[Task worker for ':buildSrc' Thread 11,5,main]) started.

> Task :buildSrc:check
Skipping task ':buildSrc:check' as it has no actions.
:check (Thread[Task worker for ':buildSrc' Thread 11,5,main]) completed. Took 
0.0 secs.
:build (Thread[Task worker for ':buildSrc' Thread 11,5,main]) started.

> Task :buildSrc:build
Skipping task ':buildSrc:build' as it has no actions.
:build 

[jira] [Work logged] (BEAM-5383) Migrate integration tests for python bigquery io read

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5383?focusedWorklogId=144153=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144153
 ]

ASF GitHub Bot logged work on BEAM-5383:


Author: ASF GitHub Bot
Created on: 14/Sep/18 00:45
Start Date: 14/Sep/18 00:45
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6393: [BEAM-5383] migrate 
bigquer_io_read_it_test to beam
URL: https://github.com/apache/beam/pull/6393#issuecomment-421195395
 
 
   Run Python PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144153)
Time Spent: 40m  (was: 0.5h)

> Migrate integration tests for python  bigquery io read 
> ---
>
> Key: BEAM-5383
> URL: https://issues.apache.org/jira/browse/BEAM-5383
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins

2018-09-13 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614179#comment-16614179
 ] 

Ankur Goenka commented on BEAM-5283:


Reference bug [https://github.com/pypa/virtualenv/issues/997] 

Shebangs have a limit on the argument length, which broke the test.
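The constraint can be sketched as follows. The 127-byte figure is the typical Linux BINPRM_BUF_SIZE and is an assumption here, not a value taken from the thread; the helper name is illustrative:

```python
# Linux truncates the shebang ("#!") line at a fixed buffer size, so
# console scripts generated inside a deeply nested Jenkins workspace
# can end up with a broken interpreter path.
SHEBANG_LIMIT = 127  # typical BINPRM_BUF_SIZE; assumed for illustration


def shebang_fits(interpreter_path):
    """Return True if a '#!<path>' line fits within the assumed limit."""
    return len("#!" + interpreter_path) <= SHEBANG_LIMIT
```

One common mitigation is to invoke entry points as `python -m <module>` so the generated shebang is never consulted.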

> Enable Python Portable Flink PostCommit Tests to Jenkins
> 
>
> Key: BEAM-5283
> URL: https://issues.apache.org/jira/browse/BEAM-5283
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: CI
> Fix For: 2.8.0
>
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>






Build failed in Jenkins: beam_PostCommit_Python_PVR_Flink_Gradle #6

2018-09-13 Thread Apache Jenkins Server
See 


Changes:

[apilloud] [BEAM-4704] Disable enumerable rules, use direct runner

--
[...truncated 6.28 MB...]
[grpc-default-executor-0] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - finish 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - finish 

[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (4a8e3e9ac021dd141063799bfabe4a9d) switched from 
RUNNING to FINISHED.
[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Freeing task resources for Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (4a8e3e9ac021dd141063799bfabe4a9d).
[Source: Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Ensuring all FileSystem streams are closed for task Source: Collection Source 
-> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (4a8e3e9ac021dd141063799bfabe4a9d) [FINISHED]
[flink-akka.actor.default-dispatcher-5] INFO 
org.apache.flink.runtime.taskexecutor.TaskExecutor - Un-registering task and 
sending final execution state FINISHED to JobManager for task Source: 
Collection Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem 4a8e3e9ac021dd141063799bfabe4a9d.
[flink-akka.actor.default-dispatcher-5] INFO 
org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Collection 
Source -> 
19Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (4a8e3e9ac021dd141063799bfabe4a9d) switched from 
RUNNING to FINISHED.
[grpc-default-executor-0] INFO sdk_worker.run - Got work 9
[Source: Collection Source -> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Source: Collection Source -> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (cdea398e199c538cf6b19038096c1945) switched from 
RUNNING to FINISHED.
[Source: Collection Source -> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Freeing task resources for Source: Collection Source -> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (cdea398e199c538cf6b19038096c1945).
[Source: Collection Source -> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1)] INFO org.apache.flink.runtime.taskmanager.Task - 
Ensuring all FileSystem streams are closed for task Source: Collection Source 
-> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (cdea398e199c538cf6b19038096c1945) [FINISHED]
[flink-akka.actor.default-dispatcher-5] INFO 
org.apache.flink.runtime.taskexecutor.TaskExecutor - Un-registering task and 
sending final execution state FINISHED to JobManager for task Source: 
Collection Source -> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem cdea398e199c538cf6b19038096c1945.
[flink-akka.actor.default-dispatcher-5] INFO 
org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Collection 
Source -> 
31assert_that/Create/Read/Impulse.None/jenkins-docker-apache.bintray.io/beam/python:latest:0
 -> ToKeyedWorkItem (1/1) (cdea398e199c538cf6b19038096c1945) switched from 
RUNNING to FINISHED.
[grpc-default-executor-0] INFO sdk_worker.run - Got work 10
[grpc-default-executor-0] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-0] INFO bundle_processor.process_bundle - start 

[grpc-default-executor-0] INFO 

[jira] [Resolved] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins

2018-09-13 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-5283.

   Resolution: Fixed
Fix Version/s: 2.8.0

> Enable Python Portable Flink PostCommit Tests to Jenkins
> 
>
> Key: BEAM-5283
> URL: https://issues.apache.org/jira/browse/BEAM-5283
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: CI
> Fix For: 2.8.0
>
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5378) Ensure all Go SDK examples run successfully

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5378?focusedWorklogId=144140=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144140
 ]

ASF GitHub Bot logged work on BEAM-5378:


Author: ASF GitHub Bot
Created on: 13/Sep/18 23:47
Start Date: 13/Sep/18 23:47
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #6395: [BEAM-5378] Update go 
wordcap example to work on Dataflow runner
URL: https://github.com/apache/beam/pull/6395#issuecomment-421186949
 
 
   R: @lostluck 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144140)
Time Spent: 20m  (was: 10m)

> Ensure all Go SDK examples run successfully
> ---
>
> Key: BEAM-5378
> URL: https://issues.apache.org/jira/browse/BEAM-5378
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: Not applicable
>Reporter: Tomas Roos
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I've been spending a day or so running through the examples available for 
> the Go SDK in order to see what works on which runner (direct, dataflow) 
> and what doesn't, and here are the results.
> These are all the available examples for the Go SDK. For me, as a new 
> developer on Apache Beam and Dataflow, it would be of tremendous value to 
> have all examples running, because many of them have legitimate use cases 
> behind them. 
> {code:java}
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> ├── contains
> │   └── contains.go
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> │   ├── filter
> │   │   └── filter.go
> │   ├── join
> │   │   └── join.go
> │   ├── max
> │   │   └── max.go
> │   └── tornadoes
> │   └── tornadoes.go
> ├── debugging_wordcount
> │   └── debugging_wordcount.go
> ├── forest
> │   └── forest.go
> ├── grades
> │   └── grades.go
> ├── minimal_wordcount
> │   └── minimal_wordcount.go
> ├── multiout
> │   └── multiout.go
> ├── pingpong
> │   └── pingpong.go
> ├── streaming_wordcap
> │   └── wordcap.go
> ├── windowed_wordcount
> │   └── windowed_wordcount.go
> ├── wordcap
> │   └── wordcap.go
> ├── wordcount
> │   └── wordcount.go
> └── yatzy
> └── yatzy.go
> {code}
> All examples that are supposed to be runnable by the direct driver (not 
> depending on GCP platform services) are runnable.
> On the other hand, the examples below need to be updated because they are not 
> runnable on the Dataflow platform, for various reasons.
> I tried to figure them out, but all I can do is pinpoint where each one 
> fails, since my knowledge of the Beam / Dataflow internals is so far limited.
> .
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> Runs successfully if the input is swapped to one of the shakespeare data files 
> from gs://
> But running it then yields an error from the top.Largest func (discussed 
> in another issue: top.Largest needs to have a serializable combiner / 
> accumulator).
> ➜  autocomplete git:(master) ✗ ./autocomplete --project fair-app-213019 
> --runner dataflow --staging_location=gs://fair-app-213019/staging-test2 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:35:26 Running autocomplete
> Unable to encode combiner for lifting: failed to encode custom coder: bad 
> underlying type: bad field type: bad element: unencodable type: interface {}
> 2018/09/11 15:35:26 Using running binary as worker binary: './autocomplete'
> 2018/09/11 15:35:26 Staging worker binary: ./autocomplete
> ├── contains
> │   └── contains.go
> Fails when running debug.Head for some mysterious reason; it might have to do 
> with the parameter passing into the x,y iterator. Frankly, I don't know and could 
> not figure it out.
> But with the debug.Head call removed, everything works as expected and succeeds.
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> Fails because extractFn, which is a struct, is not registered through 
> beam.RegisterType (is this a must or not?)
> Registering it works as a workaround, at least.
> ➜  combine git:(master) ✗ ./combine 
> --output=fair-app-213019:combineoutput.test --project=fair-app-213019 
> --runner=dataflow --staging_location=gs://203019-staging/ 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:40:50 Running combine
> panic: Failed to serialize 3: ParDo [In(Main): main.WordRow <- {2: 
> main.WordRow/main.WordRow[json] GLO}] -> [Out: KV 

[jira] [Work logged] (BEAM-5378) Ensure all Go SDK examples run successfully

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5378?focusedWorklogId=144139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144139
 ]

ASF GitHub Bot logged work on BEAM-5378:


Author: ASF GitHub Bot
Created on: 13/Sep/18 23:47
Start Date: 13/Sep/18 23:47
Worklog Time Spent: 10m 
  Work Description: aaltay opened a new pull request #6395: [BEAM-5378] 
Update go wordcap example to work on Dataflow runner
URL: https://github.com/apache/beam/pull/6395
 
 
   Update go wordcap example to work on Dataflow runner
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 144139)
Time Spent: 10m
Remaining Estimate: 0h

> Ensure all Go SDK examples run successfully
> ---
>
> Key: BEAM-5378
> URL: https://issues.apache.org/jira/browse/BEAM-5378
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: Not applicable
>Reporter: Tomas Roos
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I've been spending a day or so running through the example available for 

[jira] [Work logged] (BEAM-5365) Migrate integration tests for bigquery_tornadoes

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5365?focusedWorklogId=144138&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144138
 ]

ASF GitHub Bot logged work on BEAM-5365:


Author: ASF GitHub Bot
Created on: 13/Sep/18 23:45
Start Date: 13/Sep/18 23:45
Worklog Time Spent: 10m 
  Work Description: yifanzou closed pull request #6380: Do Not Merge, 
[BEAM-5365] adding more tests in bigquery_tornadoes_it
URL: https://github.com/apache/beam/pull/6380
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes_it_test.py 
b/sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes_it_test.py
index b7e90839ffa..428b2f2da33 100644
--- a/sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes_it_test.py
+++ b/sdks/python/apache_beam/examples/cookbook/bigquery_tornadoes_it_test.py
@@ -72,6 +72,40 @@ def test_bigquery_tornadoes_it(self):
 test_pipeline.get_full_options_as_args(**extra_opts))
 
 
+  @attr('IT')
+  def test_bigquery_tornadoes_from_query_it(self):
+test_pipeline = TestPipeline(is_integration_test=True)
+
+# Set extra options to the pipeline for test purpose
+project = test_pipeline.get_option('project')
+
+INPUT_QUERY = ('SELECT month, COUNT(month) AS tornado_count FROM '
+   '[clouddataflow-readonly:samples.weather_stations] '
+   'WHERE tornado=true GROUP BY month')
+
+dataset = 'BigQueryTornadoesIT'
+table = 'monthly_tornadoes_from_query_%s' % int(round(time.time() * 1000))
+output_table = '.'.join([dataset, table])
+query = 'SELECT month, tornado_count FROM [%s]' % output_table
+
+pipeline_verifiers = [PipelineStateMatcher(),
+  BigqueryMatcher(
+  project=project,
+  query=query,
+  checksum=self.DEFAULT_CHECKSUM)]
+extra_opts = {'input': INPUT_QUERY,
+  'output': output_table,
+  'on_success_matcher': all_of(*pipeline_verifiers)}
+
+# Register cleanup before pipeline execution.
+self.addCleanup(utils.delete_bq_table, project, dataset, table)
+
+# Get pipeline options from command argument: --test-pipeline-options,
+# and start pipeline job by calling pipeline main function.
+bigquery_tornadoes.run(
+test_pipeline.get_full_options_as_args(**extra_opts))
+
+
 if __name__ == '__main__':
   logging.getLogger().setLevel(logging.INFO)
   unittest.main()
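The new test above derives a collision-free BigQuery output table name per run and then queries it back for verification. A minimal stdlib-only sketch of that naming scheme (the helper function name is illustrative, not part of the test):

```python
import time

def unique_output_table(dataset='BigQueryTornadoesIT',
                        prefix='monthly_tornadoes_from_query'):
    # Append epoch milliseconds so concurrent test runs never collide
    # on the same BigQuery table, as the test above does.
    table = '%s_%s' % (prefix, int(round(time.time() * 1000)))
    return dataset, table, '.'.join([dataset, table])

dataset, table, output_table = unique_output_table()
# The matcher then reads the freshly written table back for checksumming:
query = 'SELECT month, tornado_count FROM [%s]' % output_table
```

Because the table name is derived before the pipeline runs, the same name can also be handed to the `addCleanup` hook that deletes the table afterwards.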


 




Issue Time Tracking
---

Worklog Id: (was: 144138)
Time Spent: 1h 20m  (was: 1h 10m)

> Migrate integration tests for bigquery_tornadoes
> 
>
> Key: BEAM-5365
> URL: https://issues.apache.org/jira/browse/BEAM-5365
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4704) String operations yield incorrect results when executed through SQL shell

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4704?focusedWorklogId=144137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144137
 ]

ASF GitHub Bot logged work on BEAM-4704:


Author: ASF GitHub Bot
Created on: 13/Sep/18 23:39
Start Date: 13/Sep/18 23:39
Worklog Time Spent: 10m 
  Work Description: apilloud closed pull request #6389: [BEAM-4704] Disable 
enumerable rules, use direct runner
URL: https://github.com/apache/beam/pull/6389
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/JdbcDriver.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/JdbcDriver.java
index 4c0345401d1..c47eb7837ce 100644
--- 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/JdbcDriver.java
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/JdbcDriver.java
@@ -45,6 +45,7 @@
 import org.apache.calcite.plan.RelOptPlanner;
 import org.apache.calcite.plan.RelOptRule;
 import org.apache.calcite.plan.RelTraitDef;
+import org.apache.calcite.prepare.CalcitePrepareImpl;
 import org.apache.calcite.rel.RelCollationTraitDef;
 import org.apache.calcite.rel.rules.CalcRemoveRule;
 import org.apache.calcite.rel.rules.SortRemoveRule;
@@ -98,6 +99,10 @@
   planner.removeRule(CalcRemoveRule.INSTANCE);
   planner.removeRule(SortRemoveRule.INSTANCE);
 
+  for (RelOptRule rule : CalcitePrepareImpl.ENUMERABLE_RULES) {
+planner.removeRule(rule);
+  }
+
   List relTraitDefs = new 
ArrayList<>(planner.getRelTraitDefs());
   planner.clearRelTraitDefs();
   for (RelTraitDef def : relTraitDefs) {
diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/BeamSqlFnExecutor.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/BeamSqlFnExecutor.java
index 271798a67f1..b65d85fb4fd 100644
--- 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/BeamSqlFnExecutor.java
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/BeamSqlFnExecutor.java
@@ -540,7 +540,8 @@ private static BeamSqlExpression 
getBeamSqlExpression(RexNode rexNode) {
   }
 
   private static boolean isDateNode(SqlTypeName type, Object value) {
-return (type == SqlTypeName.DATE || type == SqlTypeName.TIMESTAMP) && 
value instanceof Calendar;
+return (type == SqlTypeName.DATE || type == SqlTypeName.TIME || type == 
SqlTypeName.TIMESTAMP)
+&& value instanceof Calendar;
   }
 
   @Override
diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java
index f90c1bdfb9f..e644c0c735b 100644
--- 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamEnumerableConverter.java
@@ -18,8 +18,11 @@
 package org.apache.beam.sdk.extensions.sql.impl.rel;
 
 import static com.google.common.base.Preconditions.checkArgument;
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static org.apache.calcite.avatica.util.DateTimeUtils.MILLIS_PER_DAY;
 
 import java.io.IOException;
+import java.util.Arrays;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
@@ -33,6 +36,7 @@
 import org.apache.beam.sdk.Pipeline.PipelineVisitor;
 import org.apache.beam.sdk.PipelineResult;
 import org.apache.beam.sdk.PipelineResult.State;
+import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils;
 import org.apache.beam.sdk.metrics.Counter;
 import org.apache.beam.sdk.metrics.MetricNameFilter;
 import org.apache.beam.sdk.metrics.MetricQueryResults;
@@ -258,7 +262,16 @@ public void processElement(ProcessContext context) {
   private static Object fieldToAvatica(Schema.FieldType type, Object 
beamValue) {
 switch (type.getTypeName()) {
   case DATETIME:
-return ((ReadableInstant) beamValue).getMillis();
+if (Arrays.equals(type.getMetadata(), 
CalciteUtils.TIMESTAMP.getMetadata())) {
+  return ((ReadableInstant) beamValue).getMillis();
+} else if (Arrays.equals(type.getMetadata(), 
CalciteUtils.TIME.getMetadata())) {
+  return (int) ((ReadableInstant) beamValue).getMillis();
+} else if 
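The fieldToAvatica change above starts distinguishing TIME from TIMESTAMP when handing DATETIME values to Avatica: a TIMESTAMP keeps its full epoch millis as a long, while a TIME is narrowed to an int of milliseconds within the day. A rough Python sketch of that distinction (names are illustrative; the real code dispatches on the field-type metadata, not a boolean flag):

```python
MILLIS_PER_DAY = 24 * 60 * 60 * 1000  # as in Calcite's DateTimeUtils

def field_to_avatica_millis(millis, is_time):
    # TIME columns carry only a time-of-day, so Avatica expects an int
    # number of millis that already fits within one day; TIMESTAMP
    # columns keep the full instant.
    if is_time:
        assert 0 <= millis < MILLIS_PER_DAY, 'TIME value exceeds one day'
        return int(millis)
    return millis
```

The one-day bound matters because Avatica represents TIME as a 32-bit int, whereas epoch millis for a TIMESTAMP need a long.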

[beam] branch master updated (6e7369e -> 6ed1942)

2018-09-13 Thread apilloud
This is an automated email from the ASF dual-hosted git repository.

apilloud pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 6e7369e  [BEAM-4684] Add integration test for support of 
@RequiresStableInput (#6220)
 add 890956d  [BEAM-4704] Disable enumerable rules, use direct runner
 add 6ed1942  Merge pull request #6389 from apilloud/noenum

No new revisions were added by this update.

Summary of changes:
 .../apache/beam/sdk/extensions/sql/impl/JdbcDriver.java   |  5 +
 .../sql/impl/interpreter/BeamSqlFnExecutor.java   |  3 ++-
 .../extensions/sql/impl/rel/BeamEnumerableConverter.java  | 15 ++-
 .../beam/sdk/extensions/sql/impl/rel/BeamValuesRel.java   |  7 +++
 4 files changed, 28 insertions(+), 2 deletions(-)



[jira] [Commented] (BEAM-4495) Create website pre-commits for apache/beam repository

2018-09-13 Thread Alan Myrvold (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614157#comment-16614157
 ] 

Alan Myrvold commented on BEAM-4495:


https://github.com/apache/beam/pull/6282

> Create website pre-commits for apache/beam repository
> -
>
> Key: BEAM-4495
> URL: https://issues.apache.org/jira/browse/BEAM-4495
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing, website
>Reporter: Scott Wegner
>Assignee: Udi Meiri
>Priority: Major
>  Labels: beam-site-automation-reliability
> Fix For: Not applicable
>
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5383) Migrate integration tests for python bigquery io read

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5383?focusedWorklogId=144129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144129
 ]

ASF GitHub Bot logged work on BEAM-5383:


Author: ASF GitHub Bot
Created on: 13/Sep/18 23:22
Start Date: 13/Sep/18 23:22
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6393: [BEAM-5383] migrate 
bigquer_io_read_it_test to beam
URL: https://github.com/apache/beam/pull/6393#issuecomment-421182760
 
 
   Run Python PostCommit




Issue Time Tracking
---

Worklog Id: (was: 144129)
Time Spent: 0.5h  (was: 20m)

> Migrate integration tests for python  bigquery io read 
> ---
>
> Key: BEAM-5383
> URL: https://issues.apache.org/jira/browse/BEAM-5383
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5383) Migrate integration tests for python bigquery io read

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5383?focusedWorklogId=144122&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144122
 ]

ASF GitHub Bot logged work on BEAM-5383:


Author: ASF GitHub Bot
Created on: 13/Sep/18 23:19
Start Date: 13/Sep/18 23:19
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6393: [BEAM-5383] migrate 
bigquer_io_read_it_test to beam
URL: https://github.com/apache/beam/pull/6393#issuecomment-421182225
 
 
   Run Python PostCommit




Issue Time Tracking
---

Worklog Id: (was: 144122)
Time Spent: 20m  (was: 10m)

> Migrate integration tests for python  bigquery io read 
> ---
>
> Key: BEAM-5383
> URL: https://issues.apache.org/jira/browse/BEAM-5383
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4496) Create Jenkins job to push generated HTML to asf-site branch

2018-09-13 Thread Alan Myrvold (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614156#comment-16614156
 ] 

Alan Myrvold commented on BEAM-4496:


Need to set the label git-websites in the job to be able to push to beam 
asf-site branch?

Example script, 
[https://github.com/apache/hbase/blob/master/dev-support/jenkins-scripts/generate-hbase-website.sh]

which does a git push origin asf-site

> Create Jenkins job to push generated HTML to asf-site branch
> 
>
> Key: BEAM-4496
> URL: https://issues.apache.org/jira/browse/BEAM-4496
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5383) Migrate integration tests for python bigquery io read

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5383?focusedWorklogId=144121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144121
 ]

ASF GitHub Bot logged work on BEAM-5383:


Author: ASF GitHub Bot
Created on: 13/Sep/18 23:17
Start Date: 13/Sep/18 23:17
Worklog Time Spent: 10m 
  Work Description: yifanzou opened a new pull request #6393: [BEAM-5383] 
migrate bigquer_io_read_it_test to beam
URL: https://github.com/apache/beam/pull/6393
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 144121)
Time Spent: 10m
Remaining Estimate: 0h

> Migrate integration tests for python  bigquery io read 
> ---
>
> Key: BEAM-5383
> URL: https://issues.apache.org/jira/browse/BEAM-5383
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (BEAM-4499) Migrate Apache website publishing to use apache/beam asf-site branch

2018-09-13 Thread Alan Myrvold (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614155#comment-16614155
 ] 

Alan Myrvold commented on BEAM-4499:


Guessing this is the file to change to move from beam-site to beam repository?

https://github.com/apache/infrastructure-puppet/blob/deployment/modules/gitwcsub/files/config/gitwcsub.cfg

> Migrate Apache website publishing to use apache/beam asf-site branch
> 
>
> Key: BEAM-4499
> URL: https://issues.apache.org/jira/browse/BEAM-4499
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: beam-site-automation-reliability
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated: [BEAM-4684] Add integration test for support of @RequiresStableInput (#6220)

2018-09-13 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 6e7369e  [BEAM-4684] Add integration test for support of 
@RequiresStableInput (#6220)
6e7369e is described below

commit 6e7369ee4dbc3e94be7261a5f051526652d4
Author: Yueyang Qiu 
AuthorDate: Thu Sep 13 15:23:05 2018 -0700

[BEAM-4684] Add integration test for support of @RequiresStableInput (#6220)
---
 runners/google-cloud-dataflow-java/build.gradle|  21 +++
 sdks/java/core/build.gradle|   2 +
 .../beam/sdk/testing/FileChecksumMatcher.java  |   2 +-
 .../beam/sdk/testing/SerializableMatchers.java |   2 +-
 .../sdk/util/FilePatternMatchingShardedFile.java   | 140 ++
 .../apache/beam/sdk/util/NumberedShardedFile.java  |  16 ---
 .../org/apache/beam/sdk/RequiresStableInputIT.java | 160 +
 .../util/FilePatternMatchingShardedFileTest.java   | 143 ++
 8 files changed, 468 insertions(+), 18 deletions(-)

diff --git a/runners/google-cloud-dataflow-java/build.gradle 
b/runners/google-cloud-dataflow-java/build.gradle
index 9682e94..168ac97 100644
--- a/runners/google-cloud-dataflow-java/build.gradle
+++ b/runners/google-cloud-dataflow-java/build.gradle
@@ -48,6 +48,7 @@ test {
 
 configurations {
   validatesRunner
+  coreSDKJavaIntegrationTest
   examplesJavaIntegrationTest
   googleCloudPlatformIntegrationTest
 }
@@ -89,6 +90,8 @@ dependencies {
   validatesRunner project(path: project.path, configuration: "shadow")
   validatesRunner library.java.hamcrest_core
   validatesRunner library.java.hamcrest_library
+  coreSDKJavaIntegrationTest project(path: project.path, configuration: 
"shadow")
+  coreSDKJavaIntegrationTest project(path: ":beam-sdks-java-core", 
configuration: "shadowTest")
   examplesJavaIntegrationTest project(path: project.path, configuration: 
"shadow")
   examplesJavaIntegrationTest project(path: ":beam-examples-java", 
configuration: "shadowTest")
   googleCloudPlatformIntegrationTest project(path: project.path, 
configuration: "shadow")
@@ -175,11 +178,29 @@ task examplesJavaIntegrationTest(type: Test) {
   useJUnit { }
 }
 
+task coreSDKJavaIntegrationTest(type: Test) {
+  group = "Verification"
+  def dataflowProject = project.findProperty('dataflowProject') ?: 
'apache-beam-testing'
+  def dataflowTempRoot = project.findProperty('dataflowTempRoot') ?: 
'gs://temp-storage-for-end-to-end-tests'
+  systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+  "--runner=TestDataflowRunner",
+  "--project=${dataflowProject}",
+  "--tempRoot=${dataflowTempRoot}",
+  ])
+
+  include '**/*IT.class'
+  maxParallelForks 4
+  classpath = configurations.coreSDKJavaIntegrationTest
+  testClassesDirs = 
files(project(":beam-sdks-java-core").sourceSets.test.output.classesDirs)
+  useJUnit { }
+}
+
 task postCommit {
   group = "Verification"
   description = "Various integration tests using the Dataflow runner."
   dependsOn googleCloudPlatformIntegrationTest
   dependsOn examplesJavaIntegrationTest
+  dependsOn coreSDKJavaIntegrationTest
 }
 
 def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
diff --git a/sdks/java/core/build.gradle b/sdks/java/core/build.gradle
index b7c3659..3fb2d55 100644
--- a/sdks/java/core/build.gradle
+++ b/sdks/java/core/build.gradle
@@ -72,5 +72,7 @@ dependencies {
   shadowTest library.java.guava_testlib
   shadowTest library.java.slf4j_jdk14
   shadowTest library.java.mockito_core
+  shadowTest library.java.hamcrest_core
+  shadowTest library.java.hamcrest_library
   shadowTest "com.esotericsoftware.kryo:kryo:2.21"
 }
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/FileChecksumMatcher.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/FileChecksumMatcher.java
index e2755bd..0655c89 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/FileChecksumMatcher.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/FileChecksumMatcher.java
@@ -151,7 +151,7 @@ public class FileChecksumMatcher extends 
TypeSafeMatcher
 return actualChecksum;
   }
 
-  private String computeHash(@Nonnull List strs) {
+  private static String computeHash(@Nonnull List strs) {
 if (strs.isEmpty()) {
   return Hashing.sha1().hashString("", StandardCharsets.UTF_8).toString();
 }
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SerializableMatchers.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SerializableMatchers.java
index fd4aaa3..e99de1e 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SerializableMatchers.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SerializableMatchers.java
@@ -54,7 +54,7 @@ import org.hamcrest.Matchers;
  * iterable is 

[jira] [Work logged] (BEAM-4684) Support @RequiresStableInput on Dataflow runner in Java SDK

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4684?focusedWorklogId=144100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144100
 ]

ASF GitHub Bot logged work on BEAM-4684:


Author: ASF GitHub Bot
Created on: 13/Sep/18 22:23
Start Date: 13/Sep/18 22:23
Worklog Time Spent: 10m 
  Work Description: lukecwik closed pull request #6220: [BEAM-4684] Add 
integration test for support of @RequiresStableInput
URL: https://github.com/apache/beam/pull/6220
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/runners/google-cloud-dataflow-java/build.gradle b/runners/google-cloud-dataflow-java/build.gradle
index 9682e94604a..168ac9745fc 100644
--- a/runners/google-cloud-dataflow-java/build.gradle
+++ b/runners/google-cloud-dataflow-java/build.gradle
@@ -48,6 +48,7 @@ test {
 
 configurations {
   validatesRunner
+  coreSDKJavaIntegrationTest
   examplesJavaIntegrationTest
   googleCloudPlatformIntegrationTest
 }
@@ -89,6 +90,8 @@ dependencies {
   validatesRunner project(path: project.path, configuration: "shadow")
   validatesRunner library.java.hamcrest_core
   validatesRunner library.java.hamcrest_library
+  coreSDKJavaIntegrationTest project(path: project.path, configuration: "shadow")
+  coreSDKJavaIntegrationTest project(path: ":beam-sdks-java-core", configuration: "shadowTest")
  examplesJavaIntegrationTest project(path: project.path, configuration: "shadow")
  examplesJavaIntegrationTest project(path: ":beam-examples-java", configuration: "shadowTest")
  googleCloudPlatformIntegrationTest project(path: project.path, configuration: "shadow")
@@ -175,11 +178,29 @@ task examplesJavaIntegrationTest(type: Test) {
   useJUnit { }
 }
 
+task coreSDKJavaIntegrationTest(type: Test) {
+  group = "Verification"
+  def dataflowProject = project.findProperty('dataflowProject') ?: 'apache-beam-testing'
+  def dataflowTempRoot = project.findProperty('dataflowTempRoot') ?: 'gs://temp-storage-for-end-to-end-tests'
+  systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+  "--runner=TestDataflowRunner",
+  "--project=${dataflowProject}",
+  "--tempRoot=${dataflowTempRoot}",
+  ])
+
+  include '**/*IT.class'
+  maxParallelForks 4
+  classpath = configurations.coreSDKJavaIntegrationTest
+  testClassesDirs = files(project(":beam-sdks-java-core").sourceSets.test.output.classesDirs)
+  useJUnit { }
+}
+
 task postCommit {
   group = "Verification"
   description = "Various integration tests using the Dataflow runner."
   dependsOn googleCloudPlatformIntegrationTest
   dependsOn examplesJavaIntegrationTest
+  dependsOn coreSDKJavaIntegrationTest
 }
 
 def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
diff --git a/sdks/java/core/build.gradle b/sdks/java/core/build.gradle
index b7c3659c196..3fb2d55c802 100644
--- a/sdks/java/core/build.gradle
+++ b/sdks/java/core/build.gradle
@@ -72,5 +72,7 @@ dependencies {
   shadowTest library.java.guava_testlib
   shadowTest library.java.slf4j_jdk14
   shadowTest library.java.mockito_core
+  shadowTest library.java.hamcrest_core
+  shadowTest library.java.hamcrest_library
   shadowTest "com.esotericsoftware.kryo:kryo:2.21"
 }
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/FileChecksumMatcher.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/FileChecksumMatcher.java
index e2755bd3386..0655c892112 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/FileChecksumMatcher.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/FileChecksumMatcher.java
@@ -151,7 +151,7 @@ private String getActualChecksum() {
 return actualChecksum;
   }
 
-  private String computeHash(@Nonnull List<String> strs) {
+  private static String computeHash(@Nonnull List<String> strs) {
 if (strs.isEmpty()) {
   return Hashing.sha1().hashString("", StandardCharsets.UTF_8).toString();
 }
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SerializableMatchers.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SerializableMatchers.java
index fd4aaa349c5..e99de1e79c4 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SerializableMatchers.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SerializableMatchers.java
@@ -54,7 +54,7 @@
 * iterable is undefined, use a matcher like {@code kv(equalTo("some key"), containsInAnyOrder(1, 2,
  * 3))}.
  */
-class SerializableMatchers implements Serializable {
+public class SerializableMatchers implements Serializable {
 
   // Serializable only because of capture by anonymous inner classes
   private SerializableMatchers() {} 

[jira] [Work logged] (BEAM-4704) String operations yield incorrect results when executed through SQL shell

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4704?focusedWorklogId=144093&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144093
 ]

ASF GitHub Bot logged work on BEAM-4704:


Author: ASF GitHub Bot
Created on: 13/Sep/18 21:59
Start Date: 13/Sep/18 21:59
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6389: [BEAM-4704] Disable 
enumerable rules, use direct runner
URL: https://github.com/apache/beam/pull/6389#issuecomment-421166710
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144093)
Time Spent: 50m  (was: 40m)

> String operations yield incorrect results when executed through SQL shell
> -
>
> Key: BEAM-4704
> URL: https://issues.apache.org/jira/browse/BEAM-4704
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {{TRIM}} is defined to trim _all_ the characters in the first string from the 
> string-to-be-trimmed. Calcite has an incorrect implementation of this. We use 
> our own fixed implementation. But when executed through the SQL shell, the 
> results do not match what we get from the PTransform path. Here are three test 
> cases that pass on {{master}} but are incorrect in the shell:
> {code:sql}
> BeamSQL> select TRIM(LEADING 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__hehe |
> +------------+
> {code}
> {code:sql}
> BeamSQL> select TRIM(TRAILING 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__heh  |
> +------------+
> {code}
> {code:sql}
> BeamSQL> select TRIM(BOTH 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__heh  |
> +------------+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
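The trim-all-characters semantics described in BEAM-4704 above can be sketched in plain Java. This is an illustrative sketch only, not Beam's or Calcite's actual implementation; the class and method names are invented for the example. Each variant strips every character that appears in the trim set from the corresponding end of the input, rather than treating the trim argument as a literal substring.

```java
// Illustrative sketch of SQL TRIM(LEADING/TRAILING/BOTH chars FROM s)
// semantics: remove any character belonging to the trim set, not the
// literal substring.
public class TrimDemo {
  static String trimLeading(String trimChars, String s) {
    int i = 0;
    while (i < s.length() && trimChars.indexOf(s.charAt(i)) >= 0) {
      i++; // skip leading characters belonging to the trim set
    }
    return s.substring(i);
  }

  static String trimTrailing(String trimChars, String s) {
    int j = s.length();
    while (j > 0 && trimChars.indexOf(s.charAt(j - 1)) >= 0) {
      j--; // skip trailing characters belonging to the trim set
    }
    return s.substring(0, j);
  }

  static String trimBoth(String trimChars, String s) {
    return trimTrailing(trimChars, trimLeading(trimChars, s));
  }

  public static void main(String[] args) {
    // Expected results under the trim-all-characters semantics the issue
    // describes (the shell outputs quoted above are the incorrect ones):
    System.out.println(trimLeading("eh", "hehe__hehe"));  // __hehe
    System.out.println(trimTrailing("eh", "hehe__hehe")); // hehe__
    System.out.println(trimBoth("eh", "hehe__hehe"));     // __
  }
}
```

Under these semantics, all four leading characters `h`, `e`, `h`, `e` are removed by `TRIM(LEADING 'eh' ...)`, which is why the shell's unchanged `hehe__hehe` result is wrong.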


[jira] [Commented] (BEAM-3106) Consider not pinning all python dependencies, or moving them to requirements.txt

2018-09-13 Thread Robert Bradshaw (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614093#comment-16614093
 ] 

Robert Bradshaw commented on BEAM-3106:
---

Our main requirements now specify version ranges (generally guided by semantic 
versioning); we should unpin our gcp requirements wherever possible as well. 

> Consider not pinning all python dependencies, or moving them to 
> requirements.txt
> 
>
> Key: BEAM-3106
> URL: https://issues.apache.org/jira/browse/BEAM-3106
> Project: Beam
>  Issue Type: Wish
>  Components: build-system
>Affects Versions: 2.1.0
> Environment: python
>Reporter: Maximilian Roos
>Priority: Major
>
> Currently all python dependencies are [pinned or 
> capped|https://github.com/apache/beam/blob/master/sdks/python/setup.py#L97]
> While there's a good argument for supplying a `requirements.txt` with well 
> tested dependencies, having them specified in `setup.py` forces them to an 
> exact state on each install of Beam. This makes using Beam in any environment 
> with other libraries nigh on impossible. 
> This is particularly severe for the `gcp` dependencies, where we have 
> libraries that won't work with an older version (but Beam _does_ work with a 
> newer version). We have to do a bunch of gymnastics to get the correct 
> versions installed because of this. Unfortunately, airflow repeats this 
> practice and conflicts on a number of dependencies, adding further 
> complication (but, again there is no real conflict).
> I haven't seen this practice outside of the Apache & Google ecosystem - for 
> example no libraries in numerical python do this. Here's a [discussion on 
> SO|https://stackoverflow.com/questions/28509481/should-i-pin-my-python-dependencies-versions]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5384) [SQL] Calcite Doesn't optimize away LogicalProject

2018-09-13 Thread Anton Kedin (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Kedin updated BEAM-5384:
--
Description: 
*From 
[https://stackoverflow.com/questions/52313324/beam-sql-wont-work-when-using-aggregation-in-statement-cannot-plan-execution]:*

I have a basic Beam pipeline that reads from GCS, does a Beam SQL transform and 
writes the results to BigQuery.

When I don't do any aggregation in my SQL statement it works fine:
{code:java}
..
PCollection<Row> outputStream =
    sqlRows.apply(
        "sql_transform",
        SqlTransform.query("select views from PCOLLECTION"));
outputStream.setCoder(SCHEMA.getRowCoder());
..
{code}
However, when I try to aggregate with a sum, it fails with a 
CannotPlanException:
{code:java}
..
PCollection<Row> outputStream =
    sqlRows.apply(
        "sql_transform",
        SqlTransform.query("select wikimedia_project, sum(views) from PCOLLECTION group by wikimedia_project"));
outputStream.setCoder(SCHEMA.getRowCoder());
..
{code}
Stacktrace:
{code:java}
Step #1: 11:47:37,562 0[main] INFO  
org.apache.beam.runners.dataflow.DataflowRunner - PipelineOptions.filesToStage 
was not specified. Defaulting to files from the classpath: will stage 117 
files. Enable logging at DEBUG level to see which files will be staged.
Step #1: 11:47:39,845 2283 [main] INFO  
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner - SQL:
Step #1: SELECT `PCOLLECTION`.`wikimedia_project`, SUM(`PCOLLECTION`.`views`)
Step #1: FROM `beam`.`PCOLLECTION` AS `PCOLLECTION`
Step #1: GROUP BY `PCOLLECTION`.`wikimedia_project`
Step #1: 11:47:40,387 2825 [main] INFO  
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner - SQLPlan>
Step #1: LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])
Step #1:   BeamIOSourceRel(table=[[beam, PCOLLECTION]])
Step #1: 
Step #1: Exception in thread "main" 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.RelOptPlanner$CannotPlanException:
 Node [rel#7:Subset#1.BEAM_LOGICAL.[]] could not be implemented; planner state:
Step #1: 
Step #1: Root: rel#7:Subset#1.BEAM_LOGICAL.[]
Step #1: Original rel:
Step #1: LogicalAggregate(subset=[rel#7:Subset#1.BEAM_LOGICAL.[]], group=[{0}], 
EXPR$1=[SUM($1)]): rowcount = 10.0, cumulative cost = {11.375000476837158 rows, 
0.0 cpu, 0.0 io}, id = 5
Step #1:   BeamIOSourceRel(subset=[rel#4:Subset#0.BEAM_LOGICAL.[]], 
table=[[beam, PCOLLECTION]]): rowcount = 100.0, cumulative cost = {100.0 rows, 
101.0 cpu, 0.0 io}, id = 2
Step #1: 
Step #1: Sets:
Step #1: Set#0, type: RecordType(VARCHAR wikimedia_project, BIGINT views)
Step #1:rel#4:Subset#0.BEAM_LOGICAL.[], best=rel#2, importance=0.81
Step #1:rel#2:BeamIOSourceRel.BEAM_LOGICAL.[](table=[beam, 
PCOLLECTION]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
Step #1:rel#10:Subset#0.ENUMERABLE.[], best=rel#9, importance=0.405
Step #1:
rel#9:BeamEnumerableConverter.ENUMERABLE.[](input=rel#4:Subset#0.BEAM_LOGICAL.[]),
 rowcount=100.0, cumulative cost={1.7976931348623157E308 rows, 
1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
Step #1: Set#1, type: RecordType(VARCHAR wikimedia_project, BIGINT EXPR$1)
Step #1:rel#6:Subset#1.NONE.[], best=null, importance=0.9
Step #1:
rel#5:LogicalAggregate.NONE.[](input=rel#4:Subset#0.BEAM_LOGICAL.[],group={0},EXPR$1=SUM($1)),
 rowcount=10.0, cumulative cost={inf}
Step #1:rel#7:Subset#1.BEAM_LOGICAL.[], best=null, importance=1.0
Step #1:
rel#8:AbstractConverter.BEAM_LOGICAL.[](input=rel#6:Subset#1.NONE.[],convention=BEAM_LOGICAL,sort=[]),
 rowcount=10.0, cumulative cost={inf}
Step #1: 
Step #1: 
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:448)
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:298)
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:666)
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:368)
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:336)
Step #1:at 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner.convertToBeamRel(BeamQueryPlanner.java:138)
Step #1:at 
org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery(BeamSqlEnv.java:105)
Step #1:at 
org.apache.beam.sdk.extensions.sql.SqlTransform.expand(SqlTransform.java:96)
Step #1:at 

[jira] [Assigned] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins

2018-09-13 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka reassigned BEAM-5283:
--

Assignee: Ankur Goenka  (was: Jason Kuster)

> Enable Python Portable Flink PostCommit Tests to Jenkins
> 
>
> Key: BEAM-5283
> URL: https://issues.apache.org/jira/browse/BEAM-5283
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: CI
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5384) [SQL] Calcite Doesn't optimize away LogicalProject

2018-09-13 Thread Anton Kedin (JIRA)
Anton Kedin created BEAM-5384:
-

 Summary: [SQL] Calcite Doesn't optimize away LogicalProject
 Key: BEAM-5384
 URL: https://issues.apache.org/jira/browse/BEAM-5384
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql
Reporter: Anton Kedin


*From 
[https://stackoverflow.com/questions/52313324/beam-sql-wont-work-when-using-aggregation-in-statement-cannot-plan-execution]:*

I have a basic Beam pipeline that reads from GCS, does a Beam SQL transform and 
writes the results to BigQuery.

When I don't do any aggregation in my SQL statement it works fine:

{code}
..
PCollection<Row> outputStream =
    sqlRows.apply(
        "sql_transform",
        SqlTransform.query("select views from PCOLLECTION"));
outputStream.setCoder(SCHEMA.getRowCoder());
..
{code}

However, when I try to aggregate with a sum, it fails with a 
CannotPlanException:

{code}
..
PCollection<Row> outputStream =
    sqlRows.apply(
        "sql_transform",
        SqlTransform.query("select wikimedia_project, sum(views) from PCOLLECTION group by wikimedia_project"));
outputStream.setCoder(SCHEMA.getRowCoder());
..
{code}

Stacktrace:

{code}
Step #1: 11:47:37,562 0[main] INFO  
org.apache.beam.runners.dataflow.DataflowRunner - PipelineOptions.filesToStage 
was not specified. Defaulting to files from the classpath: will stage 117 
files. Enable logging at DEBUG level to see which files will be staged.
Step #1: 11:47:39,845 2283 [main] INFO  
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner - SQL:
Step #1: SELECT `PCOLLECTION`.`wikimedia_project`, SUM(`PCOLLECTION`.`views`)
Step #1: FROM `beam`.`PCOLLECTION` AS `PCOLLECTION`
Step #1: GROUP BY `PCOLLECTION`.`wikimedia_project`
Step #1: 11:47:40,387 2825 [main] INFO  
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner - SQLPlan>
Step #1: LogicalAggregate(group=[{0}], EXPR$1=[SUM($1)])
Step #1:   BeamIOSourceRel(table=[[beam, PCOLLECTION]])
Step #1: 
Step #1: Exception in thread "main" 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.RelOptPlanner$CannotPlanException:
 Node [rel#7:Subset#1.BEAM_LOGICAL.[]] could not be implemented; planner state:
Step #1: 
Step #1: Root: rel#7:Subset#1.BEAM_LOGICAL.[]
Step #1: Original rel:
Step #1: LogicalAggregate(subset=[rel#7:Subset#1.BEAM_LOGICAL.[]], group=[{0}], 
EXPR$1=[SUM($1)]): rowcount = 10.0, cumulative cost = {11.375000476837158 rows, 
0.0 cpu, 0.0 io}, id = 5
Step #1:   BeamIOSourceRel(subset=[rel#4:Subset#0.BEAM_LOGICAL.[]], 
table=[[beam, PCOLLECTION]]): rowcount = 100.0, cumulative cost = {100.0 rows, 
101.0 cpu, 0.0 io}, id = 2
Step #1: 
Step #1: Sets:
Step #1: Set#0, type: RecordType(VARCHAR wikimedia_project, BIGINT views)
Step #1:rel#4:Subset#0.BEAM_LOGICAL.[], best=rel#2, importance=0.81
Step #1:rel#2:BeamIOSourceRel.BEAM_LOGICAL.[](table=[beam, 
PCOLLECTION]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
Step #1:rel#10:Subset#0.ENUMERABLE.[], best=rel#9, importance=0.405
Step #1:
rel#9:BeamEnumerableConverter.ENUMERABLE.[](input=rel#4:Subset#0.BEAM_LOGICAL.[]),
 rowcount=100.0, cumulative cost={1.7976931348623157E308 rows, 
1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
Step #1: Set#1, type: RecordType(VARCHAR wikimedia_project, BIGINT EXPR$1)
Step #1:rel#6:Subset#1.NONE.[], best=null, importance=0.9
Step #1:
rel#5:LogicalAggregate.NONE.[](input=rel#4:Subset#0.BEAM_LOGICAL.[],group={0},EXPR$1=SUM($1)),
 rowcount=10.0, cumulative cost={inf}
Step #1:rel#7:Subset#1.BEAM_LOGICAL.[], best=null, importance=1.0
Step #1:
rel#8:AbstractConverter.BEAM_LOGICAL.[](input=rel#6:Subset#1.NONE.[],convention=BEAM_LOGICAL,sort=[]),
 rowcount=10.0, cumulative cost={inf}
Step #1: 
Step #1: 
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.RelSubset$CheapestPlanReplacer.visit(RelSubset.java:448)
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.RelSubset.buildCheapestPlan(RelSubset.java:298)
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:666)
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:368)
Step #1:at 
org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:336)
Step #1:at 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner.convertToBeamRel(BeamQueryPlanner.java:138)
Step #1:at 
org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery(BeamSqlEnv.java:105)

[jira] [Work logged] (BEAM-5375) KafkaIO reader should handle runtime exceptions from kafka client

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5375?focusedWorklogId=144083&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144083
 ]

ASF GitHub Bot logged work on BEAM-5375:


Author: ASF GitHub Bot
Created on: 13/Sep/18 21:19
Start Date: 13/Sep/18 21:19
Worklog Time Spent: 10m 
  Work Description: rangadi commented on issue #6391: [BEAM-5375] KafkaIO : 
Handle runtime exceptions while fetching from Kafka better. 
URL: https://github.com/apache/beam/pull/6391#issuecomment-421156672
 
 
   +R: @iemejia


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144083)
Time Spent: 50m  (was: 40m)

> KafkaIO reader should handle runtime exceptions from kafka client
> 
>
> Key: BEAM-5375
> URL: https://issues.apache.org/jira/browse/BEAM-5375
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.7.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> KafkaIO reader might stop reading from Kafka without any explicit error 
> message if KafkaConsumer throws a runtime exception while polling for 
> messages. One of the Dataflow customers encountered this issue (see [user@ 
> thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E]).
> 'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
> exceptions. It should handle them; stuff happens at runtime. 
> It should result in an 'IOException' from start()/advance(). The runners will 
> then properly handle reporting and closing the readers. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5337) [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] Build times out in beam-runners-flink target

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5337?focusedWorklogId=144082&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144082
 ]

ASF GitHub Bot logged work on BEAM-5337:


Author: ASF GitHub Bot
Created on: 13/Sep/18 21:17
Start Date: 13/Sep/18 21:17
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6385: 
[BEAM-5337] Fix flaky test UnboundedSourceWrapperTest#testValueEmission
URL: https://github.com/apache/beam/pull/6385#discussion_r217537490
 
 

 ##
 File path: 
runners/flink/src/test/java/org/apache/beam/runners/flink/streaming/UnboundedSourceWrapperTest.java
 ##
 @@ -187,10 +188,9 @@ public void collect(
 @Override
 public void close() {}
   });
-} catch (SuccessException e) {
+} finally {
   processingTimeUpdateThread.interrupt();
 
 Review comment:
   In the low-probability case where the thread is not sleeping at the time of 
the interrupt, the InterruptedException will not be thrown. 
   Instead of interrupting, shall we use an AtomicBoolean or similar to 
terminate the while loop?
   Alternatively, we can also check Thread.currentThread().isInterrupted().
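The atomic-boolean alternative suggested in the review comment above might look like the following minimal sketch. It is not the actual Beam test code; the class and variable names are invented for illustration. A plain flag avoids the race where `Thread.interrupt()` is delivered while the thread is not blocked and the `InterruptedException` is never raised.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: stop a background loop with an AtomicBoolean flag instead of
// relying on interrupt delivery.
public class StopFlagDemo {
  // Returns true if the worker thread exited after the flag was cleared.
  static boolean runAndStop() throws InterruptedException {
    AtomicBoolean running = new AtomicBoolean(true);
    Thread worker = new Thread(() -> {
      while (running.get()) {
        // emit processing-time updates here
      }
    });
    worker.start();
    running.set(false); // request shutdown; no reliance on interrupts
    worker.join(5000);  // the loop exits as soon as it observes the flag
    return !worker.isAlive();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("worker stopped: " + runAndStop());
  }
}
```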


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144082)
Time Spent: 1h  (was: 50m)

> [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] 
> Build times out in beam-runners-flink target
> -
>
> Key: BEAM-5337
> URL: https://issues.apache.org/jira/browse/BEAM-5337
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Maximilian Michels
>Priority: Critical
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Job times out. 
>  Failing job url:
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1414/consoleFull]
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1406/consoleFull]
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1408/consoleFull
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5375) KafkaIO reader should handle runtime exceptions from kafka client

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5375?focusedWorklogId=144075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144075
 ]

ASF GitHub Bot logged work on BEAM-5375:


Author: ASF GitHub Bot
Created on: 13/Sep/18 20:59
Start Date: 13/Sep/18 20:59
Worklog Time Spent: 10m 
  Work Description: rangadi commented on a change in pull request #6391: 
[BEAM-5375] KafkaIO : Handle runtime exceptions while fetching from Kafka 
better. 
URL: https://github.com/apache/beam/pull/6391#discussion_r217531315
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java
 ##
 @@ -570,28 +570,33 @@ Instant updateAndGetWatermark() {
   private void consumerPollLoop() {
 // Read in a loop and enqueue the batch of records, if any, to 
availableRecordsQueue.
 
-ConsumerRecords<byte[], byte[]> records = ConsumerRecords.empty();
-while (!closed.get()) {
 
 Review comment:
   Quick note: Nothing has changed inside the while loop. The whole block is 
placed under another try-catch clause. You can set 'Hide whitespace changes' in 
'Diff Settings' at the top of this diff in github. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144075)
Time Spent: 40m  (was: 0.5h)

> KafkaIO reader should handle runtime exceptions from kafka client
> 
>
> Key: BEAM-5375
> URL: https://issues.apache.org/jira/browse/BEAM-5375
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.7.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> KafkaIO reader might stop reading from Kafka without any explicit error 
> message if KafkaConsumer throws a runtime exception while polling for 
> messages. One of the Dataflow customers encountered this issue (see [user@ 
> thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E]).
> 'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
> exceptions. It should handle them; stuff happens at runtime. 
> It should result in an 'IOException' from start()/advance(). The runners will 
> then properly handle reporting and closing the readers. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5375) KafkaIO reader should handle runtime exceptions from kafka client

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5375?focusedWorklogId=144073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144073
 ]

ASF GitHub Bot logged work on BEAM-5375:


Author: ASF GitHub Bot
Created on: 13/Sep/18 20:56
Start Date: 13/Sep/18 20:56
Worklog Time Spent: 10m 
  Work Description: rangadi commented on a change in pull request #6391: 
[BEAM-5375] KafkaIO : Handle runtime exceptions while fetching from Kafka 
better. 
URL: https://github.com/apache/beam/pull/6391#discussion_r217531315
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java
 ##
 @@ -570,28 +570,33 @@ Instant updateAndGetWatermark() {
   private void consumerPollLoop() {
 // Read in a loop and enqueue the batch of records, if any, to 
availableRecordsQueue.
 
-ConsumerRecords<byte[], byte[]> records = ConsumerRecords.empty();
-while (!closed.get()) {
 
 Review comment:
   Quick note: Nothing has changed inside the while loop. The whole block is 
placed under another try-catch clause. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144073)
Time Spent: 0.5h  (was: 20m)

> KafkaIO reader should handle runtime exceptions from kafka client
> 
>
> Key: BEAM-5375
> URL: https://issues.apache.org/jira/browse/BEAM-5375
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.7.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> KafkaIO reader might stop reading from Kafka without any explicit error 
> message if KafkaConsumer throws a runtime exception while polling for 
> messages. One of the Dataflow customers encountered this issue (see [user@ 
> thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E]).
> 'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
> exceptions. It should handle them; stuff happens at runtime. 
> It should result in an 'IOException' from start()/advance(). The runners will 
> then properly handle reporting and closing the readers. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5375) KafkaIO reader should handle runtime exceptions from kafka client

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5375?focusedWorklogId=144072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144072
 ]

ASF GitHub Bot logged work on BEAM-5375:


Author: ASF GitHub Bot
Created on: 13/Sep/18 20:55
Start Date: 13/Sep/18 20:55
Worklog Time Spent: 10m 
  Work Description: rangadi commented on a change in pull request #6391: 
[BEAM-5375] KafkaIO : Handle runtime exceptions while fetching from Kafka 
better. 
URL: https://github.com/apache/beam/pull/6391#discussion_r217531315
 
 

 ##
 File path: 
sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaUnboundedReader.java
 ##
 @@ -570,28 +570,33 @@ Instant updateAndGetWatermark() {
   private void consumerPollLoop() {
 // Read in a loop and enqueue the batch of records, if any, to 
availableRecordsQueue.
 
-ConsumerRecords<byte[], byte[]> records = ConsumerRecords.empty();
-while (!closed.get()) {
 
 Review comment:
   Quick note: Nothing has changed inside the while loop. The whole block is 
placed under another try-catch clause. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144072)
Time Spent: 20m  (was: 10m)

> KafkaIO reader should handle runtime exceptions from kafka client
> 
>
> Key: BEAM-5375
> URL: https://issues.apache.org/jira/browse/BEAM-5375
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.7.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> KafkaIO reader might stop reading from Kafka without any explicit error 
> message if KafkaConsumer throws a runtime exception while polling for 
> messages. One of the Dataflow customers encountered this issue (see [user@ 
> thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E]).
> 'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
> exceptions. It should handle them; stuff happens at runtime. 
> It should result in an 'IOException' from start()/advance(). The runners will 
> then properly handle reporting and closing the readers. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5283) Enable Python Portable Flink PostCommit Tests to Jenkins

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5283?focusedWorklogId=144071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144071
 ]

ASF GitHub Bot logged work on BEAM-5283:


Author: ASF GitHub Bot
Created on: 13/Sep/18 20:55
Start Date: 13/Sep/18 20:55
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6340: [BEAM-5283] Fixing 
Flink Post commit jenkins task
URL: https://github.com/apache/beam/pull/6340#issuecomment-421150015
 
 
   Run Seed Job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144071)
Time Spent: 10.5h  (was: 10h 20m)

> Enable Python Portable Flink PostCommit Tests to Jenkins
> 
>
> Key: BEAM-5283
> URL: https://issues.apache.org/jira/browse/BEAM-5283
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Ankur Goenka
>Assignee: Jason Kuster
>Priority: Major
>  Labels: CI
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5375) KafkaIO reader should handle runtime exceptions kafka client

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5375?focusedWorklogId=144070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144070
 ]

ASF GitHub Bot logged work on BEAM-5375:


Author: ASF GitHub Bot
Created on: 13/Sep/18 20:53
Start Date: 13/Sep/18 20:53
Worklog Time Spent: 10m 
  Work Description: rangadi opened a new pull request #6391: [BEAM-5375] 
KafkaIO : Handle runtime exceptions while fetching from Kafka better. 
URL: https://github.com/apache/beam/pull/6391
 
 
Unrecoverable exceptions in Kafka fetch should result in IOException in the 
reader.
   
   KafkaConsumer.poll() could throw unrecoverable exceptions that KafkaIO didn't 
   handle well: the poll thread just silently died. Such failures should instead 
   result in an IOException thrown from the unbounded reader's start()/advance().
   
   Fix: the consumer poll thread saves the exception before exiting, and the 
   reader checks for it before trying to serve more records. Added a unit test 
   to verify.
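The fix summarized above is a standard pattern for propagating a failure out of a background fetch thread: save the exception, then rethrow it from the consumer's next call. A minimal Python sketch of the pattern (hypothetical class and method names; this is not the actual KafkaIO Java code):

```python
import threading

class PollingReader:
    """Toy reader: a background thread polls; the consumer rethrows saved errors."""

    def __init__(self, poll_fn):
        self._records = []
        self._error = None                 # exception saved by the poll thread
        self._lock = threading.Lock()
        self._thread = threading.Thread(
            target=self._poll_loop, args=(poll_fn,), daemon=True)
        self._thread.start()

    def _poll_loop(self, poll_fn):
        try:
            for record in poll_fn():
                with self._lock:
                    self._records.append(record)
        except Exception as exc:           # save instead of dying silently
            with self._lock:
                self._error = exc

    def advance(self):
        """Return the next record; rethrow any saved poll-thread error as IOError."""
        self._thread.join()                # toy simplification: wait for the poll
        with self._lock:
            if self._error is not None:
                raise IOError("poll thread failed") from self._error
            return self._records.pop(0) if self._records else None
```

Without the saved-error check, the caller of `advance()` would simply see `None` forever once the thread died, which is exactly the silent stall the issue describes.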
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144070)
Time Spent: 10m
Remaining Estimate: 0h

> KafkaIO reader should handle runtime exceptions kafka client
> 
>
> Key: BEAM-5375
> URL: https://issues.apache.org/jira/browse/BEAM-5375
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.7.0
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> KafkaIO reader might stop reading from Kafka without any explicit error 
> message if KafkaConsumer throws a runtime exception while polling for 
> messages. One of the Dataflow customers encountered this issue (see [user@ 
> thread|https://lists.apache.org/thread.html/c0cf8f45f567a0623592e2d8340f5288e3e774b59bca985aec410a81@%3Cuser.beam.apache.org%3E])
> 'consumerPollThread()' in KafkaIO deliberately avoided catching runtime 
> exceptions. It should handle them; failures happen at runtime. 
> It should result in 'IOException' from start()/advance(). The runners will 
> then properly report the error and close the readers. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-3106) Consider not pinning all python dependencies, or moving them to requirements.txt

2018-09-13 Thread Scott Jungwirth (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614047#comment-16614047
 ] 

Scott Jungwirth edited comment on BEAM-3106 at 9/13/18 8:50 PM:


I just ran into this issue using Google's Cloud Composer (managed Airflow) 
after adding the 2.6.0 (current latest) Beam SDK PyPI package 
(apache-beam[gcp]>=2.6.0). Looking at the build log, it looks like 
apache-beam[gcp] caused a downgrade of some other google-cloud packages:

 
{code:java}
...
 Installing collected packages: pydot, fastavro, pytz, google-cloud-core, 
google-cloud-bigquery, apache-beam, pysftp, google-cloud-firestore, msgpack, 
cachecontrol, firebase-admin, webob, bugsnag
 Found existing installation: pytz 2018.5
 Uninstalling pytz-2018.5:
 Successfully uninstalled pytz-2018.5
 Found existing installation: google-cloud-core 0.28.1
 Uninstalling google-cloud-core-0.28.1:
 Successfully uninstalled google-cloud-core-0.28.1
 Found existing installation: google-cloud-bigquery 1.5.0
 Uninstalling google-cloud-bigquery-1.5.0:
 Successfully uninstalled google-cloud-bigquery-1.5.0
 Found existing installation: apache-beam 2.5.0
 Uninstalling apache-beam-2.5.0:
 Successfully uninstalled apache-beam-2.5.0
 Successfully installed apache-beam-2.6.0 bugsnag-3.4.3 cachecontrol-0.12.5 
fastavro-0.19.7 firebase-admin-2.13.0 google-cloud-bigquery-0.25.0 
google-cloud-core-0.25.0 google-cloud-firestore-0.29.0 msgpack-0.5.6 
pydot-1.2.4 pysftp-0.2.9 pytz-2018.4 webob-1.8.2{code}
I tracked this down to the pinned requirement for bigquery: 
{{google-cloud-bigquery==0.25.0}}  
[https://github.com/apache/beam/blob/v2.6.0/sdks/python/setup.py#L140]

 

Which led to these pip warnings

 
{code:java}
$ pipdeptree --warn
Warning!!! Possibly conflicting dependencies found:
* google-cloud-bigquery==0.25.0
- google-cloud-core [required: <0.26dev,>=0.25.0, installed: 0.28.1]
* google-cloud-pubsub==0.26.0
- google-cloud-core [required: <0.26dev,>=0.25.0, installed: 0.28.1]
* google-cloud-dataflow==2.5.0
- apache-beam [required: ==2.5.0, installed: 2.6.0]
* pandas-gbq==0.6.0
- google-cloud-bigquery [required: >=0.32.0, installed: 0.25.0]{code}
 

 And the exception I was getting was from another google cloud storage module

 
{code:java}
File "/usr/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", 
line 535, in download_to_file
 ...
 File 
"/usr/local/lib/python2.7/site-packages/google/resumable_media/_helpers.py", 
line 146, in wait_and_retry 
 response = func() 
 File "/usr/local/lib/python2.7/site-packages/google_auth_httplib2.py", line 
198, in request 
 uri, method, body=body, headers=request_headers, **kwargs) 
 TypeError: request() got an unexpected keyword argument 'data'{code}
 I was able to work around this issue by explicitly installing the desired 
versions of the google-cloud-core>=0.28.0 and google-cloud-bigquery>=1.5.0 
modules after the apache-beam[gcp]>=2.6.0 module.

 

 


was (Author: sjungwirth):
I just ran into this issue using Google's Cloud Composer (managed Airflow) 
after adding the 2.6.0 (current latest) Beam SDK PyPI package 
(apache-beam[gcp]>=2.6.0). Looking at the build log, it looks like 
apache-beam[gcp] caused a downgrade of some other google-cloud packages:

 
{code:java}
...
 Installing collected packages: pydot, fastavro, pytz, google-cloud-core, 
google-cloud-bigquery, apache-beam, pysftp, google-cloud-firestore, msgpack, 
cachecontrol, firebase-admin, webob, bugsnag
 Found existing installation: pytz 2018.5
 Uninstalling pytz-2018.5:
 Successfully uninstalled pytz-2018.5
 Found existing installation: google-cloud-core 0.28.1
 Uninstalling google-cloud-core-0.28.1:
 Successfully uninstalled google-cloud-core-0.28.1
 Found existing installation: google-cloud-bigquery 1.5.0
 Uninstalling google-cloud-bigquery-1.5.0:
 Successfully uninstalled google-cloud-bigquery-1.5.0
 Found existing installation: apache-beam 2.5.0
 Uninstalling apache-beam-2.5.0:
 Successfully uninstalled apache-beam-2.5.0
 Successfully installed apache-beam-2.6.0 bugsnag-3.4.3 cachecontrol-0.12.5 
fastavro-0.19.7 firebase-admin-2.13.0 google-cloud-bigquery-0.25.0 
google-cloud-core-0.25.0 google-cloud-firestore-0.29.0 msgpack-0.5.6 
pydot-1.2.4 pysftp-0.2.9 pytz-2018.4 webob-1.8.2{code}

 I tracked this down to the pinned requirement for bigquery: 
{{google-cloud-bigquery==0.25.0}}  
[https://github.com/apache/beam/blob/v2.6.0/sdks/python/setup.py#L140]

 

Which led to these pip warnings

 
{code:java}
$ pipdeptree --warn
 Warning!!! Possibly conflicting dependencies found:

google-cloud-storage==1.10.0   google-cloud-core [required: <0.29dev,>=0.28.0, 
installed: 0.25.0]   google-cloud-firestore==0.29.0   google-cloud-core 
[required: <0.29dev,>=0.28.0, installed: 0.25.0]   pandas-gbq==0.6.0   
google-cloud-bigquery [required: >=0.32.0, installed: 0.25.0]   
google-cloud-dataflow==2.5.0   apache-beam [required: 

[jira] [Comment Edited] (BEAM-3106) Consider not pinning all python dependencies, or moving them to requirements.txt

2018-09-13 Thread Scott Jungwirth (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614047#comment-16614047
 ] 

Scott Jungwirth edited comment on BEAM-3106 at 9/13/18 8:49 PM:


I just ran into this issue using Google's Cloud Composer (managed Airflow) 
after adding the 2.6.0 (current latest) Beam SDK PyPI package 
(apache-beam[gcp]>=2.6.0). Looking at the build log, it looks like 
apache-beam[gcp] caused a downgrade of some other google-cloud packages:

 
{code:java}
...
 Installing collected packages: pydot, fastavro, pytz, google-cloud-core, 
google-cloud-bigquery, apache-beam, pysftp, google-cloud-firestore, msgpack, 
cachecontrol, firebase-admin, webob, bugsnag
 Found existing installation: pytz 2018.5
 Uninstalling pytz-2018.5:
 Successfully uninstalled pytz-2018.5
 Found existing installation: google-cloud-core 0.28.1
 Uninstalling google-cloud-core-0.28.1:
 Successfully uninstalled google-cloud-core-0.28.1
 Found existing installation: google-cloud-bigquery 1.5.0
 Uninstalling google-cloud-bigquery-1.5.0:
 Successfully uninstalled google-cloud-bigquery-1.5.0
 Found existing installation: apache-beam 2.5.0
 Uninstalling apache-beam-2.5.0:
 Successfully uninstalled apache-beam-2.5.0
 Successfully installed apache-beam-2.6.0 bugsnag-3.4.3 cachecontrol-0.12.5 
fastavro-0.19.7 firebase-admin-2.13.0 google-cloud-bigquery-0.25.0 
google-cloud-core-0.25.0 google-cloud-firestore-0.29.0 msgpack-0.5.6 
pydot-1.2.4 pysftp-0.2.9 pytz-2018.4 webob-1.8.2{code}

 I tracked this down to the pinned requirement for bigquery: 
{{google-cloud-bigquery==0.25.0}}  
[https://github.com/apache/beam/blob/v2.6.0/sdks/python/setup.py#L140]

 

Which led to these pip warnings

 
{code:java}
$ pipdeptree --warn
 Warning!!! Possibly conflicting dependencies found:

google-cloud-storage==1.10.0   google-cloud-core [required: <0.29dev,>=0.28.0, 
installed: 0.25.0]   google-cloud-firestore==0.29.0   google-cloud-core 
[required: <0.29dev,>=0.28.0, installed: 0.25.0]   pandas-gbq==0.6.0   
google-cloud-bigquery [required: >=0.32.0, installed: 0.25.0]   
google-cloud-dataflow==2.5.0   apache-beam [required: ==2.5.0, installed: 
2.6.0]   google-cloud-logging==1.6.0   google-cloud-core [required: 
<0.29dev,>=0.28.0, installed: 0.25.0] {code}
 

 

 And the exception I was getting was from another google cloud storage module

 
{code:java}
File "/usr/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", 
line 535, in download_to_file
 ...
 File 
"/usr/local/lib/python2.7/site-packages/google/resumable_media/_helpers.py", 
line 146, in wait_and_retry 
 response = func() 
 File "/usr/local/lib/python2.7/site-packages/google_auth_httplib2.py", line 
198, in request 
 uri, method, body=body, headers=request_headers, **kwargs) 
 TypeError: request() got an unexpected keyword argument 'data'{code}

  I was able to work around this issue by explicitly installing the desired 
versions of the google-cloud-core>=0.28.0 and google-cloud-bigquery>=1.5.0 
modules after the apache-beam[gcp]>=2.6.0 module.

 

 


was (Author: sjungwirth):
I just ran into this issue using Google's Cloud Composer (managed Airflow) 
after adding the 2.6.0 (current latest) Beam SDK PyPI package 
(apache-beam[gcp]>=2.6.0). Looking at the build log, it looks like 
apache-beam[gcp] caused a downgrade of some other google-cloud packages:
...
Installing collected packages: pydot, fastavro, pytz, google-cloud-core, 
google-cloud-bigquery, apache-beam, pysftp, google-cloud-firestore, msgpack, 
cachecontrol, firebase-admin, webob, bugsnag
Found existing installation: pytz 2018.5
Uninstalling pytz-2018.5:
Successfully uninstalled pytz-2018.5
Found existing installation: google-cloud-core 0.28.1
Uninstalling google-cloud-core-0.28.1:
Successfully uninstalled google-cloud-core-0.28.1
Found existing installation: google-cloud-bigquery 1.5.0
Uninstalling google-cloud-bigquery-1.5.0:
Successfully uninstalled google-cloud-bigquery-1.5.0
Found existing installation: apache-beam 2.5.0
Uninstalling apache-beam-2.5.0:
Successfully uninstalled apache-beam-2.5.0
Successfully installed apache-beam-2.6.0 bugsnag-3.4.3 cachecontrol-0.12.5 
fastavro-0.19.7 firebase-admin-2.13.0 google-cloud-bigquery-0.25.0 
google-cloud-core-0.25.0 google-cloud-firestore-0.29.0 msgpack-0.5.6 
pydot-1.2.4 pysftp-0.2.9 pytz-2018.4 webob-1.8.2
I tracked this down to the pinned requirement for bigquery: 
{{google-cloud-bigquery==0.25.0}}  
[https://github.com/apache/beam/blob/v2.6.0/sdks/python/setup.py#L140]

Which led to these pip warnings
$ pipdeptree --warn
Warning!!! Possibly conflicting dependencies found:
* google-cloud-storage==1.10.0
- google-cloud-core [required: <0.29dev,>=0.28.0, installed: 0.25.0]
* google-cloud-firestore==0.29.0
- google-cloud-core [required: <0.29dev,>=0.28.0, installed: 0.25.0]
* pandas-gbq==0.6.0
- google-cloud-bigquery [required: >=0.32.0, installed: 0.25.0]
* 

[jira] [Commented] (BEAM-3106) Consider not pinning all python dependencies, or moving them to requirements.txt

2018-09-13 Thread Scott Jungwirth (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614047#comment-16614047
 ] 

Scott Jungwirth commented on BEAM-3106:
---

I just ran into this issue using Google's Cloud Composer (managed Airflow) 
after adding the 2.6.0 (current latest) Beam SDK PyPI package 
(apache-beam[gcp]>=2.6.0). Looking at the build log, it looks like 
apache-beam[gcp] caused a downgrade of some other google-cloud packages:
...
Installing collected packages: pydot, fastavro, pytz, google-cloud-core, 
google-cloud-bigquery, apache-beam, pysftp, google-cloud-firestore, msgpack, 
cachecontrol, firebase-admin, webob, bugsnag
Found existing installation: pytz 2018.5
Uninstalling pytz-2018.5:
Successfully uninstalled pytz-2018.5
Found existing installation: google-cloud-core 0.28.1
Uninstalling google-cloud-core-0.28.1:
Successfully uninstalled google-cloud-core-0.28.1
Found existing installation: google-cloud-bigquery 1.5.0
Uninstalling google-cloud-bigquery-1.5.0:
Successfully uninstalled google-cloud-bigquery-1.5.0
Found existing installation: apache-beam 2.5.0
Uninstalling apache-beam-2.5.0:
Successfully uninstalled apache-beam-2.5.0
Successfully installed apache-beam-2.6.0 bugsnag-3.4.3 cachecontrol-0.12.5 
fastavro-0.19.7 firebase-admin-2.13.0 google-cloud-bigquery-0.25.0 
google-cloud-core-0.25.0 google-cloud-firestore-0.29.0 msgpack-0.5.6 
pydot-1.2.4 pysftp-0.2.9 pytz-2018.4 webob-1.8.2
I tracked this down to the pinned requirement for bigquery: 
{{google-cloud-bigquery==0.25.0}}  
[https://github.com/apache/beam/blob/v2.6.0/sdks/python/setup.py#L140]

Which led to these pip warnings
$ pipdeptree --warn
Warning!!! Possibly conflicting dependencies found:
* google-cloud-storage==1.10.0
- google-cloud-core [required: <0.29dev,>=0.28.0, installed: 0.25.0]
* google-cloud-firestore==0.29.0
- google-cloud-core [required: <0.29dev,>=0.28.0, installed: 0.25.0]
* pandas-gbq==0.6.0
- google-cloud-bigquery [required: >=0.32.0, installed: 0.25.0]
* google-cloud-dataflow==2.5.0
- apache-beam [required: ==2.5.0, installed: 2.6.0]
* google-cloud-logging==1.6.0
- google-cloud-core [required: <0.29dev,>=0.28.0, installed: 0.25.0]
 And the exception I was getting was from another google cloud storage module
File "/usr/local/lib/python2.7/site-packages/google/cloud/storage/blob.py", 
line 535, in download_to_file
  ...
File 
"/usr/local/lib/python2.7/site-packages/google/resumable_media/_helpers.py", 
line 146, in wait_and_retry 
  response = func() 
File "/usr/local/lib/python2.7/site-packages/google_auth_httplib2.py", line 
198, in request 
  uri, method, body=body, headers=request_headers, **kwargs) 
TypeError: request() got an unexpected keyword argument 'data'
 

 I was able to work around this issue by explicitly installing the desired 
versions of the google-cloud-core>=0.28.0 and google-cloud-bigquery>=1.5.0 
modules after the apache-beam[gcp]>=2.6.0 module.
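The pipdeptree warnings above reduce to one check: is each installed version inside every dependent package's version specifier? A hedged sketch of that check with `pkg_resources`, with the versions from this report hard-coded purely for illustration:

```python
import pkg_resources

def conflicts(requirement_str, installed_version):
    """Return True if installed_version violates the requirement's specifier."""
    req = pkg_resources.Requirement.parse(requirement_str)
    return installed_version not in req

# Beam 2.6.0's pinned requirement vs. what pandas-gbq needs (from the report):
beam_pin = "google-cloud-bigquery==0.25.0"
pandas_gbq_need = "google-cloud-bigquery>=0.32.0"

# With 0.25.0 installed (Beam's pin wins), pandas-gbq's requirement breaks:
print(conflicts(beam_pin, "0.25.0"))         # Beam's own pin is satisfied
print(conflicts(pandas_gbq_need, "0.25.0"))  # pandas-gbq's requirement is not
```

This is the structural reason a hard `==` pin in `setup.py` cannot coexist with any other package that needs a newer version, whereas a range (or moving pins to `requirements.txt`) leaves the resolver room to satisfy both.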

 

 

> Consider not pinning all python dependencies, or moving them to 
> requirements.txt
> 
>
> Key: BEAM-3106
> URL: https://issues.apache.org/jira/browse/BEAM-3106
> Project: Beam
>  Issue Type: Wish
>  Components: build-system
>Affects Versions: 2.1.0
> Environment: python
>Reporter: Maximilian Roos
>Priority: Major
>
> Currently all python dependencies are [pinned or 
> capped|https://github.com/apache/beam/blob/master/sdks/python/setup.py#L97]
> While there's a good argument for supplying a `requirements.txt` with well 
> tested dependencies, having them specified in `setup.py` forces them to an 
> exact state on each install of Beam. This makes using Beam in any environment 
> with other libraries nigh on impossible. 
> This is particularly severe for the `gcp` dependencies, where we have 
> libraries that won't work with an older version (but Beam _does_ work with an 
> newer version). We have to do a bunch of gymnastics to get the correct 
> versions installed because of this. Unfortunately, airflow repeats this 
> practice and conflicts on a number of dependencies, adding further 
> complication (but, again there is no real conflict).
> I haven't seen this practice outside of the Apache & Google ecosystem - for 
> example no libraries in numerical python do this. Here's a [discussion on 
> SO|https://stackoverflow.com/questions/28509481/should-i-pin-my-python-dependencies-versions]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5383) Migrate integration tests for python bigquery io read

2018-09-13 Thread yifan zou (JIRA)
yifan zou created BEAM-5383:
---

 Summary: Migrate integration tests for python bigquery io read 
 Key: BEAM-5383
 URL: https://issues.apache.org/jira/browse/BEAM-5383
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: yifan zou
Assignee: yifan zou






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4704) String operations yield incorrect results when executed through SQL shell

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4704?focusedWorklogId=144067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144067
 ]

ASF GitHub Bot logged work on BEAM-4704:


Author: ASF GitHub Bot
Created on: 13/Sep/18 20:19
Start Date: 13/Sep/18 20:19
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6389: [BEAM-4704] Disable 
enumerable rules, use direct runner
URL: https://github.com/apache/beam/pull/6389#issuecomment-421139460
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144067)
Time Spent: 40m  (was: 0.5h)

> String operations yield incorrect results when executed through SQL shell
> -
>
> Key: BEAM-4704
> URL: https://issues.apache.org/jira/browse/BEAM-4704
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {{TRIM}} is defined to trim _all_ the characters in the first string from the 
> string-to-be-trimmed. Calcite has an incorrect implementation of this. We use 
> our own fixed implementation. But when executed through the SQL shell, the 
> results do not match what we get from the PTransform path. Here two test 
> cases that pass on {{master}} but are incorrect in the shell:
> {code:sql}
> BeamSQL> select TRIM(LEADING 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__hehe |
> +------------+
> {code}
> {code:sql}
> BeamSQL> select TRIM(TRAILING 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__heh  |
> +------------+
> {code}
> {code:sql}
> BeamSQL> select TRIM(BOTH 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__heh  |
> +------------+
> {code}
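For reference, SQL's TRIM treats the first operand as a *set* of characters to strip, which matches the semantics of Python's `str.strip` family exactly. The correct results for the three queries above can be sketched as:

```python
s = "hehe__hehe"

# TRIM(LEADING 'eh' FROM s): strip every leading char in the set {'e', 'h'}
print(s.lstrip("eh"))   # __hehe

# TRIM(TRAILING 'eh' FROM s): strip every trailing char in the set
print(s.rstrip("eh"))   # hehe__

# TRIM(BOTH 'eh' FROM s): strip the set from both ends
print(s.strip("eh"))    # __
```

Comparing these against the shell output quoted above makes the bug visible: the shell returns results that strip at most one character (or nothing) instead of the whole set.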



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4704) String operations yield incorrect results when executed through SQL shell

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4704?focusedWorklogId=144066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144066
 ]

ASF GitHub Bot logged work on BEAM-4704:


Author: ASF GitHub Bot
Created on: 13/Sep/18 20:18
Start Date: 13/Sep/18 20:18
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6389: [BEAM-4704] Disable 
enumerable rules, use direct runner
URL: https://github.com/apache/beam/pull/6389#issuecomment-421139420
 
 
   Tests stuck in flink.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144066)
Time Spent: 0.5h  (was: 20m)

> String operations yield incorrect results when executed through SQL shell
> -
>
> Key: BEAM-4704
> URL: https://issues.apache.org/jira/browse/BEAM-4704
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{TRIM}} is defined to trim _all_ the characters in the first string from the 
> string-to-be-trimmed. Calcite has an incorrect implementation of this. We use 
> our own fixed implementation. But when executed through the SQL shell, the 
> results do not match what we get from the PTransform path. Here two test 
> cases that pass on {{master}} but are incorrect in the shell:
> {code:sql}
> BeamSQL> select TRIM(LEADING 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__hehe |
> +------------+
> {code}
> {code:sql}
> BeamSQL> select TRIM(TRAILING 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__heh  |
> +------------+
> {code}
> {code:sql}
> BeamSQL> select TRIM(BOTH 'eh' FROM 'hehe__hehe');
> +------------+
> |   EXPR$0   |
> +------------+
> | hehe__heh  |
> +------------+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5382) Combiner panics at runtime if MergeAccumulators has a context parameter

2018-09-13 Thread Cody Schroeder (JIRA)
Cody Schroeder created BEAM-5382:


 Summary: Combiner panics at runtime if MergeAccumulators has a 
context parameter
 Key: BEAM-5382
 URL: https://issues.apache.org/jira/browse/BEAM-5382
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Cody Schroeder
Assignee: Robert Burke


[combine.go#L62|https://github.com/apache/beam/blob/14ef23c/sdks/go/pkg/beam/core/runtime/exec/combine.go#L62]
 assumes that a combiner's {{MergeAccumulators}} function must be 2x1 but 
{{TryCombinePerKey}} accepts combiner functions with context parameters.  I 
believe accepting context parameters is the correct behavior overall.
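The arity mismatch described here — exec code assuming a strictly 2x1 merge function while the API also accepts a leading context parameter — can be illustrated in Python with a small dispatcher that inspects the function's signature (illustrative only; the actual fix belongs in the Go SDK's combine.go):

```python
import inspect

def call_merge(merge_fn, a, b, ctx=None):
    """Invoke a merge function, tolerating an optional leading context arg."""
    n = len(inspect.signature(merge_fn).parameters)
    if n == 3:                       # (ctx, a, b): context-aware variant
        return merge_fn(ctx, a, b)
    if n == 2:                       # (a, b): plain 2x1 variant
        return merge_fn(a, b)
    raise TypeError("merge function must take 2 or 3 parameters, got %d" % n)

# Both variants merge accumulators the same way:
print(call_merge(lambda x, y: x + y, 1, 2))            # 3
print(call_merge(lambda ctx, x, y: x + y, 1, 2, {}))   # 3
```

The panic in the report corresponds to the hard-coded 2x1 branch being the only one that exists: a context-aware function reaches it and the invocation blows up at runtime instead of being dispatched correctly.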



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs

2018-09-13 Thread Cody Schroeder (JIRA)
Cody Schroeder created BEAM-5381:


 Summary: Dataflow runner creates duplicate CoGBK step IDs
 Key: BEAM-5381
 URL: https://issues.apache.org/jira/browse/BEAM-5381
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Cody Schroeder
Assignee: Robert Burke


https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297

If the attached {{beam_dataflow_err.go}} pipeline is executed with the 
{{dataflow}} runner, GCP reports the following error:

{code}
Step with name e5 already exists. Duplicates are not allowed.
{code}

Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed 
duplicated.  If the CoGBK in the pipeline is not scoped, the duplication is 
fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5378) Ensure all Go SDK examples run successfully

2018-09-13 Thread Tomas Roos (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613956#comment-16613956
 ] 

Tomas Roos commented on BEAM-5378:
--

Thanks. I read the RFC; it's a good walkthrough. Implementing a way to run the 
job with the universal local runner sounds like a great improvement for 
developers indeed. And making sure the examples run as expected :-)

> Ensure all Go SDK examples run successfully
> ---
>
> Key: BEAM-5378
> URL: https://issues.apache.org/jira/browse/BEAM-5378
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: Not applicable
>Reporter: Tomas Roos
>Priority: Major
>
> I've been spending a day or so running through the examples available for the 
> Go SDK in order to see what works on which runner (direct, dataflow) and what 
> doesn't, and here are the results.
> All available examples for the Go SDK are listed below. For me, as a new 
> developer on Apache Beam and Dataflow, it would be of tremendous value to have 
> all examples running, because many of them have legitimate use-cases behind 
> them. 
> {code:java}
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> ├── contains
> │   └── contains.go
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> │   ├── filter
> │   │   └── filter.go
> │   ├── join
> │   │   └── join.go
> │   ├── max
> │   │   └── max.go
> │   └── tornadoes
> │   └── tornadoes.go
> ├── debugging_wordcount
> │   └── debugging_wordcount.go
> ├── forest
> │   └── forest.go
> ├── grades
> │   └── grades.go
> ├── minimal_wordcount
> │   └── minimal_wordcount.go
> ├── multiout
> │   └── multiout.go
> ├── pingpong
> │   └── pingpong.go
> ├── streaming_wordcap
> │   └── wordcap.go
> ├── windowed_wordcount
> │   └── windowed_wordcount.go
> ├── wordcap
> │   └── wordcap.go
> ├── wordcount
> │   └── wordcount.go
> └── yatzy
> └── yatzy.go
> {code}
> All examples that are supposed to be runnable by the direct driver (not 
> depending on gcp platform services) are runnable.
> On the other hand, these are the examples that need to be updated because they 
> are not runnable on the Dataflow platform for various reasons.
> I tried to figure them out, and all I can do is pinpoint at least where each 
> one fails, since my knowledge of the Beam/Dataflow internals is so far limited.
> .
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> Runs successfully if swapping the input to one of the Shakespeare data files 
> from gs://
> But when running this it yields an error from the top.Largest func (discussed 
> in another issue: top.Largest needs to have a serializable combinator / 
> accumulator)
> ➜  autocomplete git:(master) ✗ ./autocomplete --project fair-app-213019 
> --runner dataflow --staging_location=gs://fair-app-213019/staging-test2 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:35:26 Running autocomplete
> Unable to encode combiner for lifting: failed to encode custom coder: bad 
> underlying type: bad field type: bad element: unencodable type: interface 
> {}
> 2018/09/11 15:35:26 Using running binary as worker binary: './autocomplete'
> 2018/09/11 15:35:26 Staging worker binary: ./autocomplete
> ├── contains
> │   └── contains.go
> Fails when running debug.Head for some mysterious reason; it might have to do 
> with the params passed into the x,y iterator. Frankly, I don't know and could 
> not figure it out.
> But after removing the debug.Head call, everything works as expected and succeeds.
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> Fails because extractFn, which is a struct, is not registered through 
> beam.RegisterType (is this a must or not?)
> Registering it works as a workaround, at least.
> ➜  combine git:(master) ✗ ./combine 
> --output=fair-app-213019:combineoutput.test --project=fair-app-213019 
> --runner=dataflow --staging_location=gs://203019-staging/ 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:40:50 Running combine
> panic: Failed to serialize 3: ParDo [In(Main): main.WordRow <- {2: 
> main.WordRow/main.WordRow[json] GLO}] -> [Out: KV -> {3: 
> KV/KV GLO}]: encode: bad userfn: recv type must 
> be registered: *main.extractFn
> │   ├── filter
> │   │   └── filter.go
> Fails go-job-1-1536673624017210012
> 2018-09-11 (15:47:13) Output i0 for step was not found. 
> │   ├── join
> │   │   └── join.go
> Working as expected! Whey!
> │   ├── max
> │   │   └── max.go
> Working!
> │   └── tornadoes
> │   └── tornadoes.go
> Working!
> ├── debugging_wordcount
> │   └── debugging_wordcount.go
> Runs on the direct runner, but on Dataflow this fails with 
> go-job-1-1536840754314770217
> Workflow failed. Causes: 
> 

[jira] [Work logged] (BEAM-3711) Implement portable Combiner lifting in Dataflow Runner

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3711?focusedWorklogId=144053&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144053
 ]

ASF GitHub Bot logged work on BEAM-3711:


Author: ASF GitHub Bot
Created on: 13/Sep/18 19:05
Start Date: 13/Sep/18 19:05
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #6384: [BEAM-3711] Bug fix 
and improvements in Dataflow transform override.
URL: https://github.com/apache/beam/pull/6384#issuecomment-421117235
 
 
   R: @HuangLED


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144053)
Time Spent: 1h  (was: 50m)

> Implement portable Combiner lifting in Dataflow Runner
> --
>
> Key: BEAM-3711
> URL: https://issues.apache.org/jira/browse/BEAM-3711
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Labels: portability
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This is the task to make sure that once other parts of portable Combiner 
> lifting are complete, that the actual Combiner lifting can be done in the 
> Dataflow Runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5380) [beam_PostCommit_Go_GradleBuild ][Flake] Flakes due to Gradle parallelization

2018-09-13 Thread Mikhail Gryzykhin (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613925#comment-16613925
 ] 

Mikhail Gryzykhin commented on BEAM-5380:
-

https://github.com/apache/beam/pull/6390

> [beam_PostCommit_Go_GradleBuild ][Flake] Flakes due to Gradle parallelization
> -
>
> Key: BEAM-5380
> URL: https://issues.apache.org/jira/browse/BEAM-5380
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>
> Synced with herohde, 
> Seems that the job fails to build the tests properly. The suspect is 
> parallelization.
> Disabling parallelization for now. Will monitor for a couple of days for a repro.
> Failing job url:
> https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/919/consoleFull



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5380) [beam_PostCommit_Go_GradleBuild ][Flake] Flakes due to Gradle parallelization

2018-09-13 Thread Mikhail Gryzykhin (JIRA)
Mikhail Gryzykhin created BEAM-5380:
---

 Summary: [beam_PostCommit_Go_GradleBuild ][Flake] Flakes due to 
Gradle parallelization
 Key: BEAM-5380
 URL: https://issues.apache.org/jira/browse/BEAM-5380
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Mikhail Gryzykhin
Assignee: Mikhail Gryzykhin


Synced with herohde.

It seems that the job fails to build tests properly. The suspect is parallelization.

Disabling parallelization for now. Will monitor for a couple of days for a repro.

Failing job url:

https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/919/consoleFull



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3446) RedisIO non-prefix read operations

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3446?focusedWorklogId=144048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144048
 ]

ASF GitHub Bot logged work on BEAM-3446:


Author: ASF GitHub Bot
Created on: 13/Sep/18 18:36
Start Date: 13/Sep/18 18:36
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on issue #5841: [BEAM-3446] Fixes 
RedisIO non-prefix read operations
URL: https://github.com/apache/beam/pull/5841#issuecomment-421108852
 
 
   @huygaa11 sorry, I forgot. Resuming my review.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144048)
Time Spent: 5h 10m  (was: 5h)

> RedisIO non-prefix read operations
> --
>
> Key: BEAM-3446
> URL: https://issues.apache.org/jira/browse/BEAM-3446
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-redis
>Reporter: Vinay varma
>Assignee: Vinay varma
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The read operation in RedisIO performs prefix-based lookups. While this can 
> be used for exact key matches as well, the number of operations limits the 
> throughput of the function.
> I suggest exposing the current readAll operation as readByPrefix and using 
> simpler operations for the readAll functionality.
> ex:
> {code:java}
> String output = jedis.get(element);
> if (output != null) {
>     processContext.output(KV.of(element, output));
> }
> {code}
> instead of:
> https://github.com/apache/beam/blob/7d240c0bb171af6868f1a6e95196c9dcfc9ac640/sdks/java/io/redis/src/main/java/org/apache/beam/sdk/io/redis/RedisIO.java#L292
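The proposal above can be sketched in Python. This is only an illustration of the exact-key lookup the reporter suggests, not RedisIO's actual API: a plain dict stands in for the Redis client (`jedis` in the Java snippet), and any object with a dict-style `.get()` would do.

```python
def read_all(client, elements):
    """One GET per element: emit a (key, value) pair only when the key
    exists, instead of running a prefix scan per element."""
    results = []
    for element in elements:
        output = client.get(element)   # jedis.get(element) in the Java snippet
        if output is not None:
            results.append((element, output))
    return results

# A plain dict stands in for the Redis client in this sketch.
fake_redis = {"user:1": "alice", "user:2": "bob"}
pairs = read_all(fake_redis, ["user:1", "user:3", "user:2"])
# Keys that are absent ("user:3") simply produce no output pair.
```

One GET per key is a constant number of Redis round trips per element, which is what makes it cheaper than a prefix scan for exact matches.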



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Py_VR_Dataflow #1031

2018-09-13 Thread Apache Jenkins Server
See 




[beam] 01/01: Disable build parallelization for Go Post-commit tests due to flakiness

2018-09-13 Thread herohde
This is an automated email from the ASF dual-hosted git repository.

herohde pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 3579deb624c8570077baa4fef771f4a355585c63
Merge: 14cef9c 25f8d1a
Author: Henning Rohde 
AuthorDate: Thu Sep 13 11:25:05 2018 -0700

Disable build parallelization for Go Post-commit tests due to flakiness

 .test-infra/jenkins/job_PostCommit_Go_GradleBuild.groovy | 1 +
 1 file changed, 1 insertion(+)



[beam] branch master updated (14cef9c -> 3579deb)

2018-09-13 Thread herohde
This is an automated email from the ASF dual-hosted git repository.

herohde pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 14cef9c  Merge pull request #6387 from lostluck/patch-12
 add 25f8d1a  Disable build parallelization for Go due to flakiness
 new 3579deb  Disable build parallelization for Go Post-commit tests due to 
flakiness

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .test-infra/jenkins/job_PostCommit_Go_GradleBuild.groovy | 1 +
 1 file changed, 1 insertion(+)



Build failed in Jenkins: beam_PerformanceTests_Python #1433

2018-09-13 Thread Apache Jenkins Server
See 


Changes:

[github] Add getting started instructions for go-sdk

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam15 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 14cef9c55a9daaac74d560ee1b2ca6d384cfcc18 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 14cef9c55a9daaac74d560ee1b2ca6d384cfcc18
Commit message: "Merge pull request #6387 from lostluck/patch-12"
 > git rev-list --no-walk 41a1362277691746b9828ba5637e08f2c6ec6d31 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6411974113748124148.sh
+ rm -rf 

[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins446154703800614655.sh
+ rm -rf 

[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4501586126417594996.sh
+ virtualenv 

New python executable in 

Also creating executable in 

Installing setuptools, pkg_resources, pip, wheel...done.
Running virtualenv with interpreter /usr/bin/python2
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4612298685173843980.sh
+ 

 install --upgrade setuptools pip
Requirement already up-to-date: setuptools in 
./env/.perfkit_env/lib/python2.7/site-packages (40.2.0)
Requirement already up-to-date: pip in 
./env/.perfkit_env/lib/python2.7/site-packages (18.0)
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4267521145537541902.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git 

Cloning into 
'
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8828159073211822886.sh
+ 

 install -r 

Collecting absl-py (from -r 

 (line 14))
Collecting jinja2>=2.7 (from -r 

 (line 15))
  Using cached 
https://files.pythonhosted.org/packages/7f/ff/ae64bacdfc95f27a016a7bed8e8686763ba4d277a78ca76f32659220a731/Jinja2-2.10-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./env/.perfkit_env/lib/python2.7/site-packages (from -r 

 (line 16)) (40.2.0)
Collecting colorlog[windows]==2.6.0 (from -r 

 (line 17))
  Using cached 
https://files.pythonhosted.org/packages/59/1a/46a1bf2044ad8b30b52fed0f389338c85747e093fe7f51a567f4cb525892/colorlog-2.6.0-py2.py3-none-any.whl
Collecting blinker>=1.3 (from -r 

 (line 18))
Collecting futures>=3.0.3 (from -r 

 (line 19))
  Using cached 
https://files.pythonhosted.org/packages/2d/99/b2c4e9d5a30f6471e410a146232b4118e697fa3ffc06d6a65efde84debd0/futures-3.2.0-py2-none-any.whl
Collecting PyYAML==3.12 (from -r 

[jira] [Work logged] (BEAM-4684) Support @RequiresStableInput on Dataflow runner in Java SDK

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4684?focusedWorklogId=144043&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144043
 ]

ASF GitHub Bot logged work on BEAM-4684:


Author: ASF GitHub Bot
Created on: 13/Sep/18 18:12
Start Date: 13/Sep/18 18:12
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on a change in pull request #6220: 
[BEAM-4684] Add integration test for support of @RequiresStableInput
URL: https://github.com/apache/beam/pull/6220#discussion_r217483672
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/util/FilePatternMatchingShardedFile.java
 ##
 @@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.util;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Strings;
+import com.google.common.collect.Lists;
+import com.google.common.io.CharStreams;
+import java.io.IOException;
+import java.io.Reader;
+import java.nio.channels.Channels;
+import java.nio.charset.StandardCharsets;
+import java.util.Collection;
+import java.util.List;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MatchResult.Metadata;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.joda.time.Duration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A sharded file which matches a given file pattern.
+ *
+ * Note that file matching should only occur once the file system is in a 
stable state and
+ * guaranteed to provide a consistent result during file pattern matching.
+ */
+public class FilePatternMatchingShardedFile implements ShardedFile {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(FilePatternMatchingShardedFile.class);
+
+  private static final int MAX_READ_RETRIES = 4;
+  private static final Duration DEFAULT_SLEEP_DURATION = 
Duration.standardSeconds(10L);
+  static final FluentBackoff BACK_OFF_FACTORY =
+  FluentBackoff.DEFAULT
+  .withInitialBackoff(DEFAULT_SLEEP_DURATION)
+  .withMaxRetries(MAX_READ_RETRIES);
+
+  private final String filePattern;
+
+  /**
+   * Constructs an {@link FilePatternMatchingShardedFile} for the given file 
pattern.
+   *
+   * Note that file matching should only occur once the file system is in a 
stable state and
+   * guaranteed to provide a consistent result during file pattern matching.
+   */
+  public FilePatternMatchingShardedFile(String filePattern) {
+checkArgument(
+!Strings.isNullOrEmpty(filePattern),
+"Expected valid file path, but received %s",
+filePattern);
+this.filePattern = filePattern;
+  }
+
+  @Override
+  public List<String> readFilesWithRetries(Sleeper sleeper, BackOff backOff)
+  throws IOException, InterruptedException {
+IOException lastException = null;
+
+do {
+  try {
+Collection<Metadata> files = FileSystems.match(filePattern).metadata();
+LOG.debug(
+"Found file(s) {} by matching the path: {}",
+files
+.stream()
+.map(Metadata::resourceId)
+.map(ResourceId::getFilename)
+.collect(Collectors.joining(",")),
+filePattern);
+if (files.isEmpty()) {
+  continue;
+}
+// Read data from file paths
+return readLines(files);
+  } catch (IOException e) {
+// Ignore and retry
+lastException = e;
+LOG.warn("Error in file reading. Ignore and retry.");
+  }
+} while (BackOffUtils.next(sleeper, backOff));
+// Failed after max retries
+throw new IOException(
+String.format("Unable to read file(s) after retrying %d times", 
MAX_READ_RETRIES),
+lastException);
+  }
+
+  /** Discovers all shards of this file using the provided {@link Sleeper} and 
{@link BackOff}. */
 
 Review comment:
   I see. Thanks for catching this. Will fix it along with the other errors in 
a follow-up PR.
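The readFilesWithRetries loop quoted above is a standard retry-with-backoff pattern: attempt the read, treat an IOException (or an empty match) as retryable, wait according to the backoff policy, and re-raise the last exception once retries are exhausted. A minimal Python sketch of the same control flow — the names and defaults here are illustrative, not Beam's FluentBackoff API:

```python
import time

def read_with_retries(read_fn, max_retries=4, initial_backoff=10.0, sleep=time.sleep):
    """Call read_fn until it yields a non-empty result, retrying on
    IOError with exponentially growing waits; re-raise the last error
    once retries run out. `sleep` is injectable so tests need not wait."""
    last_exc = None
    backoff = initial_backoff
    for attempt in range(max_retries + 1):
        try:
            result = read_fn()
            if result:          # empty match: retry, like the files.isEmpty() case
                return result
        except IOError as exc:  # ignore and retry, keeping the last exception
            last_exc = exc
        if attempt < max_retries:
            sleep(backoff)
            backoff *= 2        # growth factor is illustrative
    raise IOError("Unable to read file(s) after retrying %d times"
                  % max_retries) from last_exc
```

Passing a no-op `sleep` in tests keeps the retry loop fast and deterministic, which is the same reason the Java version takes a `Sleeper`.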


[jira] [Work logged] (BEAM-5365) Migrate integration tests for bigquery_tornadoes

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5365?focusedWorklogId=144040&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144040
 ]

ASF GitHub Bot logged work on BEAM-5365:


Author: ASF GitHub Bot
Created on: 13/Sep/18 18:05
Start Date: 13/Sep/18 18:05
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6380: Do Not Merge, 
[BEAM-5365] adding more tests in bigquery_tornadoes_it
URL: https://github.com/apache/beam/pull/6380#issuecomment-421098998
 
 
   Run Python PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144040)
Time Spent: 1h 10m  (was: 1h)

> Migrate integration tests for bigquery_tornadoes
> 
>
> Key: BEAM-5365
> URL: https://issues.apache.org/jira/browse/BEAM-5365
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4684) Support @RequiresStableInput on Dataflow runner in Java SDK

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4684?focusedWorklogId=144039&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144039
 ]

ASF GitHub Bot logged work on BEAM-4684:


Author: ASF GitHub Bot
Created on: 13/Sep/18 18:02
Start Date: 13/Sep/18 18:02
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #6220: 
[BEAM-4684] Add integration test for support of @RequiresStableInput
URL: https://github.com/apache/beam/pull/6220#discussion_r217477536
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/util/FilePatternMatchingShardedFile.java
 ##
 @@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.util;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Strings;
+import com.google.common.collect.Lists;
+import com.google.common.io.CharStreams;
+import java.io.IOException;
+import java.io.Reader;
+import java.nio.channels.Channels;
+import java.nio.charset.StandardCharsets;
+import java.util.Collection;
+import java.util.List;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.io.FileSystems;
+import org.apache.beam.sdk.io.fs.MatchResult.Metadata;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.joda.time.Duration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A sharded file which matches a given file pattern.
+ *
+ * Note that file matching should only occur once the file system is in a 
stable state and
+ * guaranteed to provide a consistent result during file pattern matching.
+ */
+public class FilePatternMatchingShardedFile implements ShardedFile {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(FilePatternMatchingShardedFile.class);
+
+  private static final int MAX_READ_RETRIES = 4;
+  private static final Duration DEFAULT_SLEEP_DURATION = 
Duration.standardSeconds(10L);
+  static final FluentBackoff BACK_OFF_FACTORY =
+  FluentBackoff.DEFAULT
+  .withInitialBackoff(DEFAULT_SLEEP_DURATION)
+  .withMaxRetries(MAX_READ_RETRIES);
+
+  private final String filePattern;
+
+  /**
+   * Constructs an {@link FilePatternMatchingShardedFile} for the given file 
pattern.
+   *
+   * Note that file matching should only occur once the file system is in a 
stable state and
+   * guaranteed to provide a consistent result during file pattern matching.
+   */
+  public FilePatternMatchingShardedFile(String filePattern) {
+checkArgument(
+!Strings.isNullOrEmpty(filePattern),
+"Expected valid file path, but received %s",
+filePattern);
+this.filePattern = filePattern;
+  }
+
+  @Override
+  public List<String> readFilesWithRetries(Sleeper sleeper, BackOff backOff)
+  throws IOException, InterruptedException {
+IOException lastException = null;
+
+do {
+  try {
+Collection<Metadata> files = FileSystems.match(filePattern).metadata();
+LOG.debug(
+"Found file(s) {} by matching the path: {}",
+files
+.stream()
+.map(Metadata::resourceId)
+.map(ResourceId::getFilename)
+.collect(Collectors.joining(",")),
+filePattern);
+if (files.isEmpty()) {
+  continue;
+}
+// Read data from file paths
+return readLines(files);
+  } catch (IOException e) {
+// Ignore and retry
+lastException = e;
+LOG.warn("Error in file reading. Ignore and retry.");
+  }
+} while (BackOffUtils.next(sleeper, backOff));
+// Failed after max retries
+throw new IOException(
+String.format("Unable to read file(s) after retrying %d times", 
MAX_READ_RETRIES),
+lastException);
+  }
+
+  /** Discovers all shards of this file using the provided {@link Sleeper} and 
{@link BackOff}. */
 
 Review comment:
   This comment doesn't make sense; please fix it up in a follow-up PR.
   
   It looks like it's a common error in the other implementations.


Build failed in Jenkins: beam_PreCommit_Website_Cron #55

2018-09-13 Thread Apache Jenkins Server
See 


Changes:

[github] Add getting started instructions for go-sdk

--
[...truncated 8.75 KB...]
> Task :buildSrc:spotlessGroovy
file or directory 
'
 not found
file or directory 
'
 not found
file or directory 
'
 not found
Caching disabled for task ':buildSrc:spotlessGroovy': Caching has not been 
enabled for the task
Task ':buildSrc:spotlessGroovy' is not up-to-date because:
  No history is available.
All input files are considered out-of-date for incremental task 
':buildSrc:spotlessGroovy'.
file or directory 
'
 not found
:spotlessGroovy (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
completed. Took 1.451 secs.
:spotlessGroovyCheck (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
started.

> Task :buildSrc:spotlessGroovyCheck
Skipping task ':buildSrc:spotlessGroovyCheck' as it has no actions.
:spotlessGroovyCheck (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
completed. Took 0.0 secs.
:spotlessGroovyGradle (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
started.

> Task :buildSrc:spotlessGroovyGradle
Caching disabled for task ':buildSrc:spotlessGroovyGradle': Caching has not 
been enabled for the task
Task ':buildSrc:spotlessGroovyGradle' is not up-to-date because:
  No history is available.
All input files are considered out-of-date for incremental task 
':buildSrc:spotlessGroovyGradle'.
:spotlessGroovyGradle (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
completed. Took 0.03 secs.
:spotlessGroovyGradleCheck (Thread[Task worker for ':buildSrc' Thread 
8,5,main]) started.

> Task :buildSrc:spotlessGroovyGradleCheck
Skipping task ':buildSrc:spotlessGroovyGradleCheck' as it has no actions.
:spotlessGroovyGradleCheck (Thread[Task worker for ':buildSrc' Thread 
8,5,main]) completed. Took 0.0 secs.
:spotlessCheck (Thread[Task worker for ':buildSrc' Thread 8,5,main]) started.

> Task :buildSrc:spotlessCheck
Skipping task ':buildSrc:spotlessCheck' as it has no actions.
:spotlessCheck (Thread[Task worker for ':buildSrc' Thread 8,5,main]) completed. 
Took 0.0 secs.
:compileTestJava (Thread[Task worker for ':buildSrc' Thread 8,5,main]) started.

> Task :buildSrc:compileTestJava NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileTestJava' as it has no source files and no 
previous output files.
:compileTestJava (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
completed. Took 0.002 secs.
:compileTestGroovy (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
started.

> Task :buildSrc:compileTestGroovy NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:compileTestGroovy' as it has no source files and no 
previous output files.
:compileTestGroovy (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
completed. Took 0.004 secs.
:processTestResources (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
started.

> Task :buildSrc:processTestResources NO-SOURCE
file or directory 
'
 not found
Skipping task ':buildSrc:processTestResources' as it has no source files and no 
previous output files.
:processTestResources (Thread[Task worker for ':buildSrc' Thread 8,5,main]) 
completed. Took 0.002 secs.
:testClasses (Thread[Task worker for ':buildSrc' Thread 8,5,main]) started.

> Task :buildSrc:testClasses UP-TO-DATE
Skipping task ':buildSrc:testClasses' as it has no actions.
:testClasses (Thread[Task worker for ':buildSrc' Thread 8,5,main]) completed. 
Took 0.0 secs.
:test (Thread[Task worker for ':buildSrc' Thread 8,5,main]) started.

> Task :buildSrc:test NO-SOURCE
Skipping task ':buildSrc:test' as it has no source files and no previous output 
files.
:test (Thread[Task worker for ':buildSrc' Thread 8,5,main]) completed. Took 
0.004 secs.
:check (Thread[Task worker for ':buildSrc' Thread 8,5,main]) started.

> Task :buildSrc:check
Skipping task ':buildSrc:check' as it has no actions.
:check (Thread[Task worker for ':buildSrc' Thread 8,5,main]) completed. Took 
0.0 secs.
:build (Thread[Task worker for ':buildSrc' Thread 8,5,main]) started.

> Task :buildSrc:build
Skipping task ':buildSrc:build' as it has no actions.
:build (Thread[Task worker for ':buildSrc' Thread 8,5,main]) completed. Took 
0.0 secs.
Build cache (/home/jenkins/.gradle/caches/build-cache-1) 

[jira] [Work logged] (BEAM-4704) String operations yield incorrect results when executed through SQL shell

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4704?focusedWorklogId=144036&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144036
 ]

ASF GitHub Bot logged work on BEAM-4704:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:57
Start Date: 13/Sep/18 17:57
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #6389: [BEAM-4704] Disable 
enumerable rules, use direct runner
URL: https://github.com/apache/beam/pull/6389#issuecomment-421096733
 
 
   R: @akedin 
   cc: @amaliujia 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144036)
Time Spent: 20m  (was: 10m)

> String operations yield incorrect results when executed through SQL shell
> -
>
> Key: BEAM-4704
> URL: https://issues.apache.org/jira/browse/BEAM-4704
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{TRIM}} is defined to trim _all_ the characters in the first string from the 
> string-to-be-trimmed. Calcite has an incorrect implementation of this. We use 
> our own fixed implementation. But when executed through the SQL shell, the 
> results do not match what we get from the PTransform path. Here are three 
> test cases that pass on {{master}} but are incorrect in the shell:
> {code:sql}
> BeamSQL> select TRIM(LEADING 'eh' FROM 'hehe__hehe');
> ++
> | EXPR$0 |
> ++
> | hehe__hehe |
> ++
> {code}
> {code:sql}
> BeamSQL> select TRIM(TRAILING 'eh' FROM 'hehe__hehe');
> ++
> |   EXPR$0   |
> ++
> | hehe__heh  |
> ++
> {code}
> {code:sql}
> BeamSQL> select TRIM(BOTH 'eh' FROM 'hehe__hehe');
> ++
> |   EXPR$0   |
> ++
> | hehe__heh  |
> ++
> {code}
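Per the definition above — TRIM removes every character of the trim set, not the literal substring — the correct results are easy to check, since Python's `str.lstrip`/`rstrip`/`strip` use the same character-set semantics. A quick sanity check of what the fixed implementation should produce:

```python
source = "hehe__hehe"
trim_chars = "eh"  # the trim set {'e', 'h'}

# TRIM(LEADING 'eh' FROM 'hehe__hehe'): strip e/h from the front.
assert source.lstrip(trim_chars) == "__hehe"
# TRIM(TRAILING 'eh' FROM 'hehe__hehe'): strip e/h from the end.
assert source.rstrip(trim_chars) == "hehe__"
# TRIM(BOTH 'eh' FROM 'hehe__hehe'): strip e/h from both ends.
assert source.strip(trim_chars) == "__"
```

All three shell outputs quoted above differ from these, which is the bug being reported.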



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4704) String operations yield incorrect results when executed through SQL shell

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4704?focusedWorklogId=144035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144035
 ]

ASF GitHub Bot logged work on BEAM-4704:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:55
Start Date: 13/Sep/18 17:55
Worklog Time Spent: 10m 
  Work Description: apilloud opened a new pull request #6389: [BEAM-4704] 
Disable enumerable rules, use direct runner
URL: https://github.com/apache/beam/pull/6389
 
 
   This keeps everything going through a single path. We can add back 
enumerable value rules later if we want.
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144035)
Time Spent: 10m
Remaining Estimate: 0h

> String operations yield incorrect results when executed through SQL shell
> -
>
> Key: BEAM-4704
> URL: https://issues.apache.org/jira/browse/BEAM-4704
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (BEAM-5337) [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] Build times out in beam-runners-flink target

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5337?focusedWorklogId=144031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144031
 ]

ASF GitHub Bot logged work on BEAM-5337:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:49
Start Date: 13/Sep/18 17:49
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6385: 
[BEAM-5337] Fix flaky test UnboundedSourceWrapperTest#testValueEmission
URL: https://github.com/apache/beam/pull/6385#discussion_r217476414
 
 

 ##
 File path: 
runners/flink/src/test/java/org/apache/beam/runners/flink/streaming/UnboundedSourceWrapperTest.java
 ##
 @@ -134,11 +133,13 @@ public void testValueEmission() throws Exception {
   public void run() {
 while (true) {
   try {
-synchronized (testHarness.getCheckpointLock()) {
-  testHarness.setProcessingTime(System.currentTimeMillis());
-}
+testHarness.setProcessingTime(System.currentTimeMillis());
 Thread.sleep(1000);
 
 Review comment:
   Whenever processing time is advanced, the watermark timers are triggered, so 
we need to keep it advancing. The sleep releases the checkpoint lock, which 
would otherwise prevent the source wrapper from making progress.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144031)
Time Spent: 50m  (was: 40m)

> [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] 
> Build times out in beam-runners-flink target
> -
>
> Key: BEAM-5337
> URL: https://issues.apache.org/jira/browse/BEAM-5337
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Maximilian Michels
>Priority: Critical
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Job times out. 
>  Failing job url:
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1414/consoleFull]
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1406/consoleFull]
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1408/consoleFull
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Py_VR_Dataflow #1030

2018-09-13 Thread Apache Jenkins Server
See 


Changes:

[github] Add getting started instructions for go-sdk

--
[...truncated 74.14 KB...]
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.2.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.2.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.2.0.zip
Collecting pyhamcrest (from -r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.2.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.2.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.2.0.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.2.0.zip
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.2.0.zip
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Successfully 

[jira] [Work logged] (BEAM-5337) [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] Build times out in beam-runners-flink target

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5337?focusedWorklogId=144028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144028
 ]

ASF GitHub Bot logged work on BEAM-5337:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:42
Start Date: 13/Sep/18 17:42
Worklog Time Spent: 10m 
  Work Description: mxm commented on a change in pull request #6385: 
[BEAM-5337] Fix flaky test UnboundedSourceWrapperTest#testValueEmission
URL: https://github.com/apache/beam/pull/6385#discussion_r217474178
 
 

 ##
 File path: 
runners/flink/src/test/java/org/apache/beam/runners/flink/streaming/UnboundedSourceWrapperTest.java
 ##
 @@ -134,11 +133,13 @@ public void testValueEmission() throws Exception {
   public void run() {
 while (true) {
   try {
-synchronized (testHarness.getCheckpointLock()) {
-  testHarness.setProcessingTime(System.currentTimeMillis());
-}
+testHarness.setProcessingTime(System.currentTimeMillis());
 Thread.sleep(1000);
+  } catch (InterruptedException e) {
+// this is ok
+break;
   } catch (Exception e) {
+e.printStackTrace();
 
 Review comment:
   It is logged to the screen. Rethrowing in a thread would be possible, but it 
would not make the test case fail.
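The point about rethrowing can be demonstrated without JUnit: an exception thrown in a worker thread never propagates to the thread that started it, so by itself it cannot fail a test. This is a plain-Java sketch with illustrative names.

```java
// Plain-Java sketch (no JUnit): an exception thrown in a worker thread never
// propagates to the thread that started it, so by itself it cannot fail a test.
public class ThreadExceptionDemo {
    public static String run() {
        StringBuilder log = new StringBuilder();
        Thread worker = new Thread(() -> {
            throw new RuntimeException("boom");  // only terminates the worker
        });
        // The uncaught-exception handler is the only place the error surfaces.
        worker.setUncaughtExceptionHandler(
            (t, e) -> log.append("worker died: ").append(e.getMessage()));
        worker.start();
        try {
            worker.join();  // returns normally; nothing is rethrown here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());  // prints "worker died: boom"
    }
}
```

This is why the test in the PR instead checks emitted values on the main thread rather than relying on the helper thread to signal failure.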


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144028)
Time Spent: 40m  (was: 0.5h)

> [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] 
> Build times out in beam-runners-flink target
> -
>
> Key: BEAM-5337
> URL: https://issues.apache.org/jira/browse/BEAM-5337
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Maximilian Michels
>Priority: Critical
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Job times out. 
>  Failing job url:
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1414/consoleFull]
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1406/consoleFull]
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1408/consoleFull
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5376) Row interface doesn't support nullability on all fields.

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5376?focusedWorklogId=144027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144027
 ]

ASF GitHub Bot logged work on BEAM-5376:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:38
Start Date: 13/Sep/18 17:38
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6383: 
[BEAM-5376] Support nullability on all Row types
URL: https://github.com/apache/beam/pull/6383#discussion_r217472409
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/JavaBeanSchemaTest.java
 ##
 @@ -155,10 +155,10 @@ public void testRecursiveGetters() throws 
NoSuchSchemaException {
 
 Row nestedRow = row.getRow("nested");
 assertEquals("string", nestedRow.getString("str"));
-assertEquals((byte) 1, nestedRow.getByte("aByte"));
-assertEquals((short) 2, nestedRow.getInt16("aShort"));
-assertEquals((int) 3, nestedRow.getInt32("anInt"));
-assertEquals((long) 4, nestedRow.getInt64("aLong"));
+assertEquals((byte) 1, (Object) nestedRow.getByte("aByte"));
 
 Review comment:
   My intention was to make the code more understandable for readers. If we 
cast explicitly and leave less guesswork for readers, they know that values of 
the same type are compared and do not need to question the correctness of an 
implicit cast somewhere. At least, readers do not need to look up how 
`assertEquals` is implemented for different types. From this perspective, 
`assertEquals(Byte.valueOf((byte) 1), nestedRow.getByte("aByte"))` is ugly, but 
necessary.
   
   After reading the code, it seems the `assertEquals` family in JUnit is 
simpler than expected. `assertEquals(Byte.valueOf((byte) 1), 
nestedRow.getByte("aByte"))` will be converted to `assertEquals(Object, 
Object)`, and `object.equals(object)` will be called. In the `Byte` 
implementation, as Anton pointed out, `Byte.equals()` accepts an `Object` and 
does a cast anyway. So from an implementation perspective, there is no 
difference among the approaches in this discussion.
   
   Since there isn't a perfect way to make the code clear, I am ok with either 
the original way in this PR or another discussed way to implement it.
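The boxing behavior under discussion can be shown in a small standalone sketch (plain Java, no JUnit); here `fromRow` stands in for `nestedRow.getByte("aByte")`. Both the `(Object)` cast and the explicit `Byte.valueOf` route end up in `Byte.equals(Object)`.

```java
// Plain-Java sketch of the boxing discussed above; fromRow stands in for
// nestedRow.getByte("aByte"). No JUnit involved.
public class ByteEqualsDemo {
    // Mirrors assertEquals((byte) 1, (Object) fromRow): the primitive expected
    // value is autoboxed to Byte, and Byte.equals(Object) does the comparison.
    public static boolean boxedComparison(Byte fromRow) {
        Object expected = (byte) 1;  // autoboxing calls Byte.valueOf
        return expected.equals(fromRow);
    }

    // Mirrors assertEquals(Byte.valueOf((byte) 1), fromRow): the same call
    // chain, just with the boxing written out explicitly.
    public static boolean explicitValueOf(Byte fromRow) {
        return Byte.valueOf((byte) 1).equals(fromRow);
    }

    public static void main(String[] args) {
        Byte fromRow = Byte.valueOf((byte) 1);
        System.out.println(boxedComparison(fromRow));   // true
        System.out.println(explicitValueOf(fromRow));   // true
    }
}
```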


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144027)
Time Spent: 2h 20m  (was: 2h 10m)

> Row interface doesn't support nullability on all fields.
> 
>
> Key: BEAM-5376
> URL: https://issues.apache.org/jira/browse/BEAM-5376
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> For example: 
> {code:java}
> public boolean getBoolean(int idx);{code}
> Should be:
> {code:java}
> public Boolean getBoolean(int idx);{code}
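A minimal illustration of the difference: a primitive `boolean` return can never convey null, while the boxed `Boolean` can. This is a hypothetical Row-like container, not Beam's actual Row class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Row-like container; NOT Beam's actual Row API.
public class NullableFieldDemo {
    private final Map<Integer, Boolean> fields = new HashMap<>();

    public void set(int idx, Boolean value) {
        fields.put(idx, value);
    }

    // Boxed return type: a null field value comes back as null.
    public Boolean getBoolean(int idx) {
        return fields.get(idx);
    }

    // Primitive return type: a null value can only surface as an
    // unboxing NullPointerException.
    public boolean getBooleanPrimitive(int idx) {
        return fields.get(idx);
    }
}
```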



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5337) [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] Build times out in beam-runners-flink target

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5337?focusedWorklogId=144024&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144024
 ]

ASF GitHub Bot logged work on BEAM-5337:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:37
Start Date: 13/Sep/18 17:37
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6385: 
[BEAM-5337] Fix flaky test UnboundedSourceWrapperTest#testValueEmission
URL: https://github.com/apache/beam/pull/6385#discussion_r217472537
 
 

 ##
 File path: 
runners/flink/src/test/java/org/apache/beam/runners/flink/streaming/UnboundedSourceWrapperTest.java
 ##
 @@ -134,11 +133,13 @@ public void testValueEmission() throws Exception {
   public void run() {
 while (true) {
   try {
-synchronized (testHarness.getCheckpointLock()) {
-  testHarness.setProcessingTime(System.currentTimeMillis());
-}
+testHarness.setProcessingTime(System.currentTimeMillis());
 Thread.sleep(1000);
 
 Review comment:
   Did you find why this sleep is here?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144024)
Time Spent: 0.5h  (was: 20m)

> [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] 
> Build times out in beam-runners-flink target
> -
>
> Key: BEAM-5337
> URL: https://issues.apache.org/jira/browse/BEAM-5337
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Maximilian Michels
>Priority: Critical
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Job times out. 
>  Failing job url:
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1414/consoleFull]
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1406/consoleFull]
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1408/consoleFull
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5376) Row interface doesn't support nullability on all fields.

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5376?focusedWorklogId=144025&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144025
 ]

ASF GitHub Bot logged work on BEAM-5376:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:37
Start Date: 13/Sep/18 17:37
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #6383: [BEAM-5376] Support 
nullability on all Row types
URL: https://github.com/apache/beam/pull/6383#issuecomment-421090162
 
 
   LGTM


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144025)
Time Spent: 2h 10m  (was: 2h)

> Row interface doesn't support nullability on all fields.
> 
>
> Key: BEAM-5376
> URL: https://issues.apache.org/jira/browse/BEAM-5376
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> For example: 
> {code:java}
> public boolean getBoolean(int idx);{code}
> Should be:
> {code:java}
> public Boolean getBoolean(int idx);{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5376) Row interface doesn't support nullability on all fields.

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5376?focusedWorklogId=144023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144023
 ]

ASF GitHub Bot logged work on BEAM-5376:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:36
Start Date: 13/Sep/18 17:36
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6383: 
[BEAM-5376] Support nullability on all Row types
URL: https://github.com/apache/beam/pull/6383#discussion_r217472409
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/JavaBeanSchemaTest.java
 ##
 @@ -155,10 +155,10 @@ public void testRecursiveGetters() throws 
NoSuchSchemaException {
 
 Row nestedRow = row.getRow("nested");
 assertEquals("string", nestedRow.getString("str"));
-assertEquals((byte) 1, nestedRow.getByte("aByte"));
-assertEquals((short) 2, nestedRow.getInt16("aShort"));
-assertEquals((int) 3, nestedRow.getInt32("anInt"));
-assertEquals((long) 4, nestedRow.getInt64("aLong"));
+assertEquals((byte) 1, (Object) nestedRow.getByte("aByte"));
 
 Review comment:
   My intention was to make the code more understandable for readers. If we 
cast explicitly and leave less guesswork for readers, they know that values of 
the same type are compared and do not need to question the correctness of an 
implicit cast somewhere. At least, readers do not need to look up how 
`assertEquals` is implemented for different types. From this perspective, 
`assertEquals(Byte.valueOf((byte) 1), nestedRow.getByte("aByte"))` is ugly, but 
necessary.
   
   After reading the code, it seems the `assertEquals` family in JUnit is 
simpler than expected. `assertEquals(Byte.valueOf((byte) 1), 
nestedRow.getByte("aByte"))` will be converted to `assertEquals(Object, 
Object)`, and `object.equals(object)` will be called. In the `Byte` 
implementation, as Anton pointed out, `Byte.equals()` accepts an `Object` and 
does a cast anyway. So from an implementation perspective, there is no 
difference among the approaches in this discussion.
   
   Since there isn't a perfect way to make the code clear, I am ok with either 
the original way in this PR or another discussed way to implement it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144023)
Time Spent: 2h  (was: 1h 50m)

> Row interface doesn't support nullability on all fields.
> 
>
> Key: BEAM-5376
> URL: https://issues.apache.org/jira/browse/BEAM-5376
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> For example: 
> {code:java}
> public boolean getBoolean(int idx);{code}
> Should be:
> {code:java}
> public Boolean getBoolean(int idx);{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5376) Row interface doesn't support nullability on all fields.

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5376?focusedWorklogId=144022&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144022
 ]

ASF GitHub Bot logged work on BEAM-5376:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:36
Start Date: 13/Sep/18 17:36
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6383: 
[BEAM-5376] Support nullability on all Row types
URL: https://github.com/apache/beam/pull/6383#discussion_r217472409
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/JavaBeanSchemaTest.java
 ##
 @@ -155,10 +155,10 @@ public void testRecursiveGetters() throws 
NoSuchSchemaException {
 
 Row nestedRow = row.getRow("nested");
 assertEquals("string", nestedRow.getString("str"));
-assertEquals((byte) 1, nestedRow.getByte("aByte"));
-assertEquals((short) 2, nestedRow.getInt16("aShort"));
-assertEquals((int) 3, nestedRow.getInt32("anInt"));
-assertEquals((long) 4, nestedRow.getInt64("aLong"));
+assertEquals((byte) 1, (Object) nestedRow.getByte("aByte"));
 
 Review comment:
   My intention was to make the code more understandable for readers. If we 
cast explicitly and leave less guesswork for readers, they know that values of 
the same type are compared and do not need to question the correctness of an 
implicit cast somewhere. At least, readers do not need to look up how 
`assertEquals` is implemented for different types. From this perspective, 
`assertEquals(Byte.valueOf((byte) 1), nestedRow.getByte("aByte"))` is ugly, but 
necessary.
   
   After reading the code, it seems the `assertEquals` family in JUnit is 
simpler than expected. `assertEquals(Byte.valueOf((byte) 1), 
nestedRow.getByte("aByte"))` will be converted to `assertEquals(Object, 
Object)`, and `object.equals(object)` will be called. In the `Byte` 
implementation, as Anton pointed out, `Byte.equals()` accepts an `Object` and 
does a cast anyway. So from an implementation perspective, there is no 
difference among the approaches in this discussion.
   
   Since there isn't a perfect way to make the code clear, I am ok with either 
the original way in this PR or another discussed way to implement it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144022)
Time Spent: 1h 50m  (was: 1h 40m)

> Row interface doesn't support nullability on all fields.
> 
>
> Key: BEAM-5376
> URL: https://issues.apache.org/jira/browse/BEAM-5376
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> For example: 
> {code:java}
> public boolean getBoolean(int idx);{code}
> Should be:
> {code:java}
> public Boolean getBoolean(int idx);{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5337) [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] Build times out in beam-runners-flink target

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5337?focusedWorklogId=144021&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144021
 ]

ASF GitHub Bot logged work on BEAM-5337:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:36
Start Date: 13/Sep/18 17:36
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6385: 
[BEAM-5337] Fix flaky test UnboundedSourceWrapperTest#testValueEmission
URL: https://github.com/apache/beam/pull/6385#discussion_r217472350
 
 

 ##
 File path: 
runners/flink/src/test/java/org/apache/beam/runners/flink/streaming/UnboundedSourceWrapperTest.java
 ##
 @@ -134,11 +133,13 @@ public void testValueEmission() throws Exception {
   public void run() {
 while (true) {
   try {
-synchronized (testHarness.getCheckpointLock()) {
-  testHarness.setProcessingTime(System.currentTimeMillis());
-}
+testHarness.setProcessingTime(System.currentTimeMillis());
 Thread.sleep(1000);
+  } catch (InterruptedException e) {
+// this is ok
+break;
   } catch (Exception e) {
+e.printStackTrace();
 
 Review comment:
   Why not re-throw the exception or log it?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144021)
Time Spent: 20m  (was: 10m)

> [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] 
> Build times out in beam-runners-flink target
> -
>
> Key: BEAM-5337
> URL: https://issues.apache.org/jira/browse/BEAM-5337
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Maximilian Michels
>Priority: Critical
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Job times out. 
>  Failing job url:
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1414/consoleFull]
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1406/consoleFull]
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1408/consoleFull
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3194) Support annotating that a DoFn requires stable / deterministic input for replay/retry

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3194?focusedWorklogId=144020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144020
 ]

ASF GitHub Bot logged work on BEAM-3194:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:34
Start Date: 13/Sep/18 17:34
Worklog Time Spent: 10m 
  Work Description: robinyqiu opened a new pull request #6388: [BEAM-3194] 
Fail if @RequiresStableInput is used on runners that don't support it
URL: https://github.com/apache/beam/pull/6388
 
 
   r: @lukecwik 
   cc: @kennknowles @aaltay
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144020)
Time Spent: 2.5h  (was: 2h 20m)

> Support annotating that a DoFn requires stable / deterministic input for 
> replay/retry
> -
>
> Key: BEAM-3194
> URL: https://issues.apache.org/jira/browse/BEAM-3194
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Yueyang Qiu
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> See the thread: 
> https://lists.apache.org/thread.html/5fd81ce371aeaf642665348f8e6940e308e04275dd7072f380f9f945@%3Cdev.beam.apache.org%3E
> We need this in order to have truly cross-runner end-to-end exactly once via 
> replay + idempotence.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3878) Improve error reporting in calls.go

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3878?focusedWorklogId=144018&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144018
 ]

ASF GitHub Bot logged work on BEAM-3878:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:29
Start Date: 13/Sep/18 17:29
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #6156: [BEAM-3878] Improve 
error reporting in calls.go
URL: https://github.com/apache/beam/pull/6156#issuecomment-421087923
 
 
   @huygaa11 I'll wait for Holden to respond to my suggestion.
   
   cc: @lostluck 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144018)
Time Spent: 50m  (was: 40m)

> Improve error reporting in calls.go
> ---
>
> Key: BEAM-3878
> URL: https://issues.apache.org/jira/browse/BEAM-3878
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The error messages generated in calls.go are not as helpful as they could be.
> Instead of simply reporting "incompatible func type" it would be great if 
> they reported the topology of the actual function supplied versus what is 
> expected. That would make debugging a lot easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Go_GradleBuild #920

2018-09-13 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3446) RedisIO non-prefix read operations

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3446?focusedWorklogId=144017&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144017
 ]

ASF GitHub Bot logged work on BEAM-3446:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:26
Start Date: 13/Sep/18 17:26
Worklog Time Spent: 10m 
  Work Description: huygaa11 commented on issue #5841: [BEAM-3446] Fixes 
RedisIO non-prefix read operations
URL: https://github.com/apache/beam/pull/5841#issuecomment-421086985
 
 
   Friendly ping for review!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144017)
Time Spent: 5h  (was: 4h 50m)

> RedisIO non-prefix read operations
> --
>
> Key: BEAM-3446
> URL: https://issues.apache.org/jira/browse/BEAM-3446
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-redis
>Reporter: Vinay varma
>Assignee: Vinay varma
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> The read operation in RedisIO is for prefix-based lookups. While this can be 
> used for exact key matches as well, the number of operations limits the 
> throughput of the function.
> I suggest exposing the current readAll operation as readbyprefix and using 
> simpler operations for the readAll functionality.
> ex:
> {code:java}
> String output = jedis.get(element);
> if (output != null) {
>   processContext.output(KV.of(element, output));
> }
> {code}
> instead of:
> https://github.com/apache/beam/blob/7d240c0bb171af6868f1a6e95196c9dcfc9ac640/sdks/java/io/redis/src/main/java/org/apache/beam/sdk/io/redis/RedisIO.java#L292



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3711) Implement portable Combiner lifting in Dataflow Runner

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3711?focusedWorklogId=144016&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144016
 ]

ASF GitHub Bot logged work on BEAM-3711:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:25
Start Date: 13/Sep/18 17:25
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #6384: [BEAM-3711] Bug fix 
and improvements in Dataflow transform override.
URL: https://github.com/apache/beam/pull/6384#issuecomment-421086750
 
 
   R: @Ardagan


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144016)
Time Spent: 50m  (was: 40m)

> Implement portable Combiner lifting in Dataflow Runner
> --
>
> Key: BEAM-3711
> URL: https://issues.apache.org/jira/browse/BEAM-3711
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Labels: portability
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This is the task to make sure that once other parts of portable Combiner 
> lifting are complete, that the actual Combiner lifting can be done in the 
> Dataflow Runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3878) Improve error reporting in calls.go

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3878?focusedWorklogId=144015&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144015
 ]

ASF GitHub Bot logged work on BEAM-3878:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:25
Start Date: 13/Sep/18 17:25
Worklog Time Spent: 10m 
  Work Description: huygaa11 commented on issue #6156: [BEAM-3878] Improve 
error reporting in calls.go
URL: https://github.com/apache/beam/pull/6156#issuecomment-421086652
 
 
   @herohde friendly ping!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144015)
Time Spent: 40m  (was: 0.5h)

> Improve error reporting in calls.go
> ---
>
> Key: BEAM-3878
> URL: https://issues.apache.org/jira/browse/BEAM-3878
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The error messages generated in calls.go are not as helpful as they could be.
> Instead of simply reporting "incompatible func type" it would be great if 
> they reported the topology of the actual function supplied versus what is 
> expected. That would make debugging a lot easier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5379) Go Modules versioning support

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5379?focusedWorklogId=144014&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144014
 ]

ASF GitHub Bot logged work on BEAM-5379:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:19
Start Date: 13/Sep/18 17:19
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #6387: [BEAM-5379] Add 
getting started instructions for go-sdk
URL: https://github.com/apache/beam/pull/6387
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/go/README.md b/sdks/go/README.md
index f9a08e0b6db..247c287665c 100644
--- a/sdks/go/README.md
+++ b/sdks/go/README.md
@@ -90,4 +90,39 @@ SDK harness container image.
 
 ## Issues
 
-Please use the `sdk-go` component for any bugs or feature requests.
\ No newline at end of file
+Please use the 
[`sdk-go`](https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20sdk-go%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC)
 component for any bugs or feature requests.
+
+## Contributing to the Go SDK
+
+### New to developing Go?
https://tour.golang.org : The Go Tour gives you the basics of the language, 
interactively, with no installation required.
+
+https://github.com/campoy/go-tooling-workshop is a great start on learning 
good (optional) development tools for Go. 
+
+### Developing Go Beam SDK on Github
+
+To make and test changes when working with Go, it's necessary to clone your
+repository in a subdirectory of your GOPATH. This permits existing gradle
+tools to use your in-progress changes.
+
+```
+# Create a Go compatible place for the repo, using src/github.com/apache/
+# matches where Go will look for the files, or go get would put them.
+$ mkdir -p $GOPATH/src/github.com/apache/
+$ cd $GOPATH/src/github.com/apache/
+
+
+# Clone the repo, and update your branch as normal
+$ git clone https://github.com/apache/beam.git
+$ cd beam
+$ git remote add  g...@github.com:/beam.git
+$ git fetch --all
+
+# Get or Update all the Go SDK dependencies
+$ go get -u ./...
+# Test that the system compiles and runs.
+$ go test ./...
+```
+
+If you don’t have a GOPATH set, create a new directory in your home directory, 
and use that.
+
+Follow the [contribution 
guide](https://beam.apache.org/contribute/contribution-guide/#code) to create 
branches, and submit pull requests as normal.


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144014)
Time Spent: 0.5h  (was: 20m)

> Go Modules versioning support
> -
>
> Key: BEAM-5379
> URL: https://issues.apache.org/jira/browse/BEAM-5379
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This would make it easier for non-Go developers to update and test changes to 
> the Go SDK without jumping through hoops to set up Go Paths at first.
> Right now, we use the gogradle plugin for gradle to handle reproducible 
> builds. Without doing something with the GO_PATH relative to a user's local 
> git repo though, changes made in the user's repo are not represented when 
> gradle is invoked to test everything.
> One of at least the following needs to be accomplished:
> * gogradle moves to support the Go Modules experiment in Go 1.11, and the SDK 
> migrates to that
> * or we re-implement our gradle go rules ourselves to use them, 
> * or some third option, that moves away from the GO_PATH nit.
> This issue should be resolved after deciding and implementing a clear 
> versioning story for the SDK, ideally along Go best practices.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (41a1362 -> 14cef9c)

2018-09-13 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 41a1362  Merge pull request #6340: [BEAM-5283] Fixing Flink Post 
commit jenkins task
 add 5aa6714  Add getting started instructions for go-sdk
 add 14cef9c  Merge pull request #6387 from lostluck/patch-12

No new revisions were added by this update.

Summary of changes:
 sdks/go/README.md | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)



[jira] [Work logged] (BEAM-5379) Go Modules versioning support

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5379?focusedWorklogId=144012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144012
 ]

ASF GitHub Bot logged work on BEAM-5379:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:17
Start Date: 13/Sep/18 17:17
Worklog Time Spent: 10m 
  Work Description: lostluck opened a new pull request #6387: [BEAM-5379] 
Add getting started instructions for go-sdk
URL: https://github.com/apache/beam/pull/6387
 
 
   There's currently a getting started nit for developing the Go SDK. While 
this will be solved long term with the new versioning/module support in Go 
1.11, we're not quite there yet. These instructions are a stopgap until the 
GO_PATH hack is unnecessary, and we've got a clear versioning story for the Go 
SDK.
   
   
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144012)
Time Spent: 10m
Remaining Estimate: 0h

> Go Modules versioning support
> -
>
> Key: BEAM-5379
> URL: https://issues.apache.org/jira/browse/BEAM-5379
> Project: Beam
>  Issue Type: Improvement
>  Components: 

[jira] [Work logged] (BEAM-5379) Go Modules versioning support

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5379?focusedWorklogId=144013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144013
 ]

ASF GitHub Bot logged work on BEAM-5379:


Author: ASF GitHub Bot
Created on: 13/Sep/18 17:17
Start Date: 13/Sep/18 17:17
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #6387: [BEAM-5379] Add 
getting started instructions for go-sdk
URL: https://github.com/apache/beam/pull/6387#issuecomment-421084097
 
 
   R: @aaltay Please review!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144013)
Time Spent: 20m  (was: 10m)

> Go Modules versioning support
> -
>
> Key: BEAM-5379
> URL: https://issues.apache.org/jira/browse/BEAM-5379
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This would make it easier for non-Go developers to update and test changes to 
> the Go SDK without jumping through hoops to set up Go Paths at first.
> Right now, we use the gogradle plugin for gradle to handle reproducible 
> builds. Without doing something with the GO_PATH relative to a user's local 
> git repo though, changes made in the user's repo are not represented when 
> gradle is invoked to test everything.
> One of at least the following needs to be accomplished:
> * gogradle moves to support the Go Modules experiment in Go 1.11, and the SDK 
> migrates to that
> * or we re-implement our gradle go rules ourselves to use them, 
> * or some third option, that moves away from the GO_PATH nit.
> This issue should be resolved after deciding and implementing a clear 
> versioning story for the SDK, ideally along Go best practices.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5379) Go Modules versioning support

2018-09-13 Thread Robert Burke (JIRA)
Robert Burke created BEAM-5379:
--

 Summary: Go Modules versioning support
 Key: BEAM-5379
 URL: https://issues.apache.org/jira/browse/BEAM-5379
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Robert Burke
Assignee: Robert Burke


This would make it easier for non-Go developers to update and test changes to 
the Go SDK without jumping through hoops to set up Go Paths at first.

Right now, we use the gogradle plugin for gradle to handle reproducible builds. 
Without doing something with the GO_PATH relative to a user's local git repo 
though, changes made in the user's repo are not represented when gradle is 
invoked to test everything.



One of at least the following needs to be accomplished:
* gogradle moves to support the Go Modules experiment in Go 1.11, and the SDK 
migrates to that
* or we re-implement our gradle go rules ourselves to use them, 
* or some third option, that moves away from the GO_PATH nit.

This issue should be resolved after deciding and implementing a clear 
versioning story for the SDK, ideally along Go best practices.
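For illustration, a module-based layout would replace the GOPATH requirement with a `go.mod` file at the SDK root, along these lines (the module path and Go version shown are assumptions, not a settled choice; real dependencies and versions are omitted):

```
module github.com/apache/beam/sdks/go

go 1.11
```

With such a file in place, `go build ./...` and `go test ./...` work from any checkout location, which is the property the issue is after.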



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5337) [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] Build times out in beam-runners-flink target

2018-09-13 Thread Mikhail Gryzykhin (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613800#comment-16613800
 ] 

Mikhail Gryzykhin commented on BEAM-5337:
-

And more:
[https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1443/console]

[https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1451/console]

[https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1454/console]

 

> [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] 
> Build times out in beam-runners-flink target
> -
>
> Key: BEAM-5337
> URL: https://issues.apache.org/jira/browse/BEAM-5337
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Maximilian Michels
>Priority: Critical
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Job times out. 
>  Failing job url:
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1414/consoleFull]
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1406/consoleFull]
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1408/consoleFull
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5357) Go check for IsWorkerCompatibleBinary is wrong

2018-09-13 Thread Robert Burke (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke reassigned BEAM-5357:
--

Assignee: (was: Robert Burke)

> Go check for IsWorkerCompatibleBinary is wrong
> --
>
> Key: BEAM-5357
> URL: https://issues.apache.org/jira/browse/BEAM-5357
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Henning Rohde
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Per BEAM-5253, The linux/amd64 check in IsWorkerCompatibleBinary is 
> insufficient:
> https://github.com/apache/beam/blob/609a42978405173a60e5d91f35170a5c0b5d5332/sdks/go/pkg/beam/runners/universal/runnerlib/compile.go#L37
> We need to see if we can do a better check here (such as looking up the 
> symbol table or similar) or disable this optimization.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5337) [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] Build times out in beam-runners-flink target

2018-09-13 Thread Mikhail Gryzykhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Gryzykhin updated BEAM-5337:

Priority: Critical  (was: Major)

> [beam_PostCommit_Java_GradleBuild][:beam-runners-flink_2.11:test][Flake] 
> Build times out in beam-runners-flink target
> -
>
> Key: BEAM-5337
> URL: https://issues.apache.org/jira/browse/BEAM-5337
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Maximilian Michels
>Priority: Critical
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Job times out. 
>  Failing job url:
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1414/consoleFull]
> [https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1406/consoleFull]
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1408/consoleFull
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4603) Go SDK should verify that encoded types can be decoded

2018-09-13 Thread Robert Burke (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613782#comment-16613782
 ] 

Robert Burke commented on BEAM-4603:


Is there an example you can point out for this? It's hard to go on without 
something to start with that demonstrates the issue.

> Go SDK should verify that encoded types can be decoded
> --
>
> Key: BEAM-4603
> URL: https://issues.apache.org/jira/browse/BEAM-4603
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Henning Rohde
>Priority: Major
>
> The reflect package has some restrictions and it would be helpful to find 
> those at job submission as opposed to runtime. The worker panics outside user 
> code in such a case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5365) Migrate integration tests for bigquery_tornadoes

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5365?focusedWorklogId=144007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144007
 ]

ASF GitHub Bot logged work on BEAM-5365:


Author: ASF GitHub Bot
Created on: 13/Sep/18 16:55
Start Date: 13/Sep/18 16:55
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #6380: Do Not Merge, 
[BEAM-5365] adding more tests in bigquery_tornadoes_it
URL: https://github.com/apache/beam/pull/6380#issuecomment-421076802
 
 
   Run Python PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144007)
Time Spent: 40m  (was: 0.5h)

> Migrate integration tests for bigquery_tornadoes
> 
>
> Key: BEAM-5365
> URL: https://issues.apache.org/jira/browse/BEAM-5365
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5365) Migrate integration tests for bigquery_tornadoes

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5365?focusedWorklogId=144009&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144009
 ]

ASF GitHub Bot logged work on BEAM-5365:


Author: ASF GitHub Bot
Created on: 13/Sep/18 16:55
Start Date: 13/Sep/18 16:55
Worklog Time Spent: 10m 
  Work Description: yifanzou removed a comment on issue #6380: Do Not 
Merge, [BEAM-5365] adding more tests in bigquery_tornadoes_it
URL: https://github.com/apache/beam/pull/6380#issuecomment-420768206
 
 
   Run Python PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144009)
Time Spent: 1h  (was: 50m)

> Migrate integration tests for bigquery_tornadoes
> 
>
> Key: BEAM-5365
> URL: https://issues.apache.org/jira/browse/BEAM-5365
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5365) Migrate integration tests for bigquery_tornadoes

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5365?focusedWorklogId=144008&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144008
 ]

ASF GitHub Bot logged work on BEAM-5365:


Author: ASF GitHub Bot
Created on: 13/Sep/18 16:55
Start Date: 13/Sep/18 16:55
Worklog Time Spent: 10m 
  Work Description: yifanzou removed a comment on issue #6380: Do Not 
Merge, [BEAM-5365] adding more tests in bigquery_tornadoes_it
URL: https://github.com/apache/beam/pull/6380#issuecomment-420784970
 
 
   Run Python PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144008)
Time Spent: 50m  (was: 40m)

> Migrate integration tests for bigquery_tornadoes
> 
>
> Key: BEAM-5365
> URL: https://issues.apache.org/jira/browse/BEAM-5365
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5376) Row interface doesn't support nullability on all fields.

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5376?focusedWorklogId=144005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144005
 ]

ASF GitHub Bot logged work on BEAM-5376:


Author: ASF GitHub Bot
Created on: 13/Sep/18 16:41
Start Date: 13/Sep/18 16:41
Worklog Time Spent: 10m 
  Work Description: vectorijk commented on a change in pull request #6383: 
[BEAM-5376] Support nullability on all Row types
URL: https://github.com/apache/beam/pull/6383#discussion_r217455483
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/JavaBeanSchemaTest.java
 ##
 @@ -155,10 +155,10 @@ public void testRecursiveGetters() throws 
NoSuchSchemaException {
 
 Row nestedRow = row.getRow("nested");
 assertEquals("string", nestedRow.getString("str"));
-assertEquals((byte) 1, nestedRow.getByte("aByte"));
-assertEquals((short) 2, nestedRow.getInt16("aShort"));
-assertEquals((int) 3, nestedRow.getInt32("anInt"));
-assertEquals((long) 4, nestedRow.getInt64("aLong"));
+assertEquals((byte) 1, (Object) nestedRow.getByte("aByte"));
 
 Review comment:
   or Byte.valueof("1") ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144005)
Time Spent: 1h 40m  (was: 1.5h)

> Row interface doesn't support nullability on all fields.
> 
>
> Key: BEAM-5376
> URL: https://issues.apache.org/jira/browse/BEAM-5376
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> For example: 
> {code:java}
> public boolean getBoolean(int idx);{code}
> Should be:
> {code:java}
> public Boolean getBoolean(int idx);{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5354) Side Inputs seems to be non-working in the sdk-go

2018-09-13 Thread Henning Rohde (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613736#comment-16613736
 ] 

Henning Rohde commented on BEAM-5354:
-

Technically, the side input feature on Dataflow is not implemented yet: 
https://issues.apache.org/jira/browse/BEAM-3286. I ran into an error on 
Dataflow that might be explained by Robert's work, but never got back to it. So 
YMMV.

> Side Inputs seems to be non-working in the sdk-go
> -
>
> Key: BEAM-5354
> URL: https://issues.apache.org/jira/browse/BEAM-5354
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Tomas Roos
>Priority: Major
>
> Running the contains example fails with
>  
> {code:java}
> Output i0 for step was not found.
> {code}
> This is because of the call to debug.Head (which internally uses SideInput)
> Removing the following line 
> [https://github.com/apache/beam/blob/master/sdks/go/examples/contains/contains.go#L50]
>  
> The pipeline executes well.
>  
> Executed on id's
>  
> go-job-1-1536664417610678545 
> vs
> go-job-1-1536664934354466938
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-5378) Ensure all Go SDK examples run successfully

2018-09-13 Thread Robert Burke (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613728#comment-16613728
 ] 

Robert Burke edited comment on BEAM-5378 at 9/13/18 4:28 PM:
-

For an overview of certain design choices made for the Go SDK, please see the 
RFC that was published alongside its initial import: 

[https://s.apache.org/beam-go-sdk-design-rfc]

Relevant to one of your questions above: Go doesn't have serialized type 
support. This means that structural DoFns need to have their types registered 
(with beam.RegisterType) so that distributed runners can run them on worker 
machines properly. The direct runner is, as described, "direct" and doesn't 
serialize anything, so it doesn't require these registrations.



> Ensure all Go SDK examples run successfully
> ---
>
> Key: BEAM-5378
> URL: https://issues.apache.org/jira/browse/BEAM-5378
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: Not applicable
>Reporter: Tomas Roos
>Priority: Major
>
> I've been spending a day or so running through the examples available for the 
> Go SDK in order to see what works on which runner (direct, dataflow) and 
> what doesn't, and here are the results.
> All available examples for the Go SDK. For me, as a new developer on Apache 
> Beam and Dataflow, it would be of tremendous value to have all examples 
> running, because many of them have legitimate use-cases behind them. 
> {code:java}
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> ├── contains
> │   └── contains.go
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> │   ├── filter
> │   │   └── filter.go
> │   ├── join
> │   │   └── join.go
> │   ├── max
> │   │   └── max.go
> │   └── tornadoes
> │   └── tornadoes.go
> ├── debugging_wordcount
> │   └── debugging_wordcount.go
> ├── forest
> │   └── forest.go
> ├── grades
> │   └── grades.go
> ├── minimal_wordcount
> │   └── minimal_wordcount.go
> ├── multiout
> │   └── multiout.go
> ├── pingpong
> │   └── pingpong.go
> ├── streaming_wordcap
> │   └── wordcap.go
> ├── windowed_wordcount
> │   └── windowed_wordcount.go
> ├── wordcap
> │   └── wordcap.go
> ├── wordcount
> │   └── wordcount.go
> └── yatzy
> └── yatzy.go
> {code}
> All examples that are supposed to be runnable by the direct driver (not 
> depending on GCP platform services) are runnable.
> On the other hand, these are the tests that need to be updated because they 
> are not runnable on the Dataflow platform, for various reasons.
> I tried to figure them out, and all I can do is pinpoint at least where each 
> fails, since my knowledge of the Beam / Dataflow internals is so far limited.
> .
> ├── complete
> │   └── autocomplete
> │   └── autocomplete.go
> Runs successfully if swapping the input to one of the Shakespeare data files 
> from gs://
> But when running this it yields an error from the top.Largest func (discussed 
> in another issue: top.Largest needs to have a serializable combinator / 
> accumulator)
> ➜  autocomplete git:(master) ✗ ./autocomplete --project fair-app-213019 
> --runner dataflow --staging_location=gs://fair-app-213019/staging-test2 
> --worker_harness_container_image=apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515
>  
> 2018/09/11 15:35:26 Running autocomplete
> Unable to encode combiner for lifting: failed to encode custom coder: bad 
> underlying type: bad field type: bad element: unencodable type: interface 
> {}2018/09/11 15:35:26 Using running binary as worker binary: './autocomplete'
> 2018/09/11 15:35:26 Staging worker binary: ./autocomplete
> ├── contains
> │   └── contains.go
> Fails when running debug.Head, for some mysterious reason; it might have to do 
> with the param passing into the x,y iterator. Frankly, I don't know and could 
> not figure it out.
> But after removing the debug.Head call, everything works as expected and succeeds.
> ├── cookbook
> │   ├── combine
> │   │   └── combine.go
> Fails because extractFn, which is a struct, is not registered through 
> beam.RegisterType (is this a must or not?)
> Registering it works as a workaround, at least.
> ➜  combine git:(master) ✗ ./combine 
> 

[jira] [Work logged] (BEAM-5376) Row interface doesn't support nullability on all fields.

2018-09-13 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5376?focusedWorklogId=144002=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-144002
 ]

ASF GitHub Bot logged work on BEAM-5376:


Author: ASF GitHub Bot
Created on: 13/Sep/18 16:28
Start Date: 13/Sep/18 16:28
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #6383: 
[BEAM-5376] Support nullability on all Row types
URL: https://github.com/apache/beam/pull/6383#discussion_r217450564
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/schemas/JavaBeanSchemaTest.java
 ##
 @@ -155,10 +155,10 @@ public void testRecursiveGetters() throws 
NoSuchSchemaException {
 
 Row nestedRow = row.getRow("nested");
 assertEquals("string", nestedRow.getString("str"));
-assertEquals((byte) 1, nestedRow.getByte("aByte"));
-assertEquals((short) 2, nestedRow.getInt16("aShort"));
-assertEquals((int) 3, nestedRow.getInt32("anInt"));
-assertEquals((long) 4, nestedRow.getInt64("aLong"));
+assertEquals((byte) 1, (Object) nestedRow.getByte("aByte"));
 
 Review comment:
   It is unfortunate that there's `assertEquals(long, long)` overload that is 
applicable here.
   
   Correct way of doing `Byte.valueOf()` is `Byte.valueOf((byte) 1)`, which is 
ugly. Probably even uglier way is `(Byte)(byte)1`. Another ugly approach is to 
create a local wrapper method `void myAssertEquals(Object a, Object b) { 
assertEquals(a, b); }`.
   
   I mentioned `assertTrue()` as an example, not suggesting to use it.
   
   To Rui's point, though, if you look at how `Byte.equals()` works, it accepts 
an `Object` and casts it to `Byte`, so in this case for semantics it shouldn't 
matter whether to cast it at all, to cast it to `Byte` or `Object`. Casting 
here only allows to unambiguously choose the `assertEquals(Object, Object)` 
overload over `assertEquals(long, long)`, and I don't have an idea how to do 
that cleanly, so I don't mind either your approach or any of the proposed in 
the discussion.
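The overload-resolution point discussed above can be shown with a minimal standalone sketch. Note this uses hypothetical `check` methods, not JUnit's actual `assertEquals`: with both a primitive and an `Object` overload in scope, a `byte` argument widens to `long` and selects the primitive overload, while casting the argument to `Object` unambiguously selects the `Object` overload, which is exactly what the `(Object)` cast in the diff accomplishes.

```java
public class OverloadDemo {
    // Stand-ins for assertEquals(long, long) and assertEquals(Object, Object).
    static String check(long a) { return "long overload"; }
    static String check(Object a) { return "Object overload"; }

    public static void main(String[] args) {
        // A byte widens to long without boxing, so the primitive overload wins:
        System.out.println(check((byte) 1));          // prints "long overload"
        // Casting to Object rules out the long overload entirely:
        System.out.println(check((Object) (byte) 1)); // prints "Object overload"
    }
}
```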


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 144002)
Time Spent: 1.5h  (was: 1h 20m)

> Row interface doesn't support nullability on all fields.
> 
>
> Key: BEAM-5376
> URL: https://issues.apache.org/jira/browse/BEAM-5376
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> For example: 
> {code:java}
> public boolean getBoolean(int idx);{code}
> Should be:
> {code:java}
> public Boolean getBoolean(int idx);{code}
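A minimal standalone sketch of why the boxed return type matters for nullability. This uses a hypothetical `NullableFieldDemo` class, not Beam's actual `Row`: a primitive `boolean` has no way to represent an absent value, while `Boolean` can be `null`.

```java
public class NullableFieldDemo {
    private final Boolean value; // null means the field is unset

    NullableFieldDemo(Boolean value) { this.value = value; }

    // Boxed return type lets callers distinguish false from null.
    public Boolean getBoolean() { return value; }

    public static void main(String[] args) {
        NullableFieldDemo set = new NullableFieldDemo(true);
        NullableFieldDemo unset = new NullableFieldDemo(null);
        System.out.println(set.getBoolean());   // prints "true"
        System.out.println(unset.getBoolean()); // prints "null"
        // With a primitive `boolean getBoolean()`, returning a null field
        // would throw a NullPointerException during auto-unboxing.
    }
}
```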



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
