[jira] [Resolved] (BEAM-8947) Change Jenkins jobs configuration to account for new Flink Docker image names

2019-12-19 Thread Michal Walenia (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michal Walenia resolved BEAM-8947.
--
Fix Version/s: Not applicable
   Resolution: Fixed

> Change Jenkins jobs configuration to account for new Flink Docker image names
> -
>
> Key: BEAM-8947
> URL: https://issues.apache.org/jira/browse/BEAM-8947
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michal Walenia
>Assignee: Michal Walenia
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> After changes in 8b859ab8a52778d1bc14ca76f2eef7c9e70d528d Flink Docker 
> containers changed their names. Because of that, Jenkins jobs that weren't 
> changed to account for it fail.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8941) Create a common place for Load Tests configuration

2019-12-19 Thread Pawel Pasterz (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pawel Pasterz reassigned BEAM-8941:
---

Assignee: Pawel Pasterz

> Create a common place for Load Tests configuration
> --
>
> Key: BEAM-8941
> URL: https://issues.apache.org/jira/browse/BEAM-8941
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Pawel Pasterz
>Priority: Minor
> Fix For: Not applicable
>
>
> The Apache Beam community maintains different versions of each Load Test. For 
> example, right now, there are two versions of all Python Load Tests: the 
> first one runs on Dataflow runner, and the second one runs on Flink. With the 
> lack of a common place where configuration for the tests can be stored, the 
> configuration is duplicated many times with minimal differences.
> The goal is to create a common place for the configuration, so that it could 
> be passed to different files with tests (.test-infra/jenkins/*.groovy) and 
> filtered according to needs.
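A minimal sketch of the idea, written in Python for brevity rather than in the Groovy used by the Jenkins job files. All names here (LOAD_TESTS, configs_for, the test names and option keys) are hypothetical and only illustrate keeping the shared options in one place and filtering them per runner:

```python
# Hypothetical shared configuration: one entry per load test, with the
# options common to all runners stored once and per-runner overrides beside them.
LOAD_TESTS = [
    {
        "name": "gbk_small",
        "common": {"input_records": 1000, "fanout": 1},
        "runners": {
            "dataflow": {"num_workers": 5},
            "flink": {"parallelism": 5},
        },
    },
    {
        "name": "combine_large",
        "common": {"input_records": 100000, "fanout": 4},
        "runners": {"dataflow": {"num_workers": 16}},
    },
]


def configs_for(runner, tests=LOAD_TESTS):
    """Return the merged options of every test that is defined for `runner`."""
    result = []
    for test in tests:
        overrides = test["runners"].get(runner)
        if overrides is None:
            continue  # this test is not run on the requested runner
        options = dict(test["common"])  # copy so the shared entry is not mutated
        options.update(overrides)
        options["runner"] = runner
        result.append({"name": test["name"], "options": options})
    return result
```

With this layout, a job definition for Flink would call configs_for("flink") and receive only the tests that declare Flink options, already merged with the common settings.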





[jira] [Created] (BEAM-9008) add readAll() method to CassandraIO

2019-12-19 Thread vincent marquez (Jira)
vincent marquez created BEAM-9008:
-

 Summary: add readAll() method to CassandraIO
 Key: BEAM-9008
 URL: https://issues.apache.org/jira/browse/BEAM-9008
 Project: Beam
  Issue Type: New Feature
  Components: io-java-cassandra
Affects Versions: 2.16.0
Reporter: vincent marquez


When querying a large Cassandra database, it's often *much* more useful to 
programmatically generate the queries that need to be run rather than reading 
all partitions and attempting some filtering.  

As an example:
{code:java}
public class Event {
   @PartitionKey(0) public UUID accountId;
   @PartitionKey(1) public String yearMonthDay;
   @ClusteringKey public UUID eventId;
   // other data...
}
{code}
If there are ten years' worth of data, you may want to query only one year's 
worth.  Here each token range would represent one 'token' but all events for 
the day. 
{code:java}
Set accounts = getRelevantAccounts();
Set dateRange = generateDateRange("2018-01-01", "2019-01-01");
PCollection tokens = generateTokens(accounts, dateRange); 
{code}
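As a rough illustration of what the query generation above amounts to, here is a sketch in Python rather than Java. The function names mirror the hypothetical helpers in the snippet (generateDateRange, generateTokens) but are assumptions, and the end date is treated as exclusive, which the issue does not specify:

```python
from datetime import date, timedelta
from itertools import product


def generate_date_range(start, end):
    """Yield 'YYYY-MM-DD' strings for every day in [start, end)."""
    day = date.fromisoformat(start)
    stop = date.fromisoformat(end)
    while day < stop:
        yield day.isoformat()
        day += timedelta(days=1)


def generate_partition_keys(accounts, days):
    """Pair every account with every day: one (accountId, yearMonthDay) per query."""
    return list(product(sorted(accounts), days))
```

Each resulting pair identifies exactly one partition of the Event table, so a readAll()-style transform would only touch the partitions that are actually needed.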
 

 I propose an additional _readAll()_ PTransform that can take a PCollection of 
token ranges and can return a PCollection of what the query would return. 

*Question: How much code should be in common between both methods?* 
Currently the read connector already groups all partitions into a List of Token 
Ranges, so it would be simple to refactor the current read()-based method into a 
'ParDo'-based one and have them both share the same function.  Reasons against 
sharing code between read and readAll:
 * Not having the read based method return a BoundedSource connector would mean 
losing the ability to know the size of the data returned

 * Currently the CassandraReader executes all the grouped TokenRange queries 
*asynchronously* which is (maybe?) fine when all that's happening is splitting 
up all the partition ranges but terrible for executing potentially millions of 
queries. 

 Reasons _for_ sharing code would be a simplified code base and that both of the 
above issues would most likely have a negligible performance impact. 



 

 






 





[jira] [Work logged] (BEAM-9005) Go SDK post-commit failures due to https://github.com/apache/beam/pull/10183

2019-12-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9005?focusedWorklogId=361515&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-361515
 ]

ASF GitHub Bot logged work on BEAM-9005:


Author: ASF GitHub Bot
Created on: 20/Dec/19 04:55
Start Date: 20/Dec/19 04:55
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #10432: 
[BEAM-9005] Setting environment ID for ParDo and Combine transforms
URL: https://github.com/apache/beam/pull/10432
 
 

[jira] [Updated] (BEAM-9000) Java Test Assertions without toString for GenericJson subclasses

2019-12-19 Thread Tomo Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomo Suzuki updated BEAM-9000:
--
Description: 
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style is prone to unnecessary maintenance of the test code when upgrading 
dependencies. A dependency upgrade may change the internal ordering of fields and 
thereby cause trivial changes in the output of {{toString()}}. In BEAM-8695, where 
I tried to upgrade google-http-client, there were ~30 comparison failures due to 
these {{toString}} assertions.

The affected objects are subclasses of {{com.google.api.client.json.GenericJson}}. 

There are several options for improving these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}
Credit: Ben Whitehead.
h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that an instance of GenericJson can be instantiated from 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go into a shared test utility class (sdks/testing?)
{code:java}
  private static final JacksonFactory jacksonFactory =
      JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}
A feature request to handle escaping double quotes via JacksonFactory: 
[https://github.com/googleapis/google-http-java-client/issues/923]

 
h1. Option 3: Check JSON equality via JSONassert

* https://github.com/skyscreamer/JSONassert
* https://github.com/hertzsprung/hamcrest-json (Not using as last commit was in 
2012) 

The JSONassert example does not carry escaped double-quote characters. The 
implementation would convert the actual object into a JSON object and call 
{{JSONAssert.assertEquals}}.

Credit: Luke Cwik
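
To illustrate why comparing structures is more robust than comparing string output, here is a small sketch in Python rather than Java. The dictionaries stand in for GenericJson objects and the helper name assert_json_equal is hypothetical; it only mirrors the idea behind Options 1 to 3, not any Beam or google-http-client API:

```python
import json


def assert_json_equal(expected_json_text, actual):
    """Compare parsed structures, so key order and whitespace do not matter."""
    expected = json.loads(expected_json_text)
    assert expected == actual, "expected %r but got %r" % (expected, actual)


# Two results with identical content but a different field order, as might
# happen after a dependency upgrade changes the internal ordering.
result_a = {"cumulative": True, "integer": {"highBits": 0, "lowBits": 0}}
result_b = {"integer": {"lowBits": 0, "highBits": 0}, "cumulative": True}

# The string forms differ, so a string-based assertion would fail...
strings_differ = str(result_a) != str(result_b)
# ...while the structural comparison still passes.
assert_json_equal(
    '{"cumulative": true, "integer": {"highBits": 0, "lowBits": 0}}', result_b)
```

The same reasoning applies to the Java options: a Map or parsed-JSON comparison checks content, whereas a string comparison also checks ordering and formatting.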

 


[jira] [Updated] (BEAM-9000) Java Test Assertions without toString for GenericJson subclasses

2019-12-19 Thread Tomo Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomo Suzuki updated BEAM-9000:
--
Description: 
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style is prone to unnecessary maintenance of the test code when upgrading 
dependencies. A dependency upgrade may change the internal ordering of fields and 
thereby cause trivial changes in the output of {{toString()}}. In BEAM-8695, where 
I tried to upgrade google-http-client, there were ~30 comparison failures due to 
these {{toString}} assertions.

The affected objects are subclasses of {{com.google.api.client.json.GenericJson}}. 

There are several options for improving these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}
Credit: Ben Whitehead.
h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that an instance of GenericJson can be instantiated from 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go into a shared test utility class (sdks/testing?)
{code:java}
  private static final JacksonFactory jacksonFactory =
      JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}
A feature request to handle escaping double quotes via JacksonFactory: 
[https://github.com/googleapis/google-http-java-client/issues/923]

 
h1. Check JSON equality via JSONassert

* https://github.com/skyscreamer/JSONassert
* https://github.com/hertzsprung/hamcrest-json (Not using as last commit was in 
2012) 

The JSONassert example does not carry escaped double-quote characters. The 
implementation would convert the actual object into a JSON object and call 
{{JSONAssert.assertEquals}}.

Credit: Luke Cwik

 


[jira] [Updated] (BEAM-9007) beam.DoFn setup() will call several times when using python subprocess

2019-12-19 Thread Hokuto Tateyama (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hokuto Tateyama updated BEAM-9007:
--
Summary: beam.DoFn setup() will call several times when using python 
subprocess  (was: beam.ParDo setup() will call several times when using python 
subprocess)

> beam.DoFn setup() will call several times when using python subprocess
> --
>
> Key: BEAM-9007
> URL: https://issues.apache.org/jira/browse/BEAM-9007
> Project: Beam
>  Issue Type: Bug
>  Components: beam-community, examples-python
>Affects Versions: 2.15.0, 2.16.0
> Environment: python 3.5
> apache-beam[gcp] == 2.16.*
> google-cloud-storage == 1.23.*
> google-resumable-media == 0.5.*
> googleapis-common-protos == 1.6.*
> grpc-google-logging-v2 == 0.11.*
>Reporter: Hokuto Tateyama
>Assignee: Aizhamal Nurmamat kyzy
>Priority: Minor
>
> Hello. 
>  I'm trying to run a make command on Dataflow in order to use the OpenCV 
> source, which is written in C++.
> I expected the *setup()* function of *beam.DoFn* to run only once, before 
> processing starts.
>  So I ran the build commands in setup(), and they ran successfully.
> h1. Problem
> After processing, setup() runs again and tries to execute the build commands 
> several more times. I confirmed this in the Stackdriver logs.
> h1. Code
> This is the code I run on Dataflow. I defined command_list in the class 
> that inherits from beam.DoFn and call run_cmd() from setup().
> ・Function that runs the command lines.
> {code:python}
> import logging
> import subprocess
> from typing import Any, Dict, List
>
> def run_cmd(command_list: List[List[str]], shell: bool = False) -> List[Dict[str, Any]]:
>     # Run each command in order and collect its combined stdout/stderr.
>     outputs = []
>     try:
>         for cmd in command_list:
>             logging.info(cmd)
>             proc = subprocess.check_output(
>                 cmd, shell=shell, stderr=subprocess.STDOUT,
>                 universal_newlines=True)
>             outputs.append({"Input: ": cmd, "Output: ": proc})
>     except subprocess.CalledProcessError as e:
>         # A failing command stops the loop; outputs collected so far are returned.
>         logging.warning("Return code:{}, Output:{}".format(e.returncode, e.output))
>     return outputs
> {code}
> ・Command list passed to run_cmd().
> {code:python}
> command_list = [
>     ["cat /etc/issue"],
>     ["apt-get --assume-yes update"],
>     [
>         "apt-get --assume-yes install --no-install-recommends ffmpeg git software-properties-common"
>     ],
>     ["apt-get install -y software-properties-common"],
>     [
>         'add-apt-repository -s "deb http://security.ubuntu.com/ubuntu bionic-security main"'
>     ],
>     [
>         "apt-get install -y build-essential checkinstall cmake unzip pkg-config yasm unzip"
>     ],
>     ["apt-get -y install git gfortran python3-dev"],
>     [
>         "apt-get -y install libjpeg62-turbo-dev libpng-dev libpng16-16 libavcodec-dev libavformat-dev libswscale-dev libdc1394-22-dev libxine2-dev libv4l-dev"
>     ],
>     ["apt-get -y install libjpeg-dev libpng-dev libtiff-dev libtbb-dev"],
>     [
>         "apt-get -y install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev libatlas-base-dev libxvidcore-dev libx264-dev libgtk-3-dev"
>     ],
>     ["apt-get clean"],
>     ["rm -rf /var/lib/apt/lists/*"],
>     ["git clone https://github.com/opencv/opencv.git"],
>     ["git clone https://github.com/opencv/opencv_contrib.git"],
>     ["cd opencv_contrib"],
>     ["git checkout -b 3.4.3 refs/tags/3.4.3"],
>     ["cd ../opencv/"],
>     ["git checkout -b 3.4.3 refs/tags/3.4.3"],
>     ["mkdir build"],
>     ["cd build"],
>     [
>         "cmake -D CMAKE_BUILD_TYPE=Release \
>         -D CMAKE_INSTALL_PREFIX=/usr/local \
>         -D WITH_TBB=ON \
>         -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules .."
>     ],
>     ["make -j8"],
>     ["make install"],
>     ["echo /usr/local/lib > /etc/ld.so.conf.d/opencv.conf"],
>     ["ldconfig -v"]
> ]
> {code}
> h1. Question
> In summary, I'm wondering whether this is a bug in Apache Beam.
>  # Why is setup() called several times?
>  # Is there a way to run these commands only once for the whole job? 
> These are the approaches I have tried:
>  ## Using os.system() instead of subprocess. I suspect that subprocess creates 
> another process in setup(), so it cannot detect that the process finished 
> successfully.
>  ## Writing the commands in setup.py and running them as a CustomCommand
>  [https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/]
>  
> Regards, Collonville





[jira] [Updated] (BEAM-9007) beam.ParDo setup() will call several times when using python subprocess

2019-12-19 Thread Hokuto Tateyama (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hokuto Tateyama updated BEAM-9007:
--
Description: 
Hello. 
 I'm trying to run a make command on Dataflow in order to use the OpenCV 
source, which is written in C++.

I expected the *setup()* function of *beam.DoFn* to run only once, before 
processing starts.
 So I ran the build commands in setup(), and they ran successfully.
h1. Problem

After processing, setup() runs again and tries to execute the build commands 
several more times. I confirmed this in the Stackdriver logs.
h1. Code

This is the code I run on Dataflow. I defined command_list in the class that 
inherits from beam.DoFn and call run_cmd() from setup().

・Function that runs the command lines.
{code:python}
import logging
import subprocess
from typing import Any, Dict, List

def run_cmd(command_list: List[List[str]], shell: bool = False) -> List[Dict[str, Any]]:
    # Run each command in order and collect its combined stdout/stderr.
    outputs = []
    try:
        for cmd in command_list:
            logging.info(cmd)
            proc = subprocess.check_output(
                cmd, shell=shell, stderr=subprocess.STDOUT,
                universal_newlines=True)
            outputs.append({"Input: ": cmd, "Output: ": proc})
    except subprocess.CalledProcessError as e:
        # A failing command stops the loop; outputs collected so far are returned.
        logging.warning("Return code:{}, Output:{}".format(e.returncode, e.output))
    return outputs
{code}
・Command list passed to run_cmd().
{code:python}
command_list = [
    ["cat /etc/issue"],
    ["apt-get --assume-yes update"],
    [
        "apt-get --assume-yes install --no-install-recommends ffmpeg git software-properties-common"
    ],
    ["apt-get install -y software-properties-common"],
    [
        'add-apt-repository -s "deb http://security.ubuntu.com/ubuntu bionic-security main"'
    ],
    [
        "apt-get install -y build-essential checkinstall cmake unzip pkg-config yasm unzip"
    ],
    ["apt-get -y install git gfortran python3-dev"],
    [
        "apt-get -y install libjpeg62-turbo-dev libpng-dev libpng16-16 libavcodec-dev libavformat-dev libswscale-dev libdc1394-22-dev libxine2-dev libv4l-dev"
    ],
    ["apt-get -y install libjpeg-dev libpng-dev libtiff-dev libtbb-dev"],
    [
        "apt-get -y install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev libatlas-base-dev libxvidcore-dev libx264-dev libgtk-3-dev"
    ],
    ["apt-get clean"],
    ["rm -rf /var/lib/apt/lists/*"],
    ["git clone https://github.com/opencv/opencv.git"],
    ["git clone https://github.com/opencv/opencv_contrib.git"],
    ["cd opencv_contrib"],
    ["git checkout -b 3.4.3 refs/tags/3.4.3"],
    ["cd ../opencv/"],
    ["git checkout -b 3.4.3 refs/tags/3.4.3"],
    ["mkdir build"],
    ["cd build"],
    [
        "cmake -D CMAKE_BUILD_TYPE=Release \
        -D CMAKE_INSTALL_PREFIX=/usr/local \
        -D WITH_TBB=ON \
        -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules .."
    ],
    ["make -j8"],
    ["make install"],
    ["echo /usr/local/lib > /etc/ld.so.conf.d/opencv.conf"],
    ["ldconfig -v"]
]
{code}
h1. Question

In summary, I'm wondering whether this is a bug in Apache Beam.
 # Why is setup() called several times?
 # Is there a way to run these commands only once for the whole job? 
These are the approaches I have tried:
 ## Using os.system() instead of subprocess. I suspect that subprocess creates 
another process in setup(), so it cannot detect that the process finished 
successfully.
 ## Writing the commands in setup.py and running them as a CustomCommand
 [https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/]

 

Regards, Collonville


[jira] [Created] (BEAM-9007) beam.ParDo setup() will call several times when using python subprocess

2019-12-19 Thread Hokuto Tateyama (Jira)
Hokuto Tateyama created BEAM-9007:
-

 Summary: beam.ParDo setup() will call several times when using 
python subprocess
 Key: BEAM-9007
 URL: https://issues.apache.org/jira/browse/BEAM-9007
 Project: Beam
  Issue Type: Bug
  Components: beam-community, examples-python
Affects Versions: 2.16.0, 2.15.0
 Environment: python 3.5
apache-beam[gcp] == 2.16.*
google-cloud-storage == 1.23.*
google-resumable-media == 0.5.*
googleapis-common-protos == 1.6.*
grpc-google-logging-v2 == 0.11.*

Reporter: Hokuto Tateyama
Assignee: Aizhamal Nurmamat kyzy


Hello. 
 I'm trying to run a make command on Dataflow in order to use the OpenCV 
source, which is written in C++.

I expected the *setup()* function of *beam.DoFn* to run only once, before 
processing starts.
 So I ran the build commands in setup(), and they ran successfully.
h1. Problem

After processing, setup() runs again and tries to execute the build commands 
several more times. I confirmed this in the Stackdriver logs.
h1. Code

This is the code I run on Dataflow. I defined command_list in the class that 
inherits from beam.DoFn and call run_cmd() from setup().

・Function that runs the command lines.
{code:python}
import logging
import subprocess
from typing import Any, Dict, List

def run_cmd(command_list: List[List[str]], shell: bool = False) -> List[Dict[str, Any]]:
    # Run each command in order and collect its combined stdout/stderr.
    outputs = []
    try:
        for cmd in command_list:
            logging.info(cmd)
            proc = subprocess.check_output(
                cmd, shell=shell, stderr=subprocess.STDOUT,
                universal_newlines=True)
            outputs.append({"Input: ": cmd, "Output: ": proc})
    except subprocess.CalledProcessError as e:
        # A failing command stops the loop; outputs collected so far are returned.
        logging.warning("Return code:{}, Output:{}".format(e.returncode, e.output))
    return outputs
{code}
・Command list passed to run_cmd().
{code:python}
command_list = [
    ["cat /etc/issue"],
    ["apt-get --assume-yes update"],
    [
        "apt-get --assume-yes install --no-install-recommends ffmpeg git software-properties-common"
    ],
    ["apt-get install -y software-properties-common"],
    [
        'add-apt-repository -s "deb http://security.ubuntu.com/ubuntu bionic-security main"'
    ],
    [
        "apt-get install -y build-essential checkinstall cmake unzip pkg-config yasm unzip"
    ],
    ["apt-get -y install git gfortran python3-dev"],
    [
        "apt-get -y install libjpeg62-turbo-dev libpng-dev libpng16-16 libavcodec-dev libavformat-dev libswscale-dev libdc1394-22-dev libxine2-dev libv4l-dev"
    ],
    ["apt-get -y install libjpeg-dev libpng-dev libtiff-dev libtbb-dev"],
    [
        "apt-get -y install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev libatlas-base-dev libxvidcore-dev libx264-dev libgtk-3-dev"
    ],
    ["apt-get clean"],
    ["rm -rf /var/lib/apt/lists/*"],
    ["git clone https://github.com/opencv/opencv.git"],
    ["git clone https://github.com/opencv/opencv_contrib.git"],
    ["cd opencv_contrib"],
    ["git checkout -b 3.4.3 refs/tags/3.4.3"],
    ["cd ../opencv/"],
    ["git checkout -b 3.4.3 refs/tags/3.4.3"],
    ["mkdir build"],
    ["cd build"],
    [
        "cmake -D CMAKE_BUILD_TYPE=Release \
        -D CMAKE_INSTALL_PREFIX=/usr/local \
        -D WITH_TBB=ON \
        -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules .."
    ],
    ["make -j8"],
    ["make install"],
    ["echo /usr/local/lib > /etc/ld.so.conf.d/opencv.conf"],
    ["ldconfig -v"]
]
{code}
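A side note on the command list: because every entry is started as a separate child process, a standalone "cd" entry changes the directory only inside that child and has no effect on the commands that follow. The sketch below shows this behaviour and one possible alternative, passing the working directory explicitly through the cwd argument of subprocess. The helper name run_in_dir and the temporary directories are assumptions for illustration, not part of the reported code:

```python
import os
import subprocess
import sys
import tempfile


def run_in_dir(args, cwd):
    """Run one command (an argument list, no shell) in the given working directory."""
    return subprocess.check_output(
        args, cwd=cwd, stderr=subprocess.STDOUT, universal_newlines=True)


# A tiny child program that prints its own working directory.
PRINT_CWD = [sys.executable, "-c", "import os; print(os.getcwd())"]


def demo():
    with tempfile.TemporaryDirectory() as work_dir:
        build_dir = os.path.join(work_dir, "build")
        os.mkdir(build_dir)
        # "cd build" succeeds, but only the short-lived shell changes directory.
        subprocess.check_call("cd build", shell=True, cwd=work_dir)
        after_cd = run_in_dir(PRINT_CWD, cwd=work_dir).strip()
        # Passing cwd= is what actually selects the directory for a command.
        in_build = run_in_dir(PRINT_CWD, cwd=build_dir).strip()
        return (os.path.realpath(after_cd) == os.path.realpath(work_dir),
                os.path.realpath(in_build) == os.path.realpath(build_dir))
```

Applied to the list above, the "git checkout", "cmake" and "make" steps would each receive the directory they are meant to run in, instead of relying on earlier "cd" entries.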
h1. Question

In summary, I'm wondering whether this is a bug in Apache Beam.
 # Why is setup() called several times?
 # Is there a way to run these commands only once for the whole job? 
These are the approaches I have tried:
 ## Using os.system() instead of subprocess. I suspect that subprocess creates 
another process in setup(), so it cannot detect that the process finished 
successfully.
 ## Writing the commands in setup.py and running them as a CustomCommand
[https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/]
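
Regarding question 2, one possible approach is sketched below. It assumes that setup() is invoked once per DoFn instance and that a runner may create several instances on the same worker, which would explain repeated calls. The helper run_once and the marker-file path are hypothetical and are not part of the Beam API; the idea is simply to let only the first caller on a machine perform the expensive build:

```python
import os


def run_once(marker_path, action):
    """Run action() only if marker_path does not exist yet.

    Returns True if this call ran the action, False if it was skipped.
    """
    try:
        # O_EXCL makes the creation atomic: only one caller can create the marker.
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    action()
    return True
```

Inside setup(), the call could look like run_once("/tmp/opencv_build.done", lambda: run_cmd(command_list, shell=True)), where the path is an arbitrary example. Note the limits of this sketch: the marker is created before the action finishes, so other instances do not wait for the build to complete, and a failed build leaves the marker behind.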

 

Regards, Collonville





[jira] [Updated] (BEAM-9006) Meta space memory leak caused by the shutdown hook of ProcessManager

2019-12-19 Thread sunjincheng (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunjincheng updated BEAM-9006:
--
Fix Version/s: (was: 2.18.0)
   2.19.0

> Meta space memory leak caused by the shutdown hook of ProcessManager 
> -
>
> Key: BEAM-9006
> URL: https://issues.apache.org/jira/browse/BEAM-9006
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.19.0
>
>
> Currently, the class `ProcessManager` adds a shutdown hook to stop all 
> living processes before the JVM exits. The shutdown hook is never removed. 
> If this class is loaded by the user class loader, the user class loader 
> cannot be garbage collected, which eventually causes a metaspace memory 
> leak.





[jira] [Created] (BEAM-9006) Meta space memory leak caused by the shutdown hook of ProcessManager

2019-12-19 Thread sunjincheng (Jira)
sunjincheng created BEAM-9006:
-

 Summary: Meta space memory leak caused by the shutdown hook of 
ProcessManager 
 Key: BEAM-9006
 URL: https://issues.apache.org/jira/browse/BEAM-9006
 Project: Beam
  Issue Type: Bug
  Components: java-fn-execution
Reporter: sunjincheng
Assignee: sunjincheng
 Fix For: 2.18.0


Currently, the class `ProcessManager` adds a shutdown hook to stop all 
living processes before the JVM exits. The shutdown hook is never removed. If 
this class is loaded by the user class loader, the user class loader cannot be 
garbage collected, which eventually causes a metaspace memory leak.
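
The retention pattern is not specific to the JVM. The sketch below is a Python analogy, not the Java `ProcessManager`: a hook registered with `atexit` holds a reference to its owner until it is unregistered, much as a registered JVM shutdown hook keeps its class reachable. `HookedManager`, `stop_all`, `close` and `is_collected_after_drop` are hypothetical names used only for illustration.

```python
import atexit
import gc
import weakref


class HookedManager:
    """Hypothetical stand-in for a manager that registers an exit hook."""

    def __init__(self):
        # The bound method references self, so the atexit registry keeps
        # this instance alive for as long as the hook stays registered.
        atexit.register(self.stop_all)

    def stop_all(self):
        pass  # a real manager would stop its child processes here

    def close(self):
        # Removing the hook drops the registry's reference to this instance.
        atexit.unregister(self.stop_all)


def is_collected_after_drop(unregister_first):
    """Return True if the manager is collected once our own reference is dropped."""
    manager = HookedManager()
    ref = weakref.ref(manager)
    if unregister_first:
        manager.close()
    del manager
    gc.collect()
    collected = ref() is None
    if not collected:
        ref().close()  # clean up so the sketch itself does not leak the hook
    return collected
```

Without the `close()` call the instance stays reachable through the hook registry; removing the hook when the owner is done is what lets it be collected.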





[jira] [Updated] (BEAM-9005) Go SDK post-commit failures due to https://github.com/apache/beam/pull/10183

2019-12-19 Thread Chamikara Madhusanka Jayalath (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Madhusanka Jayalath updated BEAM-9005:

Summary: Go SDK post-commit  failures due to 
https://github.com/apache/beam/pull/10183  (was: beam_PostCommit_Go_VR_Flink  
failure due to https://github.com/apache/beam/pull/10183)

> Go SDK post-commit  failures due to https://github.com/apache/beam/pull/10183
> -
>
> Key: BEAM-9005
> URL: https://issues.apache.org/jira/browse/BEAM-9005
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Critical
>
> Looking into this.
>  
> cc: [~bhulette] [~lostluck] [~danoliveira]





[jira] [Updated] (BEAM-9005) beam_PostCommit_Go_VR_Flink failure due to https://github.com/apache/beam/pull/10183

2019-12-19 Thread Chamikara Madhusanka Jayalath (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Madhusanka Jayalath updated BEAM-9005:

Issue Type: Bug  (was: Improvement)

> beam_PostCommit_Go_VR_Flink  failure due to 
> https://github.com/apache/beam/pull/10183
> -
>
> Key: BEAM-9005
> URL: https://issues.apache.org/jira/browse/BEAM-9005
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Critical
>
> Looking into this.
>  
> cc: [~bhulette] [~lostluck] [~danoliveira]





[jira] [Created] (BEAM-9005) beam_PostCommit_Go_VR_Flink failure due to https://github.com/apache/beam/pull/10183

2019-12-19 Thread Chamikara Madhusanka Jayalath (Jira)
Chamikara Madhusanka Jayalath created BEAM-9005:
---

 Summary: beam_PostCommit_Go_VR_Flink  failure due to 
https://github.com/apache/beam/pull/10183
 Key: BEAM-9005
 URL: https://issues.apache.org/jira/browse/BEAM-9005
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Chamikara Madhusanka Jayalath
Assignee: Chamikara Madhusanka Jayalath


Looking into this.

 

cc: [~bhulette] [~lostluck] [~danoliveira]





[jira] [Work logged] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2019-12-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8944?focusedWorklogId=361501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-361501
 ]

ASF GitHub Bot logged work on BEAM-8944:


Author: ASF GitHub Bot
Created on: 20/Dec/19 00:55
Start Date: 20/Dec/19 00:55
Worklog Time Spent: 10m 
  Work Description: y1chi commented on pull request #10430: [BEAM-8944] 
Change to use single thread in py sdk bundle progress rep…
URL: https://github.com/apache/beam/pull/10430
 
 
   …ort (#10387)
   
   * [BEAM-8944] Change to single thread executor in python sdk bundle progress 
report
   
   (cherry picked from commit 794e58d8089c7013eb14475f1768372a26b7b386)
   

[jira] [Resolved] (BEAM-8269) IOTypehints.from_callable doesn't convert native type hints to Beam

2019-12-19 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri resolved BEAM-8269.
-
Fix Version/s: Not applicable
   Resolution: Fixed

> IOTypehints.from_callable doesn't convert native type hints to Beam
> ---
>
> Key: BEAM-8269
> URL: https://issues.apache.org/jira/browse/BEAM-8269
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Users typically write type hints using typing module types. We should allow 
> that, but internally convert these types to Beam module types for now.
> In the future, Beam should stop using these internal types (BEAM-8156).
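
To illustrate what converting a native hint into an internal form can look like, here is a minimal standard-library sketch. It is not Beam's implementation: `to_internal_hint`, `_CONTAINER_NAMES` and the `(name, args)` tuple representation are hypothetical, and only a few container types are handled. It requires Python 3.8+ for `typing.get_origin` and `typing.get_args`.

```python
import typing

# Hypothetical internal representation: (container name, element hints).
_CONTAINER_NAMES = {list: 'List', dict: 'Dict', tuple: 'Tuple', set: 'Set'}


def to_internal_hint(hint):
    """Convert a typing-module hint into a nested (name, args) tuple."""
    origin = typing.get_origin(hint)
    if origin is None:
        return hint  # plain class such as int or str: keep as-is
    name = _CONTAINER_NAMES.get(origin)
    if name is None:
        raise TypeError('unsupported hint: %r' % (hint,))
    # Recurse so nested hints such as Dict[str, List[int]] are converted too.
    return (name, tuple(to_internal_hint(arg) for arg in typing.get_args(hint)))
```

The recursion is the part that matters: a conversion that only handles the outermost hint would leave nested typing types unconverted.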





[jira] [Created] (BEAM-9004) Update Mockito Matchers usage to ArgumentMatchers since Matchers is deprecated in Mockito 2

2019-12-19 Thread Luke Cwik (Jira)
Luke Cwik created BEAM-9004:
---

 Summary: Update Mockito Matchers usage to ArgumentMatchers since 
Matchers is deprecated in Mockito 2
 Key: BEAM-9004
 URL: https://issues.apache.org/jira/browse/BEAM-9004
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core, testing
Reporter: Luke Cwik








[jira] [Resolved] (BEAM-8976) No default logging story for Pipeline construction time in Python

2019-12-19 Thread Pablo Estrada (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-8976.
-
Resolution: Fixed

> No default logging story for Pipeline construction time in Python
> -
>
> Key: BEAM-8976
> URL: https://issues.apache.org/jira/browse/BEAM-8976
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.18.0
>
>
> With changes to logging, no logging is happening on the root loggers, and 
> thus, no basic setup is being done.
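
The symptom can be reproduced with the standard library alone: when the root logger has no handler, Python falls back to a last-resort handler that only emits WARNING and above, so INFO messages written at pipeline construction time are dropped. The sketch below shows one way to install a default handler only when none exists; `ensure_root_logging` is a hypothetical helper, not the change made for this issue.

```python
import logging


def ensure_root_logging(level=logging.INFO):
    """Install a basic stderr handler on the root logger if it has none.

    Returns True when a handler was installed by this call.
    """
    root = logging.getLogger()
    if root.handlers:
        return False  # logging is already configured; leave it alone
    logging.basicConfig(level=level)
    return True
```

Checking `root.handlers` first keeps the helper from overriding a configuration the user has already set up.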





[jira] [Comment Edited] (BEAM-106) Native support for conditional iteration

2019-12-19 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000494#comment-17000494
 ] 

Luke Cwik edited comment on BEAM-106 at 12/20/19 12:04 AM:
---

[~nishant009]:

Does using a glob expression for the path work when specifying a [file based 
read 
transform|https://beam.apache.org/releases/pydoc/2.2.0/apache_beam.io.filebasedsource.html]
 (e.g. gs://bucket/a/b/c/** for the file_pattern)?

 

 


was (Author: lcwik):
[~nishant009]:

Does using a glob expression for the path work when specifying a [file based 
read 
transform|https://beam.apache.org/releases/pydoc/2.2.0/apache_beam.io.filebasedsource.html]
 (e.g. gs://bucket/a/b/c/**)?

 

 

> Native support for conditional iteration
> 
>
> Key: BEAM-106
> URL: https://issues.apache.org/jira/browse/BEAM-106
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Luke Cwik
>Priority: Major
>
> Ported from: https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/50
> There are a variety of use cases which would benefit from native support for 
> conditional iteration.
> For instance, 
> http://stackoverflow.com/questions/31654421/conditional-iterations-in-google-cloud-dataflow/31659923?noredirect=1#comment51264604_31659923
>  asks about being able to write a loop like the following:
> {code}
> PCollection data  = ...
> while(needsMoreWork(data)) {
>   data = doAStep(data)
> }
> {code}
> If there are specific use cases please let us know the details. In the future 
> we will use this issue to post progress updates.





[jira] [Commented] (BEAM-106) Native support for conditional iteration

2019-12-19 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000494#comment-17000494
 ] 

Luke Cwik commented on BEAM-106:


[~nishant009]:

Does using a glob expression for the path work when specifying a [file based 
read 
transform|https://beam.apache.org/releases/pydoc/2.2.0/apache_beam.io.filebasedsource.html]
 (e.g. gs://bucket/a/b/c/**)?

 

 

> Native support for conditional iteration
> 
>
> Key: BEAM-106
> URL: https://issues.apache.org/jira/browse/BEAM-106
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Luke Cwik
>Priority: Major
>
> Ported from: https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/50
> There are a variety of use cases which would benefit from native support for 
> conditional iteration.
> For instance, 
> http://stackoverflow.com/questions/31654421/conditional-iterations-in-google-cloud-dataflow/31659923?noredirect=1#comment51264604_31659923
>  asks about being able to write a loop like the following:
> {code}
> PCollection data  = ...
> while(needsMoreWork(data)) {
>   data = doAStep(data)
> }
> {code}
> If there are specific use cases please let us know the details. In the future 
> we will use this issue to post progress updates.





[jira] [Commented] (BEAM-8968) portableWordCount test for Spark/Flink failing: jar not found

2019-12-19 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000492#comment-17000492
 ] 

Valentyn Tymofieiev commented on BEAM-8968:
---

This is still failing on 
[https://github.com/apache/beam/pull/10378|https://github.com/apache/beam/pull/10378].

 

> portableWordCount test for Spark/Flink failing: jar not found
> -
>
> Key: BEAM-8968
> URL: https://issues.apache.org/jira/browse/BEAM-8968
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: currently-failing, portability-flink, portability-spark, 
> test-failure
>
> This affects portableWordCountSparkRunnerBatch, 
> portableWordCountFlinkRunnerBatch, and portableWordCountFlinkRunnerStreaming.
> 22:43:23 RuntimeError: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib/runners/flink/1.9/job-server/build/libs/beam-runners-flink-1.9-job-server-2.19.0-SNAPSHOT.jar
>  not found. Please build the server with 
> 22:43:23   cd 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib;
>  ./gradlew runners:flink:1.9:job-server:shadowJar





[jira] [Commented] (BEAM-9002) test_flatten_same_pcollections (apache_beam.transforms.ptransform_test.PTransformTest) does not work in Streaming VR suite on Dataflow

2019-12-19 Thread wendy liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000488#comment-17000488
 ] 

wendy liu commented on BEAM-9002:
-

Hi Valentyn, I'm OK with sickbaying it, but I don't think I'm the right person 
to triage it. Shall we have someone from SDK take a look?

> test_flatten_same_pcollections 
> (apache_beam.transforms.ptransform_test.PTransformTest) does not work in 
> Streaming VR suite on Dataflow
> --
>
> Key: BEAM-9002
> URL: https://issues.apache.org/jira/browse/BEAM-9002
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: wendy liu
>Priority: Major
>
> Per investigation in https://issues.apache.org/jira/browse/BEAM-8877, the 
> test times out and was recently added to VR test suite.
> [~liumomo315], I will sickbay this test for streaming, could you please help 
> triage the failure?
> Thank you!





[jira] [Commented] (BEAM-8877) beam_PostCommit_Py_VR_Dataflow is timing out

2019-12-19 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000487#comment-17000487
 ] 

Valentyn Tymofieiev commented on BEAM-8877:
---

Thanks, [~liumomo315]. I filed individual failures for the two tests: 
BEAM-9002, BEAM-9003. It would be great if you could help find the owner/next 
AIs. Thank you! 

Sent [https://github.com/apache/beam/pull/10429] to sickbay failing tests.

> beam_PostCommit_Py_VR_Dataflow is timing out
> 
>
> Key: BEAM-8877
> URL: https://issues.apache.org/jira/browse/BEAM-8877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, test-failures
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Critical
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Error:
> 06:47:45 Build timed out (after 100 minutes). Marking the build as aborted.
> 06:47:45 Build was aborted
> Log: 
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/5214/console]
>  
> Should we increase the timeout here similar to : 
> [https://github.com/apache/beam/pull/10234]
> cc: [~Ardagan]
>  





[jira] [Work logged] (BEAM-8877) beam_PostCommit_Py_VR_Dataflow is timing out

2019-12-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8877?focusedWorklogId=361497&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-361497
 ]

ASF GitHub Bot logged work on BEAM-8877:


Author: ASF GitHub Bot
Created on: 19/Dec/19 23:50
Start Date: 19/Dec/19 23:50
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #10429: [BEAM-8877] 
Sickbay VR tests that don't pass.
URL: https://github.com/apache/beam/pull/10429
 
 
   

[jira] [Created] (BEAM-9003) test_reshuffle_preserves_timestamps (apache_beam.transforms.util_test.ReshuffleTest) does not work in Streaming VR suite on Dataflow

2019-12-19 Thread Valentyn Tymofieiev (Jira)
Valentyn Tymofieiev created BEAM-9003:
-

 Summary: test_reshuffle_preserves_timestamps 
(apache_beam.transforms.util_test.ReshuffleTest) does not work in Streaming VR 
suite on Dataflow
 Key: BEAM-9003
 URL: https://issues.apache.org/jira/browse/BEAM-9003
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow, sdk-py-core
Reporter: Valentyn Tymofieiev
Assignee: wendy liu


Per investigation in https://issues.apache.org/jira/browse/BEAM-8877, the test 
times out and was recently added to VR test suite.

[~liumomo315], I will sickbay this test for streaming, could you please help 
triage the failure?

Thank you!





[jira] [Updated] (BEAM-9002) test_flatten_same_pcollections (apache_beam.transforms.ptransform_test.PTransformTest) does not work in Streaming VR suite on Dataflow

2019-12-19 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-9002:
--
Description: 
Per investigation in https://issues.apache.org/jira/browse/BEAM-8877, the test 
times out and was recently added to VR test suite.

[~liumomo315], I will sickbay this test for streaming, could you please help 
triage the failure?

Thank you!

  was:
Per investigation in https://issues.apache.org/jira/browse/BEAM-8877, the test 
times out and was recently added as to VR test suite.

[~liumomo315], I will sickbay this test for streaming, could you please help 
triage the failure?

Thank you!


> test_flatten_same_pcollections 
> (apache_beam.transforms.ptransform_test.PTransformTest) does not work in 
> Streaming VR suite on Dataflow
> --
>
> Key: BEAM-9002
> URL: https://issues.apache.org/jira/browse/BEAM-9002
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: wendy liu
>Priority: Major
>
> Per investigation in https://issues.apache.org/jira/browse/BEAM-8877, the 
> test times out and was recently added to VR test suite.
> [~liumomo315], I will sickbay this test for streaming, could you please help 
> triage the failure?
> Thank you!





[jira] [Created] (BEAM-9002) test_flatten_same_pcollections (apache_beam.transforms.ptransform_test.PTransformTest) does not work in Streaming VR suite on Dataflow

2019-12-19 Thread Valentyn Tymofieiev (Jira)
Valentyn Tymofieiev created BEAM-9002:
-

 Summary: test_flatten_same_pcollections 
(apache_beam.transforms.ptransform_test.PTransformTest) does not work in 
Streaming VR suite on Dataflow
 Key: BEAM-9002
 URL: https://issues.apache.org/jira/browse/BEAM-9002
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Valentyn Tymofieiev
Assignee: wendy liu


Per investigation in https://issues.apache.org/jira/browse/BEAM-8877, the test 
times out and was recently added as to VR test suite.

[~liumomo315], I will sickbay this test for streaming, could you please help 
triage the failure?

Thank you!





[jira] [Updated] (BEAM-8994) add goVet to goPreCommit

2019-12-19 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-8994:

Labels: easy newbie starter  (was: )

> add goVet to goPreCommit
> 
>
> Key: BEAM-8994
> URL: https://issues.apache.org/jira/browse/BEAM-8994
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Udi Meiri
>Priority: Major
>  Labels: easy, newbie, starter
>
> Running all "goVet" tasks in goPreCommit will catch missing dependencies in 
> the gogradle lock file.
> For example: https://issues.apache.org/jira/browse/BEAM-8992





[jira] [Commented] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2019-12-19 Thread Yichi Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000448#comment-17000448
 ] 

Yichi Zhang commented on BEAM-8944:
---

then yeah, it'll affect python streaming jobs (which is only on portable runner 
with fnapi).

> Python SDK harness performance degradation with UnboundedThreadPoolExecutor
> ---
>
> Key: BEAM-8944
> URL: https://issues.apache.org/jira/browse/BEAM-8944
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.18.0
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Blocker
> Fix For: 2.18.0
>
> Attachments: profiling.png, profiling_one_thread.png, 
> profiling_twelve_threads.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We are seeing a performance degradation in the Python streaming word count 
> load tests.
>  
> After some investigation, it appears to be caused by replacing the original 
> ThreadPoolExecutor with UnboundedThreadPoolExecutor in the SDK worker. The 
> suspicion is that Python performs worse with more threads on CPU-bound tasks.
>  
> A simple test comparing the performance of the different thread pool executors:
>  
> {code:python}
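> # Imports the snippet relies on; the Beam module path is an assumption.
> import queue
> import time
> from concurrent import futures
> from apache_beam.utils.thread_pool_executor import UnboundedThreadPoolExecutor
>
> def test_performance(self):
>   def run_perf(executor):
>     total_number = 100
>     q = queue.Queue()
>
>     def task(number):
>       # Small CPU-bound unit of work that also feeds the queue.
>       hash(number)
>       q.put(number + 200)
>       return number
>
>     t = time.time()
>     count = 0
>     # Seed the queue so the submitting loop does not block at the start.
>     for i in range(200):
>       q.put(i)
>     while count < total_number:
>       executor.submit(task, q.get(block=True))
>       count += 1
>     print('%s uses %s' % (executor, time.time() - t))
>
>   with UnboundedThreadPoolExecutor() as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=1) as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=12) as executor:
>     run_perf(executor)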
> {code}
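> Results (elapsed seconds, printed in the order the executors run above):
>  UnboundedThreadPoolExecutor uses 268.160675049
>  ThreadPoolExecutor(max_workers=1) uses 79.904583931
>  ThreadPoolExecutor(max_workers=12) uses 191.179054976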
> Profiling:
> UnboundedThreadPoolExecutor:
>  !profiling.png! 
> 1 Thread ThreadPoolExecutor:
>  !profiling_one_thread.png! 
> 12 Threads ThreadPoolExecutor:
>  !profiling_twelve_threads.png! 
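
A self-contained variant of the comparison, using only the standard library, may help reproduce the effect outside of Beam. It is a sketch under assumptions: `cpu_task` and `time_executor` are hypothetical names, it compares bounded pools of different sizes rather than `UnboundedThreadPoolExecutor`, and unlike the test in the issue it measures time until all tasks complete. Absolute numbers will vary by machine and Python version.

```python
import time
from concurrent import futures


def cpu_task(n):
    """Small CPU-bound unit of work (pure Python, so it holds the GIL)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_executor(max_workers, num_tasks=200, work=2000):
    """Return the seconds taken to run num_tasks CPU-bound tasks to completion."""
    start = time.perf_counter()
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(cpu_task, [work] * num_tasks))
    assert len(results) == num_tasks
    return time.perf_counter() - start


if __name__ == '__main__':
    for workers in (1, 12):
        print('max_workers=%d: %.4fs' % (workers, time_executor(workers)))
```

Because the tasks are pure Python, extra threads cannot run them in parallel under the GIL, so a larger pool mainly adds scheduling overhead.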





[jira] [Commented] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2019-12-19 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000445#comment-17000445
 ] 

Udi Meiri commented on BEAM-8944:
-

The point of my question is to figure out whether this issue is a 2.18 blocker 
or not, and how it affects users.

> Python SDK harness performance degradation with UnboundedThreadPoolExecutor
> ---
>
> Key: BEAM-8944
> URL: https://issues.apache.org/jira/browse/BEAM-8944
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.18.0
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Blocker
> Fix For: 2.18.0
>
> Attachments: profiling.png, profiling_one_thread.png, 
> profiling_twelve_threads.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We are seeing a performance degradation in the Python streaming word count 
> load tests.
>  
> After some investigation, it appears to be caused by replacing the original 
> ThreadPoolExecutor with UnboundedThreadPoolExecutor in the SDK worker. The 
> suspicion is that Python performs worse with more threads on CPU-bound tasks.
>  
> A simple test comparing the performance of the different thread pool executors:
>  
> {code:python}
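> # Imports the snippet relies on; the Beam module path is an assumption.
> import queue
> import time
> from concurrent import futures
> from apache_beam.utils.thread_pool_executor import UnboundedThreadPoolExecutor
>
> def test_performance(self):
>   def run_perf(executor):
>     total_number = 100
>     q = queue.Queue()
>
>     def task(number):
>       # Small CPU-bound unit of work that also feeds the queue.
>       hash(number)
>       q.put(number + 200)
>       return number
>
>     t = time.time()
>     count = 0
>     # Seed the queue so the submitting loop does not block at the start.
>     for i in range(200):
>       q.put(i)
>     while count < total_number:
>       executor.submit(task, q.get(block=True))
>       count += 1
>     print('%s uses %s' % (executor, time.time() - t))
>
>   with UnboundedThreadPoolExecutor() as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=1) as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=12) as executor:
>     run_perf(executor)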
> {code}
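> Results (elapsed seconds, in the order the executors are run above):
>  UnboundedThreadPoolExecutor: 268.160675049
>  ThreadPoolExecutor(max_workers=1): 79.904583931
>  ThreadPoolExecutor(max_workers=12): 191.179054976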
> Profiling:
> UnboundedThreadPoolExecutor:
>  !profiling.png! 
> 1 Thread ThreadPoolExecutor:
>  !profiling_one_thread.png! 
> 12 Threads ThreadPoolExecutor:
>  !profiling_twelve_threads.png! 





[jira] [Comment Edited] (BEAM-8695) Beam Dependency Update Request: com.google.http-client:google-http-client

2019-12-19 Thread Tomo Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000402#comment-17000402
 ] 

Tomo Suzuki edited comment on BEAM-8695 at 12/19/19 10:03 PM:
--

Failures on 
{{org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest}} 
seem to be related to handling equality with maps.

 !LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png! 

The assertion fails because the actual value is a HashMap and {{GenericData.equals}} 
checks the other object's class in google-http-client 1.34.0:

{code:java}
// GenericData equals
...
if (o == null || !(o instanceof GenericData)) {
  return false;
}
{code}

This {{equals}} method was added in February 
([PR#589|https://github.com/googleapis/google-http-java-client/pull/589]) and 
released in v1.29.0.

Initially I thought of wrapping the expected {{CloudObject}} value with 
{{ImmutableMap.copyOf}}, but that does not solve equality on its elements.

 !image-2019-12-19-17-03-44-487.png! 
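
The effect of such a class check can be reproduced with plain JDK collections. The following is only a minimal sketch: {{StrictMap}} is a hypothetical stand-in written for illustration, not the actual {{GenericData}} implementation, and the field name and value are made up. It shows why an assertion that calls {{expected.equals(actual)}} fails when the actual value is a HashMap with the same entries, even though the comparison in the other direction succeeds.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class EqualsAsymmetryDemo {

  /** Hypothetical map type whose equals() rejects every other Map class. */
  static class StrictMap extends AbstractMap<String, Object> {
    private final Map<String, Object> fields = new HashMap<>();

    @Override
    public Object put(String key, Object value) {
      return fields.put(key, value);
    }

    @Override
    public Set<Map.Entry<String, Object>> entrySet() {
      return fields.entrySet();
    }

    @Override
    public boolean equals(Object o) {
      // Same shape as the check quoted above: a non-StrictMap is never equal,
      // even when it holds exactly the same entries.
      if (o == null || !(o instanceof StrictMap)) {
        return false;
      }
      return super.equals(o);
    }

    @Override
    public int hashCode() {
      return super.hashCode();
    }
  }

  public static void main(String[] args) {
    StrictMap expected = new StrictMap();
    expected.put("kind", "SUM");

    Map<String, Object> actual = new HashMap<>();
    actual.put("kind", "SUM");

    // The strict class check rejects the HashMap.
    System.out.println(expected.equals(actual)); // false
    // HashMap only compares sizes and entries, so this direction passes.
    System.out.println(actual.equals(expected)); // true
  }
}
```

Because JUnit's {{assertEquals(expected, actual)}} calls {{equals}} on the expected object, which side holds the strict type decides whether the assertion passes.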




was (Author: suztomo):
Failures on 
{{org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest}} 
seems to be related to handling equality with maps.

 !LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png! 

The assertion fails because the actual is HashMap and {{GenericData.equals}} 
checks other object's class in google-http-client 1.34.0:

{code:java}
// GenericData equals
...
if (o == null || !(o instanceof GenericData)) {
  return false;
}
{code}

This {{equals}} method has been added in February 
([PR#589|https://github.com/googleapis/google-http-java-client/pull/589]) and 
released as v1.29.0.


> Beam Dependency Update Request: com.google.http-client:google-http-client
> -
>
> Key: BEAM-8695
> URL: https://issues.apache.org/jira/browse/BEAM-8695
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
> Attachments: 
> LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png, 
> image-2019-12-19-17-03-44-487.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:40:13.570557 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:06:20.477284 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:12:12.146269 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:11:24.693912 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Commented] (BEAM-8877) beam_PostCommit_Py_VR_Dataflow is timing out

2019-12-19 Thread wendy liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000444#comment-17000444
 ] 

wendy liu commented on BEAM-8877:
-

I think test_flatten_same_pcollections should be supported by both 
Streaming and Batch. I'm not sure if this is a bug.

I don't have a good sense of the other test, test_reshuffle_preserves_timestamps, 
though. I will let Robert comment.

> beam_PostCommit_Py_VR_Dataflow is timing out
> 
>
> Key: BEAM-8877
> URL: https://issues.apache.org/jira/browse/BEAM-8877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, test-failures
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Critical
>
> Error:
> 06:47:45 Build timed out (after 100 minutes). Marking the build as aborted.
> 06:47:45 Build was aborted
> Log: 
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/5214/console]
>  
> Should we increase the timeout here similar to : 
> [https://github.com/apache/beam/pull/10234]
> cc: [~Ardagan]
>  





[jira] [Commented] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2019-12-19 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000443#comment-17000443
 ] 

Udi Meiri commented on BEAM-8944:
-

By current I don't mean the current Beam release but the current 
production-ready runners (for example IIUC portability on Dataflow is not 
production ready).

> Python SDK harness performance degradation with UnboundedThreadPoolExecutor
> ---
>
> Key: BEAM-8944
> URL: https://issues.apache.org/jira/browse/BEAM-8944
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.18.0
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Blocker
> Fix For: 2.18.0
>
> Attachments: profiling.png, profiling_one_thread.png, 
> profiling_twelve_threads.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We are seeing a performance degradation for python streaming word count load 
> tests.
>  
> After some investigation, it appears to be caused by replacing the original 
> ThreadPoolExecutor with UnboundedThreadPoolExecutor in the SDK worker. The suspicion is 
> that Python performance is worse with more threads on CPU-bound tasks.
>  
> A simple test comparing the performance of the different thread pool executors:
>  
> {code:python}
> def test_performance(self):
>   def run_perf(executor):
>     total_number = 100
>     q = queue.Queue()
>
>     # CPU-bound task that also puts a new item on the queue.
>     def task(number):
>       hash(number)
>       q.put(number + 200)
>       return number
>
>     t = time.time()
>     count = 0
>     for i in range(200):
>       q.put(i)
>     while count < total_number:
>       executor.submit(task, q.get(block=True))
>       count += 1
>     print('%s uses %s' % (executor, time.time() - t))
>
>   with UnboundedThreadPoolExecutor() as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=1) as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=12) as executor:
>     run_perf(executor)
> {code}
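> Results (elapsed seconds, in the order the executors are run above):
>  UnboundedThreadPoolExecutor: 268.160675049
>  ThreadPoolExecutor(max_workers=1): 79.904583931
>  ThreadPoolExecutor(max_workers=12): 191.179054976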
> Profiling:
> UnboundedThreadPoolExecutor:
>  !profiling.png! 
> 1 Thread ThreadPoolExecutor:
>  !profiling_one_thread.png! 
> 12 Threads ThreadPoolExecutor:
>  !profiling_twelve_threads.png! 





[jira] [Comment Edited] (BEAM-8695) Beam Dependency Update Request: com.google.http-client:google-http-client

2019-12-19 Thread Tomo Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000402#comment-17000402
 ] 

Tomo Suzuki edited comment on BEAM-8695 at 12/19/19 9:37 PM:
-

Failures on 
{{org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest}} 
seem to be related to handling equality with maps.

 !LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png! 

The assertion fails because the actual value is a HashMap and {{GenericData.equals}} 
checks the other object's class in google-http-client 1.34.0:

{code:java}
// GenericData equals
...
if (o == null || !(o instanceof GenericData)) {
  return false;
}
{code}

This {{equals}} method was added in February 
([PR#589|https://github.com/googleapis/google-http-java-client/pull/589]) and 
released in v1.29.0.



was (Author: suztomo):
Failures on 
{{org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest}} 
seems to be related to handling equality with maps.

 !LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png! 

The assertion fails because the actual is HashMap and {{GenericData.equals}} 
checks other object's class in google-http-client 1.34.0:

{code:java}
// GenericData equals
...
if (o == null || !(o instanceof GenericData)) {
  return false;
}
{code}



> Beam Dependency Update Request: com.google.http-client:google-http-client
> -
>
> Key: BEAM-8695
> URL: https://issues.apache.org/jira/browse/BEAM-8695
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
> Attachments: 
> LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:40:13.570557 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:06:20.477284 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:12:12.146269 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:11:24.693912 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Comment Edited] (BEAM-8695) Beam Dependency Update Request: com.google.http-client:google-http-client

2019-12-19 Thread Tomo Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000402#comment-17000402
 ] 

Tomo Suzuki edited comment on BEAM-8695 at 12/19/19 9:25 PM:
-

Failures on 
{{org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest}} 
seem to be related to handling equality with maps.

 !LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png! 

The assertion fails because the actual value is a HashMap and {{GenericData.equals}} 
checks the other object's class in google-http-client 1.34.0:

{code:java}
// GenericData equals
...
if (o == null || !(o instanceof GenericData)) {
  return false;
}
{code}




was (Author: suztomo):
Failures on 
{{org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest}} 
seems to be related to handling equality with maps.

 !LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png! 

> Beam Dependency Update Request: com.google.http-client:google-http-client
> -
>
> Key: BEAM-8695
> URL: https://issues.apache.org/jira/browse/BEAM-8695
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
> Attachments: 
> LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:40:13.570557 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:06:20.477284 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:12:12.146269 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:11:24.693912 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Commented] (BEAM-8695) Beam Dependency Update Request: com.google.http-client:google-http-client

2019-12-19 Thread Tomo Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000402#comment-17000402
 ] 

Tomo Suzuki commented on BEAM-8695:
---

Failures on 
{{org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest}} 
seem to be related to handling equality with maps.

 !LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png! 

> Beam Dependency Update Request: com.google.http-client:google-http-client
> -
>
> Key: BEAM-8695
> URL: https://issues.apache.org/jira/browse/BEAM-8695
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
> Attachments: 
> LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:40:13.570557 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:06:20.477284 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:12:12.146269 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:11:24.693912 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Updated] (BEAM-8695) Beam Dependency Update Request: com.google.http-client:google-http-client

2019-12-19 Thread Tomo Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomo Suzuki updated BEAM-8695:
--
Attachment: LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png

> Beam Dependency Update Request: com.google.http-client:google-http-client
> -
>
> Key: BEAM-8695
> URL: https://issues.apache.org/jira/browse/BEAM-8695
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
> Attachments: 
> LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders.png
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:40:13.570557 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:06:20.477284 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:12:12.146269 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:11:24.693912 
> -
> Please consider upgrading the dependency 
> com.google.http-client:google-http-client. 
> The current version is 1.28.0. The latest version is 1.33.0 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Created] (BEAM-9001) Allow setting environment ID to all transforms in the SDK

2019-12-19 Thread Chamikara Madhusanka Jayalath (Jira)
Chamikara Madhusanka Jayalath created BEAM-9001:
---

 Summary: Allow setting environment ID to all transforms in the SDK
 Key: BEAM-9001
 URL: https://issues.apache.org/jira/browse/BEAM-9001
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core, sdk-java-harness, sdk-py-core, 
sdk-py-harness
Reporter: Chamikara Madhusanka Jayalath


Currently, Beam SDKs set the environment in a known set of transforms and do not 
set it in others. Runners expect certain transforms not to resolve to an 
environment.

It might be cleaner to set the environment in all transforms by default (in the 
SDKs) and allow runners to override this for transforms that are natively 
implemented in the corresponding runners.





[jira] [Comment Edited] (BEAM-8695) Beam Dependency Update Request: com.google.http-client:google-http-client

2019-12-19 Thread Tomo Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999527#comment-16999527
 ] 

Tomo Suzuki edited comment on BEAM-8695 at 12/19/19 9:11 PM:
-

https://builds.apache.org/job/beam_PreCommit_Java_Commit/9288/#showFailuresLink 
{noformat}
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest.testWatermarkEmission[numTasks
 = 2; numSplits=2]
org.apache.beam.runners.dataflow.worker.fn.control.ElementCountMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.ElementCountMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.MSecMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidMSecMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.MSecMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidMSecMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.MeanByteCountMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.MeanByteCountMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.UserDistributionMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidUserMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.UserDistributionMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidUserMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.UserMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidUserMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.fn.control.UserMonitoringInfoToCounterUpdateTransformerTest.testTransformReturnsValidCounterUpdateWhenValidUserMonitoringInfoReceived
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixForInstructionOutputNodeWithGrpcNodeSuccessor
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixForLengthPrefixCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixForSideInputInfos
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixParDoInstructionCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixInstructionOutputCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixWriteInstructionCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixAndReplaceUnknownCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixAndReplaceForRunnerNetwork
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixForInstructionOutputNodeWithGrpcNodePredecessor
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixReadInstructionCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixUnknownCoders
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixForInstructionOutputNodeWithGrpcNodeSuccessor
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixForLengthPrefixCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixForSideInputInfos
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixParDoInstructionCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixInstructionOutputCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixWriteInstructionCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixAndReplaceUnknownCoder
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixAndReplaceForRunnerNetwork
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixForInstructionOutputNodeWithGrpcNodePredecessor
org.apache.beam.runners.dataflow.worker.graph.LengthPrefixUnknownCodersTest.testLengthPrefixReadInstructionCoder
org.apache.beam.sdk.io.gcp.bigquery.BigQueryIOReadT

[jira] [Updated] (BEAM-8974) apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info is flaky

2019-12-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8974:
---
Fix Version/s: 2.18.0

> apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info
>  is flaky
> 
>
> Key: BEAM-8974
> URL: https://issues.apache.org/jira/browse/BEAM-8974
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.18.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The test is failing at apache_beam/runners/worker/log_handler_test.py:110: 
> IndexError
> Added in https://github.com/apache/beam/pull/10292
> Sample job: [https://builds.apache.org/job/beam_PreCommit_Python_Cron/2160/]
> Console logs:
>  {noformat}
> 06:37:37 === FAILURES 
> ===
> 06:37:37 ___ FnApiLogRecordHandlerTest.test_exc_info 
> 
> 06:37:37 [gw1] linux2 -- Python 2.7.12 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/target/.tox-py27-gcp-pytest/py27-gcp-pytest/bin/python
> 06:37:37
> 06:37:37 self = 
>  testMethod=test_exc_info>
> 06:37:37
> 06:37:37 def test_exc_info(self):
> 06:37:37   try:
> 06:37:37 raise ValueError('some message')
> 06:37:37   except ValueError:
> 06:37:37 _LOGGER.error('some error', exc_info=True)
> 06:37:37
> 06:37:37   self.fn_log_handler.close()
> 06:37:37
> 06:37:37 > log_entry = 
> self.test_logging_service.log_records_received[0].log_entries[0]
> 06:37:37 E IndexError: list index out of range
> 06:37:37
> 06:37:37 apache_beam/runners/worker/log_handler_test.py:110: IndexError
> 06:37:37 - Captured stderr call 
> -
> 06:37:37 ERROR:apache_beam.runners.worker.log_handler_test:some error
> 06:37:37 Traceback (most recent call last):
> 06:37:37   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> 06:37:37 raise ValueError('some message')
> 06:37:37 ValueError: some message
> 06:37:37 -- Captured log call 
> ---
> 06:37:37 ERROR
> apache_beam.runners.worker.log_handler_test:log_handler_test.py:106 some error
> 06:37:37 Traceback (most recent call last):
> 06:37:37   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> 06:37:37 raise ValueError('some message')
> 06:37:37 ValueError: some message
> {noformat}





[jira] [Commented] (BEAM-8974) apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info is flaky

2019-12-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-8974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000368#comment-17000368
 ] 

Ismaël Mejía commented on BEAM-8974:


If I removed the tag, it was probably because I mistakenly thought this was part of 
master (aka 2.19.0). I was not aware of the cherry-pick, so putting the version 
back looks like the right thing to do.

> apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info
>  is flaky
> 
>
> Key: BEAM-8974
> URL: https://issues.apache.org/jira/browse/BEAM-8974
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The test is failing at apache_beam/runners/worker/log_handler_test.py:110: 
> IndexError
> Added in https://github.com/apache/beam/pull/10292
> Sample job: [https://builds.apache.org/job/beam_PreCommit_Python_Cron/2160/]
> Console logs:
>  {noformat}
> 06:37:37 === FAILURES 
> ===
> 06:37:37 ___ FnApiLogRecordHandlerTest.test_exc_info 
> 
> 06:37:37 [gw1] linux2 -- Python 2.7.12 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/target/.tox-py27-gcp-pytest/py27-gcp-pytest/bin/python
> 06:37:37
> 06:37:37 self = 
>  testMethod=test_exc_info>
> 06:37:37
> 06:37:37 def test_exc_info(self):
> 06:37:37   try:
> 06:37:37 raise ValueError('some message')
> 06:37:37   except ValueError:
> 06:37:37 _LOGGER.error('some error', exc_info=True)
> 06:37:37
> 06:37:37   self.fn_log_handler.close()
> 06:37:37
> 06:37:37 > log_entry = 
> self.test_logging_service.log_records_received[0].log_entries[0]
> 06:37:37 E IndexError: list index out of range
> 06:37:37
> 06:37:37 apache_beam/runners/worker/log_handler_test.py:110: IndexError
> 06:37:37 - Captured stderr call 
> -
> 06:37:37 ERROR:apache_beam.runners.worker.log_handler_test:some error
> 06:37:37 Traceback (most recent call last):
> 06:37:37   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> 06:37:37 raise ValueError('some message')
> 06:37:37 ValueError: some message
> 06:37:37 -- Captured log call 
> ---
> 06:37:37 ERROR
> apache_beam.runners.worker.log_handler_test:log_handler_test.py:106 some error
> 06:37:37 Traceback (most recent call last):
> 06:37:37   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> 06:37:37 raise ValueError('some message')
> 06:37:37 ValueError: some message
> {noformat}





[jira] [Updated] (BEAM-9000) Java Test Assertions without toString for GenericJson subclasses

2019-12-19 Thread Tomo Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomo Suzuki updated BEAM-9000:
--
Description: 
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
    "{cumulative=true, integer={highBits=0, lowBits=0}, "
        + "nameAndKind={kind=SUM, "
        + "name=transformedValue-ElementCount}}",
    result.toString());
{code}
This style is prone to unnecessary maintenance of the test code when upgrading 
dependencies. A dependency may change the internal ordering of fields or make 
another trivial change to {{toString()}}. In BEAM-8695, where I tried to upgrade 
google-http-client, there were ~30 comparison failures due to these {{toString}} 
assertions.

The asserted objects are subclasses of {{com.google.api.client.json.GenericJson}}.

There are several options to improve these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}

Credit: Ben Whitehead.

h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that instance of GenericJson can be instantiated through 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go to shared test utility class (sdks/testing?)
{code:java}
  private static final JacksonFactory jacksonFactory = JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  // Parses JSON text into an instance of the given class.
  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}

A feature request to handle escaping double quotes via JacksonFactory: 
https://github.com/googleapis/google-http-java-client/issues/923
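
The ordering problem behind both options is not specific to Java. As a rough, language-neutral illustration (a Python sketch, not Beam or google-http-client code; the two dictionaries are made-up stand-ins for the same object before and after a dependency upgrade), a string comparison breaks when field order changes, while a structural comparison or a comparison against parsed JSON does not:

```python
import json

# Same content, different insertion order (hypothetical data modelled on the
# CounterUpdate example in the description).
before_upgrade = {'cumulative': True, 'integer': {'highBits': 0, 'lowBits': 0}}
after_upgrade = {'integer': {'lowBits': 0, 'highBits': 0}, 'cumulative': True}

# A string comparison is sensitive to field ordering...
strings_match = str(before_upgrade) == str(after_upgrade)

# ...while a structural comparison is not (the idea behind Option 1).
maps_match = before_upgrade == after_upgrade

# Parsing the expected value from JSON text first (the idea behind Option 2).
expected = json.loads(
    '{"cumulative": true, "integer": {"highBits": 0, "lowBits": 0}}')
json_match = expected == after_upgrade

print(strings_match, maps_match, json_match)
```

The sketch relies on Python dictionaries keeping insertion order in their string form (Python 3.7 and later), which is what makes the string comparison fail here.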


  was:
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style causes unnecessary maintenance of the test code when upgrading 
dependencies: a dependency may change the internal ordering of fields, or make 
some other trivial change to {{toString()}}. In BEAM-8695, where I tried to 
upgrade google-http-client, there were many comparison failures due to these 
{{toString()}} assertions.

The objects under test are subclasses of {{com.google.api.client.json.GenericJson}}.

There are several options to improve these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}

Credit: Ben Whitehead.

h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that instance of GenericJson can be instantiated through 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go to shared test utility class (sdks/testing?)
{code:java}
  private static final JacksonFactory jacksonFactory = JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  // Parses JSON text into an instance of the given class.
  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);

[jira] [Updated] (BEAM-9000) Java Test Assertions without toString for GenericJson subclasses

2019-12-19 Thread Tomo Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomo Suzuki updated BEAM-9000:
--
Description: 
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style causes unnecessary maintenance of the test code when upgrading 
dependencies: a dependency may change the internal ordering of fields, or make 
some other trivial change to {{toString()}}. In BEAM-8695, where I tried to 
upgrade google-http-client, there were many comparison failures due to these 
{{toString()}} assertions.

The objects under test are subclasses of {{com.google.api.client.json.GenericJson}}.

There are several options to improve these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}

Credit: Ben Whitehead.

h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that instance of GenericJson can be instantiated through 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go to shared test utility class (sdks/testing?)
{code:java}
  private static final JacksonFactory jacksonFactory = JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  // Parses JSON text into an instance of the given class.
  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}

A feature request to handle escaping double quotes via JacksonFactory: 
https://github.com/googleapis/google-http-java-client/issues/923


  was:
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style causes unnecessary maintenance of the test code when upgrading 
dependencies: a dependency may change the internal ordering of fields, or make 
some other trivial change to {{toString()}}. In BEAM-8695, where I tried to 
upgrade google-http-client, there were many comparison failures due to these 
{{toString()}} assertions.

The objects under test are subclasses of {{com.google.api.client.json.GenericJson}}.

There are several options to improve these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}

Credit: Ben Whitehead.

h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that instance of GenericJson can be instantiated through 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go to shared test utility class (sdks/testing?)
{code:java}
  private static final JacksonFactory jacksonFactory = JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  // Parses JSON text into an instance of the given class.
  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);

[jira] [Updated] (BEAM-9000) Java Test Assertions without toString for GenericJson subclasses

2019-12-19 Thread Tomo Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomo Suzuki updated BEAM-9000:
--
Description: 
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style causes unnecessary maintenance of the test code when upgrading 
dependencies: a dependency may change the internal ordering of fields, or make 
some other trivial change to {{toString()}}. In BEAM-8695, where I tried to 
upgrade google-http-client, there were many comparison failures due to these 
{{toString()}} assertions.

The objects under test are subclasses of {{com.google.api.client.json.GenericJson}}.

There are several options to improve these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}

Credit: Ben Whitehead.

h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that instance of GenericJson can be instantiated through 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go to shared test utility class (sdks/testing?)
{code:java}
  private static final JacksonFactory jacksonFactory = JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  // Parses JSON text into an instance of the given class.
  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}

  was:
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style causes unnecessary maintenance of the test code when upgrading 
dependencies: a dependency may change the internal ordering of fields, or make 
some other trivial change to {{toString()}}. In BEAM-8695, where I tried to 
upgrade google-http-client, there were many comparison failures due to these 
{{toString()}} assertions.

The objects under test are subclasses of {{com.google.api.client.json.GenericJson}}.

There are several options to improve these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}
h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that instance of GenericJson can be instantiated through 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go to shared test utility class (sdks/testing?)
{code:java}
  private static final JacksonFactory jacksonFactory = JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  // Parses JSON text into an instance of the given class.
  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}


> Java Test Assertions wit

[jira] [Commented] (BEAM-8195) Quota exceeded for create requests

2019-12-19 Thread Alan Myrvold (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000346#comment-17000346
 ] 

Alan Myrvold commented on BEAM-8195:


Also, the quota for job creation requests per minute per user has been increased 
to 240 for apache-beam-testing.

> Quota exceeded for create requests
> --
>
> Key: BEAM-8195
> URL: https://issues.apache.org/jira/browse/BEAM-8195
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Critical
>
> Post commits failed with the following error:
> HttpError accessing 
> :
>  response: <{'server': 'ESF', '-content-encoding': 'gzip', 'content-type': 
> 'application/json; charset=UTF-8', 'content-length': '598', 
> 'transfer-encoding': 'chunked', 'cache-control': 'private', 
> 'x-xss-protection': '0', 'date': 'Tue, 10 Sep 2019 12:02:24 GMT', 'vary': 
> 'Origin, X-Origin, Referer', 'x-frame-options': 'SAMEORIGIN', 'status': 
> '429', 'x-content-type-options': 'nosniff'}>, content <{
>   "error": {
> "code": 429,
> "message": "Quota exceeded for quota metric 
> 'dataflow.googleapis.com/create_requests' and limit 
> 'CreateRequestsPerMinutePerUser' of service 'dataflow.googleapis.com' for 
> consumer 'project_number:844138762903'.",
> "status": "RESOURCE_EXHAUSTED",
> "details": [
>   {
> Could we increase the quota?
> /cc [~alanmyrvold] [~kenn]





[jira] [Work logged] (BEAM-8960) Add an option for user to be able to opt out of using insert id for BigQuery streaming insert.

2019-12-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8960?focusedWorklogId=361485&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-361485
 ]

ASF GitHub Bot logged work on BEAM-8960:


Author: ASF GitHub Bot
Created on: 19/Dec/19 20:24
Start Date: 19/Dec/19 20:24
Worklog Time Spent: 10m 
  Work Description: yirutang commented on pull request #10427: [BEAM-8960]: 
Add an option for user to opt out of using insert id for BigQuery streaming 
insert.
URL: https://github.com/apache/beam/pull/10427
 
 
   Expose an option so that users can opt out of using insert IDs while streaming 
into BigQuery. Insert IDs only guarantee best-effort deduplication of inserted 
rows; without them, users will be able to opt into the new streaming backend, 
which has higher quotas and better reliability.
   
   

[jira] [Commented] (BEAM-8195) Quota exceeded for create requests

2019-12-19 Thread Alan Myrvold (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000345#comment-17000345
 ] 

Alan Myrvold commented on BEAM-8195:


Yes, dataflow_v1b3_client.py is auto-generated, so it is best not to change it there.

num_retries can be set after constructing the client.

After 
https://github.com/apache/beam/blob/26f6dd58b9fe608476ccc33601b2e26fc0343080/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py#L465

self._client = dataflow.DataflowV1b3(...)
self._client.num_retries = 0
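
As a minimal sketch of that pattern (the classes below are hypothetical stand-ins, not the real `DataflowV1b3` client or Beam's `apiclient.py`, and the default of 5 is an assumption made for illustration), the generated class is left untouched and the attribute is overridden on the instance in the hand-written wrapper:

```python
class StubGeneratedClient(object):
    """Hypothetical stand-in for an auto-generated API client class."""

    def __init__(self, url):
        self.url = url
        self.num_retries = 5  # default set by the generated code (assumed value)


class ApiClientWrapper(object):
    """Hand-written wrapper; this is the layer that is safe to edit."""

    def __init__(self, url):
        self._client = StubGeneratedClient(url)
        # Override the default on the instance rather than in generated code.
        self._client.num_retries = 0


wrapper = ApiClientWrapper('https://example.invalid/api')
print(wrapper._client.num_retries)
```

Regenerating the client class would then not undo the setting, because the override lives in the wrapper.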

> Quota exceeded for create requests
> --
>
> Key: BEAM-8195
> URL: https://issues.apache.org/jira/browse/BEAM-8195
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Critical
>
> Post commits failed with the following error:
> HttpError accessing 
> :
>  response: <{'server': 'ESF', '-content-encoding': 'gzip', 'content-type': 
> 'application/json; charset=UTF-8', 'content-length': '598', 
> 'transfer-encoding': 'chunked', 'cache-control': 'private', 
> 'x-xss-protection': '0', 'date': 'Tue, 10 Sep 2019 12:02:24 GMT', 'vary': 
> 'Origin, X-Origin, Referer', 'x-frame-options': 'SAMEORIGIN', 'status': 
> '429', 'x-content-type-options': 'nosniff'}>, content <{
>   "error": {
> "code": 429,
> "message": "Quota exceeded for quota metric 
> 'dataflow.googleapis.com/create_requests' and limit 
> 'CreateRequestsPerMinutePerUser' of service 'dataflow.googleapis.com' for 
> consumer 'project_number:844138762903'.",
> "status": "RESOURCE_EXHAUSTED",
> "details": [
>   {
> Could we increase the quota?
> /cc [~alanmyrvold] [~kenn]





[jira] [Comment Edited] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2019-12-19 Thread Yichi Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000341#comment-17000341
 ] 

Yichi Zhang edited comment on BEAM-8944 at 12/19/19 8:14 PM:
-

This bug doesn't affect current production runners, since the change to use 
more threads in the SDK harness doesn't exist in any released Beam version 
(the Dataflow runner issue mentioned in the #10387 TODO does affect current 
production runners, but it has limited impact with this fix and will be 
investigated later).


was (Author: yichi):
This bug doesn't affect current production runners, since the change to use 
more threads in the SDK harness doesn't exist in any released Beam version 
(the Dataflow runner issue mentioned in the #10387 TODO does affect current 
production runners, but it has limited impact and will be investigated later).

> Python SDK harness performance degradation with UnboundedThreadPoolExecutor
> ---
>
> Key: BEAM-8944
> URL: https://issues.apache.org/jira/browse/BEAM-8944
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.18.0
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Blocker
> Fix For: 2.18.0
>
> Attachments: profiling.png, profiling_one_thread.png, 
> profiling_twelve_threads.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We are seeing a performance degradation in the Python streaming word count load 
> tests.
>
> After some investigation, it appears to be caused by replacing the original 
> ThreadPoolExecutor with UnboundedThreadPoolExecutor in the SDK worker. The 
> suspicion is that Python performs worse with more threads on CPU-bound tasks.
>
> A simple test comparing the performance of the different thread pool executors:
>  
> {code:python}
> def test_performance(self):
>   def run_perf(executor):
>     total_number = 100
>     q = queue.Queue()
>
>     def task(number):
>       hash(number)
>       q.put(number + 200)
>       return number
>
>     t = time.time()
>     count = 0
>     for i in range(200):
>       q.put(i)
>     while count < total_number:
>       executor.submit(task, q.get(block=True))
>       count += 1
>     print('%s uses %s' % (executor, time.time() - t))
>
>   with UnboundedThreadPoolExecutor() as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=1) as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=12) as executor:
>     run_perf(executor)
> {code}
> Results:
>  0x7fab400dbe50> uses 268.160675049
>   uses 79.904583931
>   uses 191.179054976
> Profiling:
> UnboundedThreadPoolExecutor:
>  !profiling.png! 
> 1 Thread ThreadPoolExecutor:
>  !profiling_one_thread.png! 
> 12 Threads ThreadPoolExecutor:
>  !profiling_twelve_threads.png! 
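
For anyone who wants to try the one-thread versus twelve-thread comparison without a Beam checkout, here is a rough standard-library sketch. It is not the test from the issue: it does not include Beam's UnboundedThreadPoolExecutor, and `cpu_task`, `time_executor` and the task sizes are made up for illustration. Timings will vary by machine and Python version, so no expected numbers are given.

```python
import time
from concurrent import futures


def cpu_task(n):
    # Pure-Python arithmetic, so the work is CPU-bound rather than I/O-bound.
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_executor(max_workers, num_tasks=200, work=2000):
    """Returns (elapsed_seconds, results) for num_tasks CPU-bound tasks."""
    start = time.time()
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(cpu_task, [work] * num_tasks))
    return time.time() - start, results


if __name__ == '__main__':
    for workers in (1, 12):
        elapsed, _ = time_executor(workers)
        print('max_workers=%d took %.3fs' % (workers, elapsed))
```

Raising `num_tasks` or `work` makes any difference between the two pool sizes easier to see.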





[jira] [Commented] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2019-12-19 Thread Yichi Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000341#comment-17000341
 ] 

Yichi Zhang commented on BEAM-8944:
---

This bug doesn't affect current production runners, since the change to use 
more threads in the SDK harness doesn't exist in any released Beam version 
(the Dataflow runner issue mentioned in the #10387 TODO does affect current 
production runners, but it has limited impact and will be investigated later).

> Python SDK harness performance degradation with UnboundedThreadPoolExecutor
> ---
>
> Key: BEAM-8944
> URL: https://issues.apache.org/jira/browse/BEAM-8944
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.18.0
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Blocker
> Fix For: 2.18.0
>
> Attachments: profiling.png, profiling_one_thread.png, 
> profiling_twelve_threads.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We are seeing a performance degradation in the Python streaming word count load 
> tests.
>
> After some investigation, it appears to be caused by replacing the original 
> ThreadPoolExecutor with UnboundedThreadPoolExecutor in the SDK worker. The 
> suspicion is that Python performs worse with more threads on CPU-bound tasks.
>
> A simple test comparing the performance of the different thread pool executors:
>  
> {code:python}
> def test_performance(self):
>   def run_perf(executor):
>     total_number = 100
>     q = queue.Queue()
>
>     def task(number):
>       hash(number)
>       q.put(number + 200)
>       return number
>
>     t = time.time()
>     count = 0
>     for i in range(200):
>       q.put(i)
>     while count < total_number:
>       executor.submit(task, q.get(block=True))
>       count += 1
>     print('%s uses %s' % (executor, time.time() - t))
>
>   with UnboundedThreadPoolExecutor() as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=1) as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=12) as executor:
>     run_perf(executor)
> {code}
> Results:
>  0x7fab400dbe50> uses 268.160675049
>   uses 79.904583931
>   uses 191.179054976
> Profiling:
> UnboundedThreadPoolExecutor:
>  !profiling.png! 
> 1 Thread ThreadPoolExecutor:
>  !profiling_one_thread.png! 
> 12 Threads ThreadPoolExecutor:
>  !profiling_twelve_threads.png! 





[jira] [Commented] (BEAM-8877) beam_PostCommit_Py_VR_Dataflow is timing out

2019-12-19 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000340#comment-17000340
 ] 

Valentyn Tymofieiev commented on BEAM-8877:
---

Both tests have recently been marked as ValidatesRunner but were not previously 
marked so:

[https://github.com/apache/beam/commit/eb81905525719101861ff46657128d79d0115dc2]

[https://github.com/apache/beam/commit/d8e43af4c4b8f471dc980144d205f82aaeca1834]

I think these VR pipelines are simply not supported on Dataflow in streaming 
mode (which uses fnapi).

CC: [~liumomo315] [~robertwb] 

> beam_PostCommit_Py_VR_Dataflow is timing out
> 
>
> Key: BEAM-8877
> URL: https://issues.apache.org/jira/browse/BEAM-8877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, test-failures
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Critical
>
> Error:
> 06:47:45 Build timed out (after 100 minutes). Marking the build as aborted.
> 06:47:45 Build was aborted
> Log: 
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/5214/console]
>  
> Should we increase the timeout here similar to : 
> [https://github.com/apache/beam/pull/10234]
> cc: [~Ardagan]
>  





[jira] [Commented] (BEAM-8825) OOM when writing large numbers of 'narrow' rows

2019-12-19 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000338#comment-17000338
 ] 

Ahmet Altay commented on BEAM-8825:
---

Could this issue be closed?

> OOM when writing large numbers of 'narrow' rows
> ---
>
> Key: BEAM-8825
> URL: https://issues.apache.org/jira/browse/BEAM-8825
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.9.0, 2.10.0, 2.11.0, 2.12.0, 2.13.0, 2.14.0, 2.15.0, 
> 2.16.0, 2.17.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
> Fix For: 2.18.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> SpannerIO can OOM when writing large numbers of 'narrow' rows. 
>  
> SpannerIO puts  input mutation elements into batches for efficient writing.
> These batches are limited by number of cells mutated, and size of data 
> written (5000 cells, 1MB data). SpannerIO groups enough mutations to build 
> 1000 of these groups (5M cells, 1GB data), then sorts and batches them.
> When the number of cells and size of data is very small (<5 cells, <100 
> bytes), the memory overhead of storing millions of mutations for batching is 
> significant, and can lead to OOMs.
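
The arithmetic in the description can be checked with a small sketch. This is a simplified model based only on the numbers quoted above, not SpannerIO's actual batching code; the constant and function names are made up, and "1MB" is read as 1,000,000 bytes so that 1000 batches come to the quoted 1GB.

```python
CELLS_PER_BATCH = 5000        # cell limit per batch, from the description
BYTES_PER_BATCH = 1000 * 1000  # "1MB", read here as decimal megabytes
BATCHES_GROUPED = 1000         # batches gathered before sorting


def buffered_mutations(cells_per_mutation, bytes_per_mutation):
    """Estimates how many mutations are held in memory before batching.

    Buffering stops at whichever limit is reached first: the total cell
    count (5M) or the total data size (1GB).
    """
    max_cells = CELLS_PER_BATCH * BATCHES_GROUPED
    max_bytes = BYTES_PER_BATCH * BATCHES_GROUPED
    by_cells = max_cells // cells_per_mutation
    by_bytes = max_bytes // bytes_per_mutation
    return min(by_cells, by_bytes)


# A 'narrow' row: 4 cells and 80 bytes per mutation (illustrative values).
narrow = buffered_mutations(4, 80)
# A wider row for comparison: 50 cells and 10,000 bytes per mutation.
wide = buffered_mutations(50, 10000)
print(narrow, wide)
```

Under these assumptions the narrow case buffers over a million mutation objects, so per-object overhead rather than payload size dominates memory use.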





[jira] [Commented] (BEAM-8877) beam_PostCommit_Py_VR_Dataflow is timing out

2019-12-19 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000335#comment-17000335
 ] 

Valentyn Tymofieiev commented on BEAM-8877:
---

 

I ran gradlew :sdks:python:test-suites:dataflow:py2:validatesRunnerStreamingTests 
locally and got the following errors:

TimedOutException: 'test_reshuffle_preserves_timestamps 
(apache_beam.transforms.util_test.ReshuffleTest)'

TimedOutException: 'test_flatten_same_pcollections 
(apache_beam.transforms.ptransform_test.PTransformTest)'

BUILD FAILED in 1h 31m 2s

There is also a nosetests-validatesRunnerStreamingTests-df.xml file with useful 
output, which should make its way to the Jenkins console if we tune the timeouts 
so that we hit test timeouts instead of Jenkins timeouts.

> beam_PostCommit_Py_VR_Dataflow is timing out
> 
>
> Key: BEAM-8877
> URL: https://issues.apache.org/jira/browse/BEAM-8877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, test-failures
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Critical
>
> Error:
> 06:47:45 Build timed out (after 100 minutes). Marking the build as aborted.
> 06:47:45 Build was aborted
> Log: 
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/5214/console]
>  
> Should we increase the timeout here similar to : 
> [https://github.com/apache/beam/pull/10234]
> cc: [~Ardagan]
>  





[jira] [Updated] (BEAM-9000) Java Test Assertions without toString for GenericJson subclasses

2019-12-19 Thread Tomo Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomo Suzuki updated BEAM-9000:
--
Description: 
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style causes unnecessary test maintenance when upgrading dependencies, 
because a dependency may change the internal ordering of fields or make other 
trivial changes to its {{toString()}} output. In BEAM-8695, where I tried to 
upgrade google-http-client, there were many comparison failures caused by 
these {{toString()}} assertions.

The asserted objects are subclasses of {{com.google.api.client.json.GenericJson}}.

There are several options for improving these assertions.
h1. Option 1: Assertion using Map

Leveraging the fact that GenericJson is a subclass of AbstractMap, the assertion can be written as
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}
h1. Option 2: Create assertEqualsOnJson

Leveraging the fact that instances of GenericJson can be instantiated from 
JSON, the assertion can be written as
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
 

{{assertEqualsOnJson}} is implemented as below. The following field and methods 
should go into a shared test utility class (sdks/testing?):
{code:java}
  private static final JacksonFactory jacksonFactory =
      JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    // The expected type is hard-coded to CounterUpdate, as in the example above.
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}

  was:
As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style causes unnecessary test maintenance when upgrading dependencies, 
because a dependency may change the internal ordering of fields or make other 
trivial changes to its {{toString()}} output. In BEAM-8695, where I tried to 
upgrade google-http-client, there were many comparison failures caused by 
these {{toString()}} assertions.

There are several options for improving these assertions.
h1. Option 1: Assertion using Map
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}
h1. Option 2: Create assertEqualsOnJson
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
The following field and methods should go into a shared test utility class 
(sdks/testing?):
{code:java}
  private static final JacksonFactory jacksonFactory =
      JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    // The expected type is hard-coded to CounterUpdate, as in the example above.
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}


> Java Test Assertions without toString for GenericJson subclasses
> 
>
> Key: BEAM-9000
> URL: https://issues.apache.org/jira/browse/BEAM-9000
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Tomo Suzuki
>Assigne

[jira] [Commented] (BEAM-8195) Quota exceeded for create requests

2019-12-19 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000330#comment-17000330
 ] 

Ahmet Altay commented on BEAM-8195:
---

I believe dataflow_v1b3_client.py is an auto-generated file. Is it possible to 
disable these retries at runtime by calling an API or something similar?

> Quota exceeded for create requests
> --
>
> Key: BEAM-8195
> URL: https://issues.apache.org/jira/browse/BEAM-8195
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Critical
>
> Post commits failed with the following error:
> HttpError accessing 
> :
>  response: <{'server': 'ESF', '-content-encoding': 'gzip', 'content-type': 
> 'application/json; charset=UTF-8', 'content-length': '598', 
> 'transfer-encoding': 'chunked', 'cache-control': 'private', 
> 'x-xss-protection': '0', 'date': 'Tue, 10 Sep 2019 12:02:24 GMT', 'vary': 
> 'Origin, X-Origin, Referer', 'x-frame-options': 'SAMEORIGIN', 'status': 
> '429', 'x-content-type-options': 'nosniff'}>, content <{
>   "error": {
> "code": 429,
> "message": "Quota exceeded for quota metric 
> 'dataflow.googleapis.com/create_requests' and limit 
> 'CreateRequestsPerMinutePerUser' of service 'dataflow.googleapis.com' for 
> consumer 'project_number:844138762903'.",
> "status": "RESOURCE_EXHAUSTED",
> "details": [
>   {
> Could we increase the quota?
> /cc [~alanmyrvold] [~kenn]





[jira] [Created] (BEAM-9000) Java Test Assertions without toString for GenericJson subclasses

2019-12-19 Thread Tomo Suzuki (Jira)
Tomo Suzuki created BEAM-9000:
-

 Summary: Java Test Assertions without toString for GenericJson 
subclasses
 Key: BEAM-9000
 URL: https://issues.apache.org/jira/browse/BEAM-9000
 Project: Beam
  Issue Type: Improvement
  Components: testing
Reporter: Tomo Suzuki
Assignee: Tomo Suzuki


As of now, there are many tests that assert on {{toString()}} of objects.
{code:java}
CounterUpdate result = testObject.transform(monitoringInfo);
assertEquals(
"{cumulative=true, integer={highBits=0, lowBits=0}, "
+ "nameAndKind={kind=SUM, "
+ "name=transformedValue-ElementCount}}",
result.toString());
{code}
This style causes unnecessary test maintenance when upgrading dependencies, 
because a dependency may change the internal ordering of fields or make other 
trivial changes to its {{toString()}} output. In BEAM-8695, where I tried to 
upgrade google-http-client, there were many comparison failures caused by 
these {{toString()}} assertions.

There are several options for improving these assertions.
h1. Option 1: Assertion using Map
{code:java}
ImmutableMap expected = ImmutableMap.of("cumulative", true,
"integer", ImmutableMap.of("highBits", 0, "lowBits", 0),
"nameAndKind", ImmutableMap.of("kind", "SUM", "name", 
"transformedValue-ElementCount"));

assertEquals(expected, (Map)result);
{code}
h1. Option 2: Create assertEqualsOnJson
{code:java}
assertEqualsOnJson(
"{\"cumulative\":true, \"integer\":{\"highBits\":0, \"lowBits\":0}, "
+ "\"nameAndKind\":{\"kind\":\"SUM\", "
+ "\"name\":\"transformedValue-ElementCount\"}}",
result);
{code}
The following field and methods should go into a shared test utility class 
(sdks/testing?):
{code:java}
  private static final JacksonFactory jacksonFactory =
      JacksonFactory.getDefaultInstance();

  public static <T> void assertEqualsOnJson(String expectedJsonText, T actual) {
    // The expected type is hard-coded to CounterUpdate, as in the example above.
    CounterUpdate expected = parse(expectedJsonText, CounterUpdate.class);
    assertEquals(expected, actual);
  }

  public static <T> T parse(String text, Class<T> clazz) {
    try {
      JsonParser parser = jacksonFactory.createJsonParser(text);
      return parser.parse(clazz);
    } catch (IOException ex) {
      throw new IllegalArgumentException("Could not parse the text as " + clazz, ex);
    }
  }
{code}
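
The idea behind both options is not specific to Java: comparing rendered strings is sensitive to field order, while comparing the parsed structures is not. A minimal sketch in Python, where plain dicts stand in for GenericJson objects and the {{render}} helper is hypothetical (it only mimics an order-dependent toString(), it is not part of any library):

```python
import json


def render(fields):
    """Hypothetical stand-in for an order-dependent toString()."""
    return '{' + ', '.join('%s=%s' % kv for kv in fields.items()) + '}'


# Same content, different field order, as might happen after a
# dependency upgrade changes the internal ordering.
before = {'cumulative': True, 'kind': 'SUM'}
after = {'kind': 'SUM', 'cumulative': True}

# String comparison is order-sensitive, so this assertion style breaks.
strings_match = render(before) == render(after)

# Map comparison (Option 1) ignores ordering.
maps_match = before == after

# Parsing expected JSON into a structure first (Option 2) also ignores ordering.
json_match = json.loads('{"kind": "SUM", "cumulative": true}') == before
```

Here only the string comparison is affected by the reordering, which is the maintenance cost the issue describes.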





[jira] [Work logged] (BEAM-8977) apache_beam.runners.interactive.display.pcoll_visualization_test.PCollectionVisualizationTest.test_dynamic_plotting_update_same_display is flaky

2019-12-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8977?focusedWorklogId=361482&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-361482
 ]

ASF GitHub Bot logged work on BEAM-8977:


Author: ASF GitHub Bot
Created on: 19/Dec/19 19:33
Start Date: 19/Dec/19 19:33
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on pull request #10404: [BEAM-8977] 
Resolve test flakiness
URL: https://github.com/apache/beam/pull/10404
 
 
   1. Removed test logic that depends on the execution of asynchronous tasks, 
   since there is no control over them in a testing environment.
   2. Replaced the dynamic plotting tests with tests that directly or indirectly 
   invoke the underlying logic of the asynchronous task.
   
   
   

[jira] [Commented] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2019-12-19 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000313#comment-17000313
 ] 

Udi Meiri commented on BEAM-8944:
-

Does this bug affect current production runners (such as the Dataflow runner 
mentioned in the TODO in #10387)?

> Python SDK harness performance degradation with UnboundedThreadPoolExecutor
> ---
>
> Key: BEAM-8944
> URL: https://issues.apache.org/jira/browse/BEAM-8944
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.18.0
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Blocker
> Fix For: 2.18.0
>
> Attachments: profiling.png, profiling_one_thread.png, 
> profiling_twelve_threads.png
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> We are seeing a performance degradation in the Python streaming word count 
> load tests.
>  
> After some investigation, it appears to be caused by swapping the original 
> ThreadPoolExecutor for UnboundedThreadPoolExecutor in the SDK worker. The 
> suspicion is that Python performance is worse with more threads on CPU-bound 
> tasks.
>  
> A simple test comparing the performance of the different thread pool executors:
>  
> {code:python}
> def test_performance(self):
>   def run_perf(executor):
>     total_number = 100
>     q = queue.Queue()
>
>     def task(number):
>       hash(number)
>       q.put(number + 200)
>       return number
>
>     t = time.time()
>     count = 0
>     for i in range(200):
>       q.put(i)
>     while count < total_number:
>       executor.submit(task, q.get(block=True))
>       count += 1
>     print('%s uses %s' % (executor, time.time() - t))
>
>   with UnboundedThreadPoolExecutor() as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=1) as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=12) as executor:
>     run_perf(executor)
> {code}
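> Results (in the order the executors are run above):
> UnboundedThreadPoolExecutor uses 268.160675049
> ThreadPoolExecutor (max_workers=1) uses 79.904583931
> ThreadPoolExecutor (max_workers=12) uses 191.179054976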
> Profiling:
> UnboundedThreadPoolExecutor:
>  !profiling.png! 
> 1 Thread ThreadPoolExecutor:
>  !profiling_one_thread.png! 
> 12 Threads ThreadPoolExecutor:
>  !profiling_twelve_threads.png! 
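
The comparison in the description depends on Beam's UnboundedThreadPoolExecutor. As a rough, standalone way to see how pool size affects CPU-bound work, the sketch below uses only the standard library's ThreadPoolExecutor. The names {{cpu_task}} and {{time_pool}} and the task sizes are assumptions made for illustration, not taken from the issue, and timings will vary by machine.

```python
import time
from concurrent import futures


def cpu_task(n):
    # Pure-Python arithmetic; in CPython this holds the GIL while it runs.
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_pool(max_workers, num_tasks=200, work=2000):
    """Return (elapsed_seconds, results) for num_tasks CPU-bound tasks."""
    start = time.time()
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(cpu_task, [work] * num_tasks))
    return time.time() - start, results


if __name__ == '__main__':
    for workers in (1, 12):
        elapsed, _ = time_pool(workers)
        print('max_workers=%d uses %.4f s' % (workers, elapsed))
```

Because the tasks are CPU-bound, extra threads mostly add scheduling and lock contention rather than parallelism, which is consistent with the suspicion stated in the description.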





[jira] [Comment Edited] (BEAM-8195) Quota exceeded for create requests

2019-12-19 Thread Alan Myrvold (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000297#comment-17000297
 ] 

Alan Myrvold edited comment on BEAM-8195 at 12/19/19 7:12 PM:
--

I also see inconsistent retry behavior between the Python SDK and Java SDK.

For 429 errors, Python retries 4 times and Java retries 5 times.

For 503 errors, Python retries 19 times and Java retries 5 times.

Python's submit_job_description uses the default retry_on_server_errors_filter 
which does not retry on 4xx errors.

The extra Python retries are in apitools/base/py/base_api.py and can be 
disabled in 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
so that Beam handles the retries itself.


To verify, I ran one of the following locally:
 {{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 503 Message\n" | nc -q 1 -N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}
 or
 {{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 429 Message\n" | nc -q 1 -N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}

I then ran a Dataflow pipeline with 
--dataflow_endpoint=[http://localhost:1500|http://localhost:1500/] (Python) or 
--dataflowEndpoint=[http://localhost:1500|http://localhost:1500/] (Java).
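
For anyone who wants to repeat this check without nc, a rough Python equivalent of the loop above is sketched below. It starts a local stub server that answers every request with a fixed status and records each hit. The names {{start_stub_server}} and {{probe}} are hypothetical, and the stub does not read request bodies, so it is only meant for counting attempts.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def start_stub_server(status_code, port=0):
    """Start a server that answers every request with status_code.

    Returns (server, hits); hits gains one entry per request received.
    port=0 asks the OS for a free port (the nc loop used 1500).
    """
    hits = []

    class AlwaysFail(BaseHTTPRequestHandler):
        def _reply(self):
            hits.append(self.command)
            self.send_response(status_code)
            self.send_header('Content-Length', '0')
            self.end_headers()

        do_GET = do_POST = do_PUT = _reply

        def log_message(self, fmt, *args):
            pass  # keep the console quiet

    server = HTTPServer(('127.0.0.1', port), AlwaysFail)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, hits


def probe(status_code, attempts):
    """Send GET requests to a fresh stub; return (statuses, hit count)."""
    server, hits = start_stub_server(status_code)
    host, port = server.server_address
    statuses = []
    for _ in range(attempts):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request('GET', '/')
        statuses.append(conn.getresponse().status)
        conn.close()
    server.shutdown()
    server.server_close()
    return statuses, len(hits)
```

To count SDK retries, one would call start_stub_server(503, port=1500) or start_stub_server(429, port=1500), point the pipeline at the endpoint as described above, and read len(hits) afterwards. The stub binds to 127.0.0.1, so the endpoint may need to be given as http://127.0.0.1:1500 if localhost resolves to IPv6 first.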


was (Author: alanmyrvold):
I also see inconsistent retry behavior between the Python SDK and Java SDK.

For 429 errors, Python retries 4 times and Java retries 5 times.

For 503 errors, Python retries 19 times and Java retries 5 times.

Python's submit_job_description uses the default retry_on_server_errors_filter 
which does not retry on 4xx errors.

Not sure where the extra Python retries are happening, but likely at the HTTP 
transport level.
To verify, I ran one of the following locally:
{{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 503 Message\n" | nc -q 1 -N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}
or
{{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 429 Message\n" | nc -q 1 -N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}

I then ran a Dataflow pipeline with 
--dataflow_endpoint=http://localhost:1500 (Python) or 
--dataflowEndpoint=http://localhost:1500 (Java).

> Quota exceeded for create requests
> --
>
> Key: BEAM-8195
> URL: https://issues.apache.org/jira/browse/BEAM-8195
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Critical
>
> Post commits failed with the following error:
> HttpError accessing 
> :
>  response: <{'server': 'ESF', '-content-encoding': 'gzip', 'content-type': 
> 'application/json; charset=UTF-8', 'content-length': '598', 
> 'transfer-encoding': 'chunked', 'cache-control': 'private', 
> 'x-xss-protection': '0', 'date': 'Tue, 10 Sep 2019 12:02:24 GMT', 'vary': 
> 'Origin, X-Origin, Referer', 'x-frame-options': 'SAMEORIGIN', 'status': 
> '429', 'x-content-type-options': 'nosniff'}>, content <{
>   "error": {
> "code": 429,
> "message": "Quota exceeded for quota metric 
> 'dataflow.googleapis.com/create_requests' and limit 
> 'CreateRequestsPerMinutePerUser' of service 'dataflow.googleapis.com' for 
> consumer 'project_number:844138762903'.",
> "status": "RESOURCE_EXHAUSTED",
> "details": [
>   {
> Could we increase the quota?
> /cc [~alanmyrvold] [~kenn]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8989) Backwards incompatible change in ParDo.getSideInputs (caught by failure when running Apache Nemo quickstart)

2019-12-19 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-8989:
---

Assignee: Reuven Lax

> Backwards incompatible change in ParDo.getSideInputs (caught by failure when 
> running Apache Nemo quickstart)
> 
>
> Key: BEAM-8989
> URL: https://issues.apache.org/jira/browse/BEAM-8989
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.16.0, 2.17.0, 2.18.0
>Reporter: Luke Cwik
>Assignee: Reuven Lax
>Priority: Critical
> Fix For: 2.19.0
>
>
> [PR/9275|https://github.com/apache/beam/pull/9275] changed 
> *ParDo.getSideInputs* from *List<PCollectionView<?>>* to 
> *Map<String, PCollectionView<?>>*, which is a backwards-incompatible change 
> and was erroneously released as part of Beam 2.16.0.
> Running the Apache Nemo Quickstart fails with:
>  
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Translator private static void org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(org.apache.nemo.compiler.frontend.beam.PipelineTranslationContext,org.apache.beam.sdk.runners.TransformHierarchy$Node,org.apache.beam.sdk.transforms.ParDo$MultiOutput) have failed to translate org.apache.beam.examples.WordCount$ExtractWordsFn@600b9d27
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:113)
>   at org.apache.nemo.compiler.frontend.beam.PipelineVisitor.visitPrimitiveTransform(PipelineVisitor.java:46)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
>   at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
>   at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:460)
>   at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:80)
>   at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:31)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
>   at org.apache.beam.examples.WordCount.runWordCount(WordCount.java:185)
>   at org.apache.beam.examples.WordCount.main(WordCount.java:192)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:109)
>   ... 14 more
> Caused by: java.lang.NoSuchMethodError: org.apache.beam.sdk.transforms.ParDo$MultiOutput.getSideInputs()Ljava/util/List;
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(PipelineTranslator.java:236)
>   ... 19 more
> {code}
>  





[jira] [Commented] (BEAM-8989) Backwards incompatible change in ParDo.getSideInputs (caught by failure when running Apache Nemo quickstart)

2019-12-19 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000308#comment-17000308
 ] 

Udi Meiri commented on BEAM-8989:
-

Decided to push this back to 2.19 since 2.16 already has this regression and it 
seems to not have affected other usages.

> Backwards incompatible change in ParDo.getSideInputs (caught by failure when 
> running Apache Nemo quickstart)
> 
>
> Key: BEAM-8989
> URL: https://issues.apache.org/jira/browse/BEAM-8989
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.16.0, 2.17.0
>Reporter: Luke Cwik
>Priority: Critical
> Fix For: 2.18.0
>
>
> [PR/9275|https://github.com/apache/beam/pull/9275] changed 
> *ParDo.getSideInputs* from *List<PCollectionView<?>>* to 
> *Map<String, PCollectionView<?>>*, which is a backwards-incompatible change 
> and was erroneously released as part of Beam 2.16.0.
> Running the Apache Nemo Quickstart fails with:
>  
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Translator private static void org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(org.apache.nemo.compiler.frontend.beam.PipelineTranslationContext,org.apache.beam.sdk.runners.TransformHierarchy$Node,org.apache.beam.sdk.transforms.ParDo$MultiOutput) have failed to translate org.apache.beam.examples.WordCount$ExtractWordsFn@600b9d27
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:113)
>   at org.apache.nemo.compiler.frontend.beam.PipelineVisitor.visitPrimitiveTransform(PipelineVisitor.java:46)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
>   at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
>   at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:460)
>   at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:80)
>   at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:31)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
>   at org.apache.beam.examples.WordCount.runWordCount(WordCount.java:185)
>   at org.apache.beam.examples.WordCount.main(WordCount.java:192)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:109)
>   ... 14 more
> Caused by: java.lang.NoSuchMethodError: org.apache.beam.sdk.transforms.ParDo$MultiOutput.getSideInputs()Ljava/util/List;
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(PipelineTranslator.java:236)
>   ... 19 more
> {code}
>  





[jira] [Updated] (BEAM-8989) Backwards incompatible change in ParDo.getSideInputs (caught by failure when running Apache Nemo quickstart)

2019-12-19 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-8989:

Affects Version/s: 2.18.0

> Backwards incompatible change in ParDo.getSideInputs (caught by failure when 
> running Apache Nemo quickstart)
> 
>
> Key: BEAM-8989
> URL: https://issues.apache.org/jira/browse/BEAM-8989
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.16.0, 2.17.0, 2.18.0
>Reporter: Luke Cwik
>Priority: Critical
> Fix For: 2.19.0
>
>
> [PR/9275|https://github.com/apache/beam/pull/9275] changed 
> *ParDo.getSideInputs* from *List<PCollectionView<?>>* to 
> *Map<String, PCollectionView<?>>*, which is a backwards-incompatible change 
> and was erroneously released as part of Beam 2.16.0.
> Running the Apache Nemo Quickstart fails with:
>  
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Translator private static void org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(org.apache.nemo.compiler.frontend.beam.PipelineTranslationContext,org.apache.beam.sdk.runners.TransformHierarchy$Node,org.apache.beam.sdk.transforms.ParDo$MultiOutput) have failed to translate org.apache.beam.examples.WordCount$ExtractWordsFn@600b9d27
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:113)
>   at org.apache.nemo.compiler.frontend.beam.PipelineVisitor.visitPrimitiveTransform(PipelineVisitor.java:46)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
>   at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
>   at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:460)
>   at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:80)
>   at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:31)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
>   at org.apache.beam.examples.WordCount.runWordCount(WordCount.java:185)
>   at org.apache.beam.examples.WordCount.main(WordCount.java:192)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:109)
>   ... 14 more
> Caused by: java.lang.NoSuchMethodError: org.apache.beam.sdk.transforms.ParDo$MultiOutput.getSideInputs()Ljava/util/List;
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(PipelineTranslator.java:236)
>   ... 19 more
> {code}
>  





[jira] [Updated] (BEAM-8989) Backwards incompatible change in ParDo.getSideInputs (caught by failure when running Apache Nemo quickstart)

2019-12-19 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-8989:

Fix Version/s: (was: 2.18.0)
   2.19.0

> Backwards incompatible change in ParDo.getSideInputs (caught by failure when 
> running Apache Nemo quickstart)
> 
>
> Key: BEAM-8989
> URL: https://issues.apache.org/jira/browse/BEAM-8989
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.16.0, 2.17.0
>Reporter: Luke Cwik
>Priority: Critical
> Fix For: 2.19.0
>
>
> [PR/9275|https://github.com/apache/beam/pull/9275] changed 
> *ParDo.getSideInputs* from *List<PCollectionView<?>>* to *Map<String, 
> PCollectionView<?>>*, which is a backwards-incompatible change and was 
> released as part of Beam 2.16.0 erroneously.
> Running the Apache Nemo Quickstart fails with:
>  
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Translator private static void org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(org.apache.nemo.compiler.frontend.beam.PipelineTranslationContext,org.apache.beam.sdk.runners.TransformHierarchy$Node,org.apache.beam.sdk.transforms.ParDo$MultiOutput) have failed to translate org.apache.beam.examples.WordCount$ExtractWordsFn@600b9d27
> Exception in thread "main" java.lang.RuntimeException: Translator private static void org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(org.apache.nemo.compiler.frontend.beam.PipelineTranslationContext,org.apache.beam.sdk.runners.TransformHierarchy$Node,org.apache.beam.sdk.transforms.ParDo$MultiOutput) have failed to translate org.apache.beam.examples.WordCount$ExtractWordsFn@600b9d27
>     at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:113)
>     at org.apache.nemo.compiler.frontend.beam.PipelineVisitor.visitPrimitiveTransform(PipelineVisitor.java:46)
>     at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
>     at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>     at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>     at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>     at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
>     at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
>     at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:460)
>     at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:80)
>     at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:31)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
>     at org.apache.beam.examples.WordCount.runWordCount(WordCount.java:185)
>     at org.apache.beam.examples.WordCount.main(WordCount.java:192)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:109)
>     ... 14 more
> Caused by: java.lang.NoSuchMethodError: org.apache.beam.sdk.transforms.ParDo$MultiOutput.getSideInputs()Ljava/util/List;
>     at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(PipelineTranslator.java:236)
>     ... 19 more
> {code}
>  





[jira] [Work logged] (BEAM-8999) PGBKCVOperation does not respect timestamp combiners

2019-12-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8999?focusedWorklogId=361481&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-361481
 ]

ASF GitHub Bot logged work on BEAM-8999:


Author: ASF GitHub Bot
Created on: 19/Dec/19 19:05
Start Date: 19/Dec/19 19:05
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #10425: [BEAM-8999] 
Respect timestamp combiners in PGBKCVOperation.
URL: https://github.com/apache/beam/pull/10425
 
 
   This fixes a correctness issue on non-direct runners.
   
   
   

[jira] [Comment Edited] (BEAM-8195) Quota exceeded for create requests

2019-12-19 Thread Alan Myrvold (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000297#comment-17000297
 ] 

Alan Myrvold edited comment on BEAM-8195 at 12/19/19 6:48 PM:
--

I also see inconsistent retry behavior between the Python SDK and Java SDK.

For 429 errors, Python retries 4 times and Java retries 5 times.

For 503 errors, Python retries 19 times and Java retries 5 times.

Python's submit_job_description uses the default retry_on_server_errors_filter 
which does not retry on 4xx errors.
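
To make the expected behavior of such a filter concrete, here is a minimal sketch of a retry loop driven by a filter that accepts only 5xx errors. Everything in it (`FakeHttpError`, `retry_on_server_errors_only`, `call_with_retries`, the limit of 4 retries) is a hypothetical stand-in, not the Beam implementation, and it omits any backoff or sleeping.

```python
class FakeHttpError(Exception):
    """Hypothetical stand-in for an HTTP client error carrying a status code."""

    def __init__(self, status_code):
        super().__init__("HTTP %d" % status_code)
        self.status_code = status_code


def retry_on_server_errors_only(exc):
    """Return True only for 5xx responses; 4xx (including 429) is not retried."""
    return isinstance(exc, FakeHttpError) and exc.status_code >= 500


def call_with_retries(fn, retry_filter, max_retries=4):
    """Call fn, retrying up to max_retries times when retry_filter allows it.

    Re-raises the error when retries run out or the filter rejects it.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn()
        except Exception as exc:
            if attempts > max_retries or not retry_filter(exc):
                raise


def count_attempts(status_code):
    """Count how many times a call that always fails with status_code is made."""
    calls = []

    def always_fail():
        calls.append(status_code)
        raise FakeHttpError(status_code)

    try:
        call_with_retries(always_fail, retry_on_server_errors_only)
    except FakeHttpError:
        pass
    return len(calls)
```

With a filter like this, a 429 should produce exactly one request, so any additional requests seen for 429 would have to come from another layer.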

Not sure where the extra Python retries are happening, but likely at the HTTP 
transport level.
To verify, I ran:
{{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 503 Message\n" | nc -q 1 
-N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}
or
{{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 429 Message\n" | nc -q 1 
-N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}

I ran one of these loops locally, then ran a Dataflow pipeline with 
{{--dataflow_endpoint=http://localhost:1500}} (Python) or 
{{--dataflowEndpoint=http://localhost:1500}} (Java)
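
For anyone without a suitable `nc`, the same experiment can be approximated with a small Python server that always answers with a fixed status and records each request. This is a sketch only: the helper names are made up for illustration, and the default port 1500 just mirrors the commands above.

```python
import http.server
import threading
import urllib.error
import urllib.request


def make_handler(status_code, request_log):
    """Build a handler class that always replies with status_code."""

    class FixedStatusHandler(http.server.BaseHTTPRequestHandler):
        def _reply(self):
            # Drain any request body so the client is not cut off mid-send.
            length = int(self.headers.get("Content-Length") or 0)
            self.rfile.read(length)
            request_log.append((self.command, self.path))
            self.send_response(status_code, "Message")
            self.send_header("Content-Length", "0")
            self.end_headers()

        do_GET = _reply
        do_POST = _reply

        def log_message(self, format, *args):
            pass  # keep the console quiet; requests are recorded in request_log

    return FixedStatusHandler


def start_fixed_status_server(status_code, port=1500):
    """Start the server on a background thread; return (server, request_log)."""
    request_log = []
    server = http.server.HTTPServer(
        ("127.0.0.1", port), make_handler(status_code, request_log))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, request_log
```

After starting it with 503 or 429 and running a pipeline against `http://localhost:1500`, `len(request_log)` gives the number of attempts the SDK made.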


> Quota exceeded for create requests
> --
>
> Key: BEAM-8195
> URL: https://issues.apache.org/jira/browse/BEAM-8195
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Critical
>
> Post commits failed with the following error:
> HttpError accessing 
> :
>  response: <{'server': 'ESF', '-content-encoding': 'gzip', 'content-type': 
> 'application/json; charset=UTF-8', 'content-length': '598', 
> 'transfer-encoding': 'chunked', 'cache-control': 'private', 
> 'x-xss-protection': '0', 'date': 'Tue, 10 Sep 2019 12:02:24 GMT', 'vary': 
> 'Origin, X-Origin, Referer', 'x-frame-options': 'SAMEORIGIN', 'status': 
> '429', 'x-content-type-options': 'nosniff'}>, content <{
>   "error": {
> "code": 429,
> "message": "Quota exceeded for quota metric 
> 'dataflow.googleapis.com/create_requests' and limit 
> 'CreateRequestsPerMinutePerUser' of service 'dataflow.googleapis.com' for 
> consumer 'project_number:844138762903'.",
> "status": "RESOURCE_EXHAUSTED",
> "details": [
>   {
> Could we increase the quota?
> /cc [~alanmyrvold] [~kenn]





[jira] [Created] (BEAM-8999) PGBKCVOperation does not respect timestamp combiners

2019-12-19 Thread Robert Bradshaw (Jira)
Robert Bradshaw created BEAM-8999:
-

 Summary: PGBKCVOperation does not respect timestamp combiners
 Key: BEAM-8999
 URL: https://issues.apache.org/jira/browse/BEAM-8999
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Reporter: Robert Bradshaw


We prevent lifting in the FnAPI runner in this case, but other optimizers (e.g. 
the Greedy Fuser and Dataflow) do not, resulting in incorrect timestamps. 
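
A toy illustration of the general failure mode, not the actual PGBKCVOperation code: when a combine is lifted into a per-bundle pre-combine, the partial results need timestamps that agree with the timestamp combiner, otherwise the final timestamp differs from the unlifted result. The function names, the `min` combiner and the end-of-window stamping below are all assumptions chosen for the example.

```python
from collections import defaultdict

WINDOW_END = 100  # hypothetical end-of-window timestamp


def combine_after_grouping(elements, timestamp_combiner=min):
    """Reference behavior: group by key, sum values, combine input timestamps."""
    grouped = defaultdict(list)
    for key, value, ts in elements:
        grouped[key].append((value, ts))
    return {
        key: (sum(v for v, _ in items),
              timestamp_combiner(ts for _, ts in items))
        for key, items in grouped.items()
    }


def lifted_ignoring_timestamps(bundles):
    """Pre-combine per bundle but stamp every partial result with WINDOW_END."""
    partials = []
    for bundle in bundles:
        acc = defaultdict(int)
        for key, value, _ts in bundle:
            acc[key] += value
        partials.extend((key, total, WINDOW_END) for key, total in acc.items())
    return combine_after_grouping(partials)


def lifted_respecting_timestamps(bundles, timestamp_combiner=min):
    """Pre-combine per bundle and carry the combined timestamp alongside."""
    partials = []
    for bundle in bundles:
        acc = {}
        for key, value, ts in bundle:
            if key in acc:
                total, combined_ts = acc[key]
                acc[key] = (total + value, timestamp_combiner((combined_ts, ts)))
            else:
                acc[key] = (value, ts)
        partials.extend((key, total, ts) for key, (total, ts) in acc.items())
    return combine_after_grouping(partials, timestamp_combiner)
```

In this sketch the combined values agree in all three paths; only the output timestamp of the timestamp-ignoring path diverges from the reference.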





[jira] [Commented] (BEAM-8882) Allow Dataflow to automatically choose portability or not.

2019-12-19 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000300#comment-17000300
 ] 

Udi Meiri commented on BEAM-8882:
-

Cherrypick has been merged to the release branch. Can this be closed?

> Allow Dataflow to automatically choose portability or not.
> --
>
> Key: BEAM-8882
> URL: https://issues.apache.org/jira/browse/BEAM-8882
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.18.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> We would like the Dataflow service to be able to automatically choose whether 
> to run pipelines in a portable way. In order to do this, we need to provide 
> more information even if portability is not explicitly requested. 





[jira] [Comment Edited] (BEAM-8195) Quota exceeded for create requests

2019-12-19 Thread Alan Myrvold (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000297#comment-17000297
 ] 

Alan Myrvold edited comment on BEAM-8195 at 12/19/19 6:47 PM:
--

I also see inconsistent retry behavior between the Python SDK and Java SDK.

For 429 errors, Python retries 4 times and Java retries 5 times.

For 503 errors, Python retries 19 times and Java retries 5 times.

Python's submit_job_description uses the default retry_on_server_errors_filter 
which does not retry on 4xx errors.

Not sure where the extra Python retries are happening, but likely at the HTTP 
transport level.
To verify, I ran:
{{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 503 Message\n" | nc -q 1 
-N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}
or
{{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 429 Message\n" | nc -q 1 
-N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}

I ran one of these loops locally, then ran a Dataflow pipeline with 
{{--dataflow_endpoint=http://localhost:1500}} (Python) or 
{{--dataflowEndpoint=http://localhost:1500}} (Java)


> Quota exceeded for create requests
> --
>
> Key: BEAM-8195
> URL: https://issues.apache.org/jira/browse/BEAM-8195
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Critical
>
> Post commits failed with the following error:
> HttpError accessing 
> :
>  response: <{'server': 'ESF', '-content-encoding': 'gzip', 'content-type': 
> 'application/json; charset=UTF-8', 'content-length': '598', 
> 'transfer-encoding': 'chunked', 'cache-control': 'private', 
> 'x-xss-protection': '0', 'date': 'Tue, 10 Sep 2019 12:02:24 GMT', 'vary': 
> 'Origin, X-Origin, Referer', 'x-frame-options': 'SAMEORIGIN', 'status': 
> '429', 'x-content-type-options': 'nosniff'}>, content <{
>   "error": {
> "code": 429,
> "message": "Quota exceeded for quota metric 
> 'dataflow.googleapis.com/create_requests' and limit 
> 'CreateRequestsPerMinutePerUser' of service 'dataflow.googleapis.com' for 
> consumer 'project_number:844138762903'.",
> "status": "RESOURCE_EXHAUSTED",
> "details": [
>   {
> Could we increase the quota?
> /cc [~alanmyrvold] [~kenn]





[jira] [Commented] (BEAM-8195) Quota exceeded for create requests

2019-12-19 Thread Alan Myrvold (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000297#comment-17000297
 ] 

Alan Myrvold commented on BEAM-8195:


I also see inconsistent retry behavior between the Python SDK and Java SDK.
For 429 errors, Python retries 4 times and Java retries 5 times. For 503 errors, 
Python retries 19 times and Java retries 5 times.
Python's submit_job_description uses the default retry_on_server_errors_filter 
which does not retry on 4xx errors. Not sure where the extra Python retries are 
happening, but likely at the HTTP transport level.
To verify, I ran:
{{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 503 Message\n" | nc -q 1 
-N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}
or
{{a=0; while true ; do a=$((a+1)); echo -e "HTTP/1.1 429 Message\n" | nc -q 1 
-N -l -p 1500 > /dev/null  ; echo "req $a $(date)"; done}}

I ran one of these loops locally, then ran a Dataflow pipeline with 
--dataflow_endpoint=http://localhost:1500 (Python) or 
--dataflowEndpoint=http://localhost:1500 (Java)

> Quota exceeded for create requests
> --
>
> Key: BEAM-8195
> URL: https://issues.apache.org/jira/browse/BEAM-8195
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Critical
>
> Post commits failed with the following error:
> HttpError accessing 
> :
>  response: <{'server': 'ESF', '-content-encoding': 'gzip', 'content-type': 
> 'application/json; charset=UTF-8', 'content-length': '598', 
> 'transfer-encoding': 'chunked', 'cache-control': 'private', 
> 'x-xss-protection': '0', 'date': 'Tue, 10 Sep 2019 12:02:24 GMT', 'vary': 
> 'Origin, X-Origin, Referer', 'x-frame-options': 'SAMEORIGIN', 'status': 
> '429', 'x-content-type-options': 'nosniff'}>, content <{
>   "error": {
> "code": 429,
> "message": "Quota exceeded for quota metric 
> 'dataflow.googleapis.com/create_requests' and limit 
> 'CreateRequestsPerMinutePerUser' of service 'dataflow.googleapis.com' for 
> consumer 'project_number:844138762903'.",
> "status": "RESOURCE_EXHAUSTED",
> "details": [
>   {
> Could we increase the quota?
> /cc [~alanmyrvold] [~kenn]





[jira] [Commented] (BEAM-8991) RuntimeError in log_handler_test

2019-12-19 Thread Ning Kang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000296#comment-17000296
 ] 

Ning Kang commented on BEAM-8991:
-

Got it, thanks!

 

I've also experienced something interesting when running tests.

If you intentionally log something at warning level in the __main__ scope, 
that warning will fail tests when running on Jenkins. Some Gradle tasks 
just exit when they see warning-level logs.

However, if you move the warning log to a local scope, such as a class 
constructor, the warning does not fail those Gradle tasks.

> RuntimeError in log_handler_test
> 
>
> Key: BEAM-8991
> URL: https://issues.apache.org/jira/browse/BEAM-8991
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ning Kang
>Priority: Major
> Fix For: Not applicable
>
>
> Now is:
> {code:java}
> apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info (from py27-gcp-pytest)
> Failing for the past 1 build (Since #1290)
> Took 78 ms.
> Error Message
> IndexError: list index out of range
> Stacktrace
> self = 
>  testMethod=test_exc_info>
> def test_exc_info(self):
>   try:
>     raise ValueError('some message')
>   except ValueError:
>     _LOGGER.error('some error', exc_info=True)
> 
>   self.fn_log_handler.close()
> 
> > log_entry = self.test_logging_service.log_records_received[0].log_entries[0]
> E IndexError: list index out of range
> apache_beam/runners/worker/log_handler_test.py:110: IndexError
> Standard Error
> ERROR:apache_beam.runners.worker.log_handler_test:some error
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Phrase/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> raise ValueError('some message')
> ValueError: some message
> {code}
>  Marking it as a duplicate of BEAM-8974.
> Was:
> {code:java}
> 19:28:06 > Task :sdks:python:test-suites:tox:py35:testPy35Cython
> .Exception in thread Thread-1715:
> 19:28:06 Traceback (most recent call last):
> 19:28:06   File "apache_beam/runners/common.py", line 879, in 
> apache_beam.runners.common.DoFnRunner.process
> 19:28:06 return self.do_fn_invoker.invoke_process(windowed_value)
> 19:28:06   File "apache_beam/runners/common.py", line 495, in 
> apache_beam.runners.common.SimpleInvoker.invoke_process
> 19:28:06 windowed_value, self.process_method(windowed_value.value))
> 19:28:06   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit@2/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/transforms/core.py",
>  line 1434, in 
> 19:28:06 wrapper = lambda x: [fn(x)]
> 19:28:06   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit@2/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py",
>  line 620, in raise_error
> 19:28:06 raise RuntimeError('x')
> 19:28:06 RuntimeError: x
> 19:28:06 
> 19:28:06 During handling of the above exception, another exception occurred:
> 19:28:06 
> 19:28:06 Traceback (most recent call last):
> 19:28:06   File "/usr/lib/python3.5/threading.py", line 914, in 
> _bootstrap_inner
> 19:28:06 self.run()
> 19:28:06   File "/usr/lib/python3.5/threading.py", line 862, in run
> 19:28:06 self._target(*self._args, **self._kwargs)
> 19:28:06   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit@2/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/local_job_service.py",
>  line 270, in _run_job
> 19:28:06 self._pipeline_proto)
> 19:28:06   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit@2/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 461, in run_via_runner_api
> 19:28:06 return self.run_stages(stage_context, stages)
> 19:28:06   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit@2/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 553, in run_stages
> 19:28:06 stage_results.process_bundle.monitoring_infos)
> 19:28:06   File "/usr/lib/python3.5/contextlib.py", line 77, in __exit__
> 19:28:06 self.gen.throw(type, value, traceback)
> 19:28:06   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit@2/src/sdks/python/test-suites/tox/py35/build/srcs/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 500, in maybe_profile
> 19:28:06 yield
> 19:28:06   File 
> "/home/jenkins/jenkins-slave/w

[jira] [Commented] (BEAM-7405) Task :sdks:python:hdfsIntegrationTest is failing in Python PostCommits - docker-credential-gcloud not installed

2019-12-19 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000292#comment-17000292
 ] 

Brian Hulette commented on BEAM-7405:
-

This happened again in 
[https://builds.apache.org/job/beam_PostCommit_Python2/1258/]
Gradle build scan: [https://scans.gradle.com/s/wejnwzveleuxu]

{code}
:sdks:python:test-suites:direct:py2:hdfsIntegrationTest FAILED  
++ dirname 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh
  
+ 
TEST_DIR=/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test
   
+ 
ROOT_DIR=/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../..

+ 
CONTEXT_DIR=/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration
  
+ rm -r 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration

rm: cannot remove 
'/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration'

: No such file or directory 
+ true  
+ mkdir -p 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration/sdks

+ cp 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/docker-compose.yml
 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/Dockerfile
 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/hdfscli.cfg
 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh
 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration/
   
+ cp -r 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../sdks/python
 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration/sdks/

+ cp -r 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../model
 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration/
   
++ echo hdfs_IT-jenkins-beam_PostCommit_Python2-1258
+ PROJECT_NAME=hdfs_IT-jenkins-beam_PostCommit_Python2-1258 
+ '[' -z jenkins-beam_PostCommit_Python2-1258 ']'   
+ COLOR_OPT=--no-ansi   
+ COMPOSE_OPT='-p hdfs_IT-jenkins-beam_PostCommit_Python2-1258 --no-ansi'   
+ cd 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python2/src/sdks/python/apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration
   
+ docker network prune --force  
+ trap finally EXIT 
+ docker-compose -p hdfs_IT-jenkins-beam_PostCommit_Python2-1258 --no-ansi 
build --build-arg BASE_IMAGE=python:2
namenode uses an image, skipping
datanode uses an image, skipping
Building test   
[15171] Failed to execute script docker-compose 
Traceback (most recent call last):  
  File "bin/docker-compose", line 6, in 
  File "compose/cli/main.py", line 71, in main  
  File "compose/cli/main.py", line 127, in perform_command  
  File "compose/cli/main.py", line 287, in build
  File "compose/project.py", line 386, in build 
  File "compose/project.py", line 368, in build_service 
  File "compose/service.py", line 1084, in build
  File "site-packages/docker/api/build.py", line 260, in build  
  File "site-packages/docker/api/build.py", line 307, in _set_auth_headers  
  File "site-packages/docker/auth.py", line 310, in get_all_credentials 
  File "site-packages/docker/auth.py", line 262, in 
_resolve_authconfig_credstore   
  File "site-packages/docker/auth.py", line 287, in _get_store_instance 
  File "site-packages/dockerpycreds/store.py", line 25, in __init__ 
dockerpycreds.errors.InitializationError: docker-credential-gcloud not 
installed or not available in PATH   
+ finally   
+ docker-compose -p hdfs_IT-jenkins-beam_PostCommit_Python2-1258 --no-ansi down 
Removing network hdfs_it-jenkins-beam_postcommit_python2-1258_test_net  
Network hdfs_it-jenkins-beam_postcommit_python2-1258_test_net not found.
real 

[jira] [Reopened] (BEAM-7405) Task :sdks:python:hdfsIntegrationTest is failing in Python PostCommits - docker-credential-gcloud not installed

2019-12-19 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette reopened BEAM-7405:
-

> Task :sdks:python:hdfsIntegrationTest is failing in Python PostCommits - 
> docker-credential-gcloud not installed
> ---
>
> Key: BEAM-7405
> URL: https://issues.apache.org/jira/browse/BEAM-7405
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Valentyn Tymofieiev
>Assignee: Yifan Zou
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> This failure happened on apache-beam-jenkins-14.
> {noformat}
> 18:47:03 > Task :sdks:python:hdfsIntegrationTest
> 18:47:03 ++ dirname 
> ./apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh
> 18:47:03 + TEST_DIR=./apache_beam/io/hdfs_integration_test
> 18:47:03 + ROOT_DIR=./apache_beam/io/hdfs_integration_test/../../../../..
> 18:47:03 + 
> CONTEXT_DIR=./apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration
> 18:47:03 + rm -r 
> ./apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration
> 18:47:03 rm: cannot remove 
> './apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration':
>  No such file or directory
> 18:47:03 + true
> 18:47:03 + mkdir -p 
> ./apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration/sdks
> 18:47:03 + cp ./apache_beam/io/hdfs_integration_test/docker-compose.yml 
> ./apache_beam/io/hdfs_integration_test/Dockerfile 
> ./apache_beam/io/hdfs_integration_test/hdfscli.cfg 
> ./apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh 
> ./apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration/
> 18:47:03 + cp -r 
> ./apache_beam/io/hdfs_integration_test/../../../../../sdks/python 
> ./apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration/sdks/
> 18:47:03 + cp -r ./apache_beam/io/hdfs_integration_test/../../../../../model 
> ./apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration/
> 18:47:03 ++ echo hdfs_IT-jenkins-beam_PostCommit_Python_Verify_PR-714
> 18:47:03 + PROJECT_NAME=hdfs_IT-jenkins-beam_PostCommit_Python_Verify_PR-714
> 18:47:03 + '[' -z jenkins-beam_PostCommit_Python_Verify_PR-714 ']'
> 18:47:03 + COLOR_OPT=--no-ansi
> 18:47:03 + COMPOSE_OPT='-p 
> hdfs_IT-jenkins-beam_PostCommit_Python_Verify_PR-714 --no-ansi'
> 18:47:03 + cd 
> ./apache_beam/io/hdfs_integration_test/../../../../../build/hdfs_integration
> 18:47:03 + docker network prune --force
> 18:47:03 + trap finally EXIT
> 18:47:03 + docker-compose -p 
> hdfs_IT-jenkins-beam_PostCommit_Python_Verify_PR-714 --no-ansi build
> 18:47:03 namenode uses an image, skipping
> 18:47:03 datanode uses an image, skipping
> 18:47:03 Building test
> 18:47:03 [29234] Failed to execute script docker-compose
> 18:47:03 Traceback (most recent call last):
> 18:47:03   File "bin/docker-compose", line 6, in 
> 18:47:03   File "compose/cli/main.py", line 71, in main
> 18:47:03   File "compose/cli/main.py", line 127, in perform_command
> 18:47:03   File "compose/cli/main.py", line 287, in build
> 18:47:03   File "compose/project.py", line 386, in build
> 18:47:03   File "compose/project.py", line 368, in build_service
> 18:47:03   File "compose/service.py", line 1084, in build
> 18:47:03   File "site-packages/docker/api/build.py", line 260, in build
> 18:47:03   File "site-packages/docker/api/build.py", line 307, in 
> _set_auth_headers
> 18:47:03   File "site-packages/docker/auth.py", line 310, in 
> get_all_credentials
> 18:47:03   File "site-packages/docker/auth.py", line 262, in 
> _resolve_authconfig_credstore
> 18:47:03   File "site-packages/docker/auth.py", line 287, in 
> _get_store_instance
> 18:47:03   File "site-packages/dockerpycreds/store.py", line 25, in __init__
> 18:47:03 dockerpycreds.errors.InitializationError: docker-credential-gcloud 
> not installed or not available in PATH
> {noformat}
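
The traceback above ends with dockerpycreds failing to find the
docker-credential-gcloud helper on PATH. A minimal pre-flight check is
sketched below in Python, assuming the build is free to run such a check
before invoking docker-compose. The function name missing_credential_helpers
is hypothetical and not part of Beam.

```python
import shutil


def missing_credential_helpers(helpers, which=shutil.which):
    """Return the helper executables that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    depending on the real environment.
    """
    return [name for name in helpers if which(name) is None]


if __name__ == "__main__":
    missing = missing_credential_helpers(["docker-credential-gcloud"])
    if missing:
        print("Not on PATH: " + ", ".join(missing))
```

Failing fast with a message like this would point at the worker's
environment rather than at the test itself.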





[jira] [Commented] (BEAM-8976) No default logging story for Pipeline construction time in Python

2019-12-19 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000289#comment-17000289
 ] 

Udi Meiri commented on BEAM-8976:
-

Fix for 2.18.0 release in https://github.com/apache/beam/pull/10416
Okay to mark this as resolved?

> No default logging story for Pipeline construction time in Python
> -
>
> Key: BEAM-8976
> URL: https://issues.apache.org/jira/browse/BEAM-8976
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.18.0
>
>
> With changes to logging, no logging is happening on the root loggers, and 
> thus, no basic setup is being done.
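
As an illustration of what a missing basic setup means in practice: with
Python's standard logging module, INFO records from a module-level logger
are dropped by default, because the root logger starts at WARNING. The sketch
below uses only the standard library. The helper name ensure_basic_logging is
hypothetical, and this is not necessarily the change made in the pull request
mentioned above.

```python
import logging


def ensure_basic_logging(level=logging.INFO):
    """Make sure the root logger has a handler and the requested level."""
    root = logging.getLogger()
    if not root.handlers:
        # basicConfig() attaches a StreamHandler (stderr) to the root logger.
        logging.basicConfig()
    root.setLevel(level)
    return root


if __name__ == "__main__":
    ensure_basic_logging()
    logging.getLogger(__name__).info("pipeline construction started")
```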





[jira] [Updated] (BEAM-8998) Avoid excessive bundle progress polling in Dataflow Runner

2019-12-19 Thread Yichi Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yichi Zhang updated BEAM-8998:
--
Description: 
The Dataflow Java runner polls the SDK Harness for bundle progress at a 
0.1-second interval and uses the result to decide whether data transfer 
should be throttled. This can potentially overload the SDK Harness. 

We should try to come up with a way to avoid the throttling and lower the 
bundle progress request frequency significantly.

 

Code reference:

frequency setting: 
[https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java#L296]

  was:
Dataflow Java runner uses 0.1 secs interval for polling bundle progress from 
SDK Harness, and use the result to decide whether data delivery should be 
throttled. This can potentially overload SDK Harness. 

We should try to come up with a way to avoid the throttling and lower the 
bundle progress request frequency significantly.

 

Code reference:

frequency setting: 
[https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java#L296]


> Avoid excessive bundle progress polling in Dataflow Runner
> --
>
> Key: BEAM-8998
> URL: https://issues.apache.org/jira/browse/BEAM-8998
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Yichi Zhang
>Priority: Major
>
> The Dataflow Java runner polls the SDK Harness for bundle progress at a 
> 0.1-second interval and uses the result to decide whether data transfer 
> should be throttled. This can potentially overload the SDK Harness. 
> We should try to come up with a way to avoid the throttling and lower the 
> bundle progress request frequency significantly.
>  
> Code reference:
> frequency setting: 
> [https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java#L296]
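
To put the polling rate in perspective, a fixed 0.1-second interval issues
about ten progress requests per second for every active bundle. One possible
way to lower the request rate is a capped exponential backoff between polls.
This is sketched here purely as an assumption and does not describe the
Dataflow runner's behavior or the fix planned for this issue. The names
backoff_intervals, initial, factor and cap are illustrative.

```python
import itertools


def backoff_intervals(initial=0.1, factor=2.0, cap=5.0):
    """Yield polling delays in seconds, growing by `factor` up to `cap`."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def polls_within(duration, intervals):
    """Count how many polls fit into `duration` seconds for a delay sequence."""
    elapsed, count = 0.0, 0
    for delay in intervals:
        elapsed += delay
        if elapsed > duration:
            break
        count += 1
    return count


if __name__ == "__main__":
    fixed = polls_within(60, itertools.repeat(0.1))
    backed_off = polls_within(60, backoff_intervals())
    print("fixed interval polls:", fixed)
    print("backoff polls:", backed_off)
```

The trade-off is that a longer interval also delays whatever decision is
derived from the progress report, so the cap would need to be chosen with
that in mind.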





[jira] [Commented] (BEAM-8974) apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info is flaky

2019-12-19 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000283#comment-17000283
 ] 

Valentyn Tymofieiev commented on BEAM-8974:
---

These errors continue to happen. I see that [~iemejia] removed the 2.18.0 fix 
version tag. The flake was introduced by a PR that was cherry-picked into the 
release branch. Are we confident that this flake does not need to be fixed in 
the release branch?

> apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info
>  is flaky
> 
>
> Key: BEAM-8974
> URL: https://issues.apache.org/jira/browse/BEAM-8974
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The test is failing at apache_beam/runners/worker/log_handler_test.py:110: 
> IndexError
> Added in https://github.com/apache/beam/pull/10292
> Sample job: [https://builds.apache.org/job/beam_PreCommit_Python_Cron/2160/]
> Console logs:
>  {noformat}
> 06:37:37 === FAILURES 
> ===
> 06:37:37 ___ FnApiLogRecordHandlerTest.test_exc_info 
> 
> 06:37:37 [gw1] linux2 -- Python 2.7.12 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/target/.tox-py27-gcp-pytest/py27-gcp-pytest/bin/python
> 06:37:37
> 06:37:37 self = <apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest testMethod=test_exc_info>
> 06:37:37
> 06:37:37 def test_exc_info(self):
> 06:37:37   try:
> 06:37:37 raise ValueError('some message')
> 06:37:37   except ValueError:
> 06:37:37 _LOGGER.error('some error', exc_info=True)
> 06:37:37
> 06:37:37   self.fn_log_handler.close()
> 06:37:37
> 06:37:37 > log_entry = 
> self.test_logging_service.log_records_received[0].log_entries[0]
> 06:37:37 E IndexError: list index out of range
> 06:37:37
> 06:37:37 apache_beam/runners/worker/log_handler_test.py:110: IndexError
> 06:37:37 - Captured stderr call 
> -
> 06:37:37 ERROR:apache_beam.runners.worker.log_handler_test:some error
> 06:37:37 Traceback (most recent call last):
> 06:37:37   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> 06:37:37 raise ValueError('some message')
> 06:37:37 ValueError: some message
> 06:37:37 -- Captured log call 
> ---
> 06:37:37 ERROR
> apache_beam.runners.worker.log_handler_test:log_handler_test.py:106 some error
> 06:37:37 Traceback (most recent call last):
> 06:37:37   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> 06:37:37 raise ValueError('some message')
> 06:37:37 ValueError: some message
> {noformat}
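
The failing assertion indexes log_records_received[0] right after close(),
so the IndexError means no record had reached the test logging service at
that moment. The issue does not state the root cause. If it turns out to be
a timing race, one common test-side mitigation is to wait for the condition
with a timeout before asserting. A generic sketch for Python 3 follows;
wait_until is a hypothetical helper and not part of the Beam test suite.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.01):
    """Poll `predicate` until it is truthy or `timeout` seconds have passed.

    Returns True if the predicate became truthy, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In a test, the call would precede the indexing, for example by asserting
wait_until(lambda: service.log_records_received) first, where service stands
for the test logging service.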





[jira] [Updated] (BEAM-8998) Avoid excessive bundle progress polling in Dataflow Runner

2019-12-19 Thread Yichi Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yichi Zhang updated BEAM-8998:
--
Description: 
The Dataflow Java runner polls the SDK Harness for bundle progress at a 
0.1-second interval and uses the result to decide whether data delivery 
should be throttled. This can potentially overload the SDK Harness. 

We should try to come up with a way to avoid the throttling and lower the 
bundle progress request frequency significantly.

 

Code reference:

frequency setting: 
[https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java#L296]

  was:
Dataflow Java runner uses 0.1 secs interval for polling bundle progress from 
SDK Harness, and use the result to decide whether data delivery should be 
throttled. This can potentially overload SDK Harness. 

We should try to come up with a way to avoid the throttling and lower the 
bundle progress request frequency significantly.


> Avoid excessive bundle progress polling in Dataflow Runner
> --
>
> Key: BEAM-8998
> URL: https://issues.apache.org/jira/browse/BEAM-8998
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Yichi Zhang
>Priority: Major
>
> The Dataflow Java runner polls the SDK Harness for bundle progress at a 
> 0.1-second interval and uses the result to decide whether data delivery 
> should be throttled. This can potentially overload the SDK Harness. 
> We should try to come up with a way to avoid the throttling and lower the 
> bundle progress request frequency significantly.
>  
> Code reference:
> frequency setting: 
> [https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java#L296]





[jira] [Updated] (BEAM-8998) Avoid excessive bundle progress polling in Dataflow Runner

2019-12-19 Thread Yichi Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yichi Zhang updated BEAM-8998:
--
Summary: Avoid excessive bundle progress polling in Dataflow Runner  (was: 
Avoid excessive bundle progress polling in JRH)

> Avoid excessive bundle progress polling in Dataflow Runner
> --
>
> Key: BEAM-8998
> URL: https://issues.apache.org/jira/browse/BEAM-8998
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Yichi Zhang
>Priority: Major
>
> The Dataflow Java runner polls the SDK Harness for bundle progress at a 
> 0.1-second interval and uses the result to decide whether data delivery 
> should be throttled. This can potentially overload the SDK Harness. 
> We should try to come up with a way to avoid the throttling and lower the 
> bundle progress request frequency significantly.





[jira] [Assigned] (BEAM-8989) Backwards incompatible change in ParDo.getSideInputs (caught by failure when running Apache Nemo quickstart)

2019-12-19 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-8989:
---

Assignee: (was: Udi Meiri)

> Backwards incompatible change in ParDo.getSideInputs (caught by failure when 
> running Apache Nemo quickstart)
> 
>
> Key: BEAM-8989
> URL: https://issues.apache.org/jira/browse/BEAM-8989
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.16.0, 2.17.0
>Reporter: Luke Cwik
>Priority: Critical
> Fix For: 2.18.0
>
>
> [PR/9275|https://github.com/apache/beam/pull/9275] changed 
> *ParDo.getSideInputs* from *List<PCollectionView<?>>* to 
> *Map<String, PCollectionView<?>>*, which is a backwards-incompatible change 
> and was erroneously released as part of Beam 2.16.0.
> Running the Apache Nemo Quickstart fails with:
>  
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: Translator private static void org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(org.apache.nemo.compiler.frontend.beam.PipelineTranslationContext,org.apache.beam.sdk.runners.TransformHierarchy$Node,org.apache.beam.sdk.transforms.ParDo$MultiOutput) have failed to translate org.apache.beam.examples.WordCount$ExtractWordsFn@600b9d27
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:113)
>   at org.apache.nemo.compiler.frontend.beam.PipelineVisitor.visitPrimitiveTransform(PipelineVisitor.java:46)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
>   at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
>   at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
>   at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:460)
>   at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:80)
>   at org.apache.nemo.compiler.frontend.beam.NemoRunner.run(NemoRunner.java:31)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
>   at org.apache.beam.examples.WordCount.runWordCount(WordCount.java:185)
>   at org.apache.beam.examples.WordCount.main(WordCount.java:192)
> Caused by: java.lang.reflect.InvocationTargetException
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.translatePrimitive(PipelineTranslator.java:109)
>   ... 14 more
> Caused by: java.lang.NoSuchMethodError: org.apache.beam.sdk.transforms.ParDo$MultiOutput.getSideInputs()Ljava/util/List;
>   at org.apache.nemo.compiler.frontend.beam.PipelineTranslator.parDoMultiOutputTranslator(PipelineTranslator.java:236)
>   ... 19 more
> {code}
>  





[jira] [Created] (BEAM-8998) Avoid excessive bundle progress polling in JRH

2019-12-19 Thread Yichi Zhang (Jira)
Yichi Zhang created BEAM-8998:
-

 Summary: Avoid excessive bundle progress polling in JRH
 Key: BEAM-8998
 URL: https://issues.apache.org/jira/browse/BEAM-8998
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Yichi Zhang


The Dataflow Java runner polls the SDK Harness for bundle progress at a 
0.1-second interval and uses the result to decide whether data delivery 
should be throttled. This can potentially overload the SDK Harness. 

We should try to come up with a way to avoid the throttling and lower the 
bundle progress request frequency significantly.





[jira] [Created] (BEAM-8997) Include Dataflow Job ID in console output

2019-12-19 Thread Kamil Wasilewski (Jira)
Kamil Wasilewski created BEAM-8997:
--

 Summary: Include Dataflow Job ID in console output
 Key: BEAM-8997
 URL: https://issues.apache.org/jira/browse/BEAM-8997
 Project: Beam
  Issue Type: Sub-task
  Components: testing
Reporter: Kamil Wasilewski
 Fix For: Not applicable


We should include the Dataflow Job ID in the console output so that 
individual jobs launched as part of a specific load test suite can be 
investigated.





[jira] [Assigned] (BEAM-8938) Tests end up leaving stale Dataflow jobs in apache-beam-testing project and exhaust GCP resources

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy reassigned BEAM-8938:
---

Assignee: Kamil Wasilewski

> Tests end up leaving stale Dataflow jobs in apache-beam-testing project and 
> exhaust GCP resources
> -
>
> Key: BEAM-8938
> URL: https://issues.apache.org/jira/browse/BEAM-8938
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Assignee: Kamil Wasilewski
>Priority: Blocker
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The following tests do not seem to be killed and keep consuming our 
> resources (this may not be an exhaustive list, but they appear repeatedly 
> in the Dataflow console): 
>   - 
> [test_reshuffle_preserves_timestamps|https://github.com/apache/beam/blob/719b8cc5e51dcd3e98425ecae5ec246657d46eca/sdks/python/apache_beam/transforms/util_test.py#L487]
>  (spotted multiple times in the dataflow console) (Python SDK)
>   - 
> [test_flatten_same_pcollections|https://github.com/apache/beam/blob/44d456830442e5f13b7fd3bd684695e2b69e2c0d/sdks/python/apache_beam/transforms/ptransform_test.py#L596]
>  (Python SDK)
>   - 
> [testPairWithIndexWindowedTimestampedBounded|https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/SplittableDoFnTest.java#L158]
>  (Java SDK)
>  -  
> [testPairWithIndexBasicBounded|https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/SplittableDoFnTest.java#L125]
>  (Java SDK)
>   
>  A temporary solution is to ignore them. A real solution requires further 
> investigation.
> Please see the devlist thread for more context: 
> [https://lists.apache.org/thread.html/01eb33ae9c05d12bb0698f91adc0021662fdfe2978cfdfde28dc56b2%40%3Cdev.beam.apache.org%3E]





[jira] [Commented] (BEAM-8938) Tests end up leaving stale Dataflow jobs in apache-beam-testing project and exhaust GCP resources

2019-12-19 Thread Lukasz Gajowy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000161#comment-17000161
 ] 

Lukasz Gajowy commented on BEAM-8938:
-

[~kamilwu] I assigned you because this requires some care. Feel free to 
unassign/reassign.

> Tests end up leaving stale Dataflow jobs in apache-beam-testing project and 
> exhaust GCP resources
> -
>
> Key: BEAM-8938
> URL: https://issues.apache.org/jira/browse/BEAM-8938
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Assignee: Kamil Wasilewski
>Priority: Blocker
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The following tests do not seem to be killed and keep consuming our 
> resources (this may not be an exhaustive list, but they appear repeatedly 
> in the Dataflow console): 
>   - 
> [test_reshuffle_preserves_timestamps|https://github.com/apache/beam/blob/719b8cc5e51dcd3e98425ecae5ec246657d46eca/sdks/python/apache_beam/transforms/util_test.py#L487]
>  (spotted multiple times in the dataflow console) (Python SDK)
>   - 
> [test_flatten_same_pcollections|https://github.com/apache/beam/blob/44d456830442e5f13b7fd3bd684695e2b69e2c0d/sdks/python/apache_beam/transforms/ptransform_test.py#L596]
>  (Python SDK)
>   - 
> [testPairWithIndexWindowedTimestampedBounded|https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/SplittableDoFnTest.java#L158]
>  (Java SDK)
>  -  
> [testPairWithIndexBasicBounded|https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/SplittableDoFnTest.java#L125]
>  (Java SDK)
>   
>  A temporary solution is to ignore them. A real solution requires further 
> investigation.
> Please see the devlist thread for more context: 
> [https://lists.apache.org/thread.html/01eb33ae9c05d12bb0698f91adc0021662fdfe2978cfdfde28dc56b2%40%3Cdev.beam.apache.org%3E]





[jira] [Commented] (BEAM-8424) Java Dataflow ValidatesRunner tests are timeouting

2019-12-19 Thread Lukasz Gajowy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000158#comment-17000158
 ] 

Lukasz Gajowy commented on BEAM-8424:
-

There are still problem with Java11 tests (timeouting)

> Java Dataflow ValidatesRunner tests are timeouting
> --
>
> Key: BEAM-8424
> URL: https://issues.apache.org/jira/browse/BEAM-8424
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Lukasz Gajowy
>Assignee: Michal Walenia
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Java11_ValidatesRunner_Dataflow/]
> [https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Java11_ValidatesRunner_PortabilityApi_Dataflow/]
> These jobs take longer than the currently set timeout (3h). 
>  
> EDIT: currently, after reopening the issue the timeout is set to 4.5h. 
>  
>  
>  





[jira] [Assigned] (BEAM-8424) Java Dataflow ValidatesRunner tests are timeouting

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy reassigned BEAM-8424:
---

Assignee: Michal Walenia

> Java Dataflow ValidatesRunner tests are timeouting
> --
>
> Key: BEAM-8424
> URL: https://issues.apache.org/jira/browse/BEAM-8424
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Lukasz Gajowy
>Assignee: Michal Walenia
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Java11_ValidatesRunner_Dataflow/]
> [https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Java11_ValidatesRunner_PortabilityApi_Dataflow/]
> These jobs take longer than the currently set timeout (3h). 
>  
> EDIT: currently, after reopening the issue the timeout is set to 4.5h. 
>  
>  
>  





[jira] [Comment Edited] (BEAM-8424) Java Dataflow ValidatesRunner tests are timeouting

2019-12-19 Thread Lukasz Gajowy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000158#comment-17000158
 ] 

Lukasz Gajowy edited comment on BEAM-8424 at 12/19/19 4:00 PM:
---

There are still problems with Java11 tests (timeouting)


was (Author: łukaszg):
There are still problem with Java11 tests (timeouting)

> Java Dataflow ValidatesRunner tests are timeouting
> --
>
> Key: BEAM-8424
> URL: https://issues.apache.org/jira/browse/BEAM-8424
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Lukasz Gajowy
>Assignee: Michal Walenia
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Java11_ValidatesRunner_Dataflow/]
> [https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PostCommit_Java11_ValidatesRunner_PortabilityApi_Dataflow/]
> These jobs take longer than the currently set timeout (3h). 
>  
> EDIT: currently, after reopening the issue the timeout is set to 4.5h. 
>  
>  
>  





[jira] [Closed] (BEAM-7368) Run Python GBK load tests on portable Flink runner

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy closed BEAM-7368.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> Run Python GBK load tests on portable Flink runner
> --
>
> Key: BEAM-7368
> URL: https://issues.apache.org/jira/browse/BEAM-7368
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-7245) Encapsulate supplier, monitor and metric naming logic in some common TestMetric type

2019-12-19 Thread Lukasz Gajowy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000153#comment-17000153
 ] 

Lukasz Gajowy commented on BEAM-7245:
-

[~mwalenia] is this ticket finished?

> Encapsulate supplier, monitor and metric naming logic in some common 
> TestMetric type 
> -
>
> Key: BEAM-7245
> URL: https://issues.apache.org/jira/browse/BEAM-7245
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Minor
>
> After an offline discussion together with @mwalenia we decided to create 
> concrete classes for each metric type (Item_count, byte_count, time). Each 
> class like this will contain: 
> - metric name
> - supplier for the metric 
> - monitor for the metric
> It turns out that all of this (along with the monitor/supplier) can be 
> encapsulated and then attached to the pipeline/metrics reading where needed. 
> This will also encapsulate the naming logic (so that there are no typos 
> again).
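
The idea described above, one object per metric type that carries the name
and knows how to read its value, could look roughly like the following. This
is a language-neutral sketch written in Python. The class and field names
(TestMetric, namespace, supplier) and the "namespace:name" format are
assumptions for illustration and do not reflect the Java test utilities.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class TestMetric:
    """Bundles a metric's name with the function that reads its value."""

    namespace: str
    name: str
    supplier: Callable[[Mapping[str, float]], float]

    @property
    def full_name(self) -> str:
        # The name is built in exactly one place, so call sites cannot
        # mistype it.
        return f"{self.namespace}:{self.name}"

    def read(self, results: Mapping[str, float]) -> float:
        return self.supplier(results)


def item_count(namespace: str) -> TestMetric:
    """Factory for the hypothetical item_count metric type."""
    name = "item_count"
    return TestMetric(namespace, name, lambda r: r[f"{namespace}:{name}"])
```

Callers would then use item_count("write") both when attaching the monitor
and when reading results, instead of repeating string literals.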





[jira] [Comment Edited] (BEAM-7245) Encapsulate supplier, monitor and metric naming logic in some common TestMetric type

2019-12-19 Thread Lukasz Gajowy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000153#comment-17000153
 ] 

Lukasz Gajowy edited comment on BEAM-7245 at 12/19/19 3:58 PM:
---

[~mwalenia] should we close this ticket?


was (Author: łukaszg):
[~mwalenia] is this ticket finished?

> Encapsulate supplier, monitor and metric naming logic in some common 
> TestMetric type 
> -
>
> Key: BEAM-7245
> URL: https://issues.apache.org/jira/browse/BEAM-7245
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Minor
>
> After an offline discussion together with @mwalenia we decided to create 
> concrete classes for each metric type (Item_count, byte_count, time). Each 
> class like this will contain: 
> - metric name
> - supplier for the metric 
> - monitor for the metric
> It turns out that all of this (along with the monitor/supplier) can be 
> encapsulated and then attached to the pipeline/metrics reading where needed. 
> This will also encapsulate the naming logic (so that there are no typos 
> again).





[jira] [Commented] (BEAM-7115) TFRecordIOIT write_time metrics are always 0.0

2019-12-19 Thread Lukasz Gajowy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000151#comment-17000151
 ] 

Lukasz Gajowy commented on BEAM-7115:
-

[~pawel.pasterz] do you think you could take a look?

> TFRecordIOIT write_time metrics are always 0.0
> ---
>
> Key: BEAM-7115
> URL: https://issues.apache.org/jira/browse/BEAM-7115
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Major
>
> Is it because the test is so small, or is the metric not collected correctly?
> This is visible in the dashboards: 
> [https://apache-beam-testing.appspot.com/explore?dashboard=5755685136498688] 
> (look for TFRecordIOIT widget)





[jira] [Commented] (BEAM-6969) Provide way to collect start/end read/write time inside the IOs

2019-12-19 Thread Lukasz Gajowy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000150#comment-17000150
 ] 

Lukasz Gajowy commented on BEAM-6969:
-

[~pawel.pasterz] [~mwalenia] you might find this (historical) ticket 
interesting... :)

> Provide way to collect start/end read/write time inside the IOs
> ---
>
> Key: BEAM-6969
> URL: https://issues.apache.org/jira/browse/BEAM-6969
> Project: Beam
>  Issue Type: Wish
>  Components: io-ideas, testing
>Reporter: Lukasz Gajowy
>Priority: Minor
>
> Currently, IO tests measure time using Metrics API but collect start/end time 
> from ParDo transforms that are adjacent to the IO. It's fine for some tests 
> but maybe could be done better. The drawback of the current solution is that 
> we cannot collect time before PBegin and after PDone. In addition, the 
> time we collect now is still not the exact read/write start/end time but 
> only the time at which the first/last record appeared in the DoFn.
> See: 
> [TimeMonitor.java|https://github.com/apache/beam/blob/957b7cc7746aa626d2eb4dea341f668ec19d5d39/sdks/java/testing/test-utils/src/main/java/org/apache/beam/sdk/testutils/metrics/TimeMonitor.java]
>  as an example of such DoFn.
> Possible solution: save metrics in startBundle / finishBundle method in IOs 
> whenever a dedicated pipelineOption is set to true. 
> In general, maybe it's a good idea to place some other metrics inside IOs 
> too? wdyt?
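
The adjacent-ParDo approach described above amounts to recording the
earliest and latest timestamps at which any element passed through a step.
A minimal stand-alone sketch of that bookkeeping follows. It is plain Python
rather than the Beam Metrics API, and ElementTimeRange is a hypothetical
name, not the TimeMonitor class linked above.

```python
import time


class ElementTimeRange:
    """Tracks when the first and last elements were observed."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self.start = None
        self.end = None

    def observe(self, element):
        now = self._clock()
        if self.start is None or now < self.start:
            self.start = now
        if self.end is None or now > self.end:
            self.end = now
        return element  # pass the element through unchanged

    def elapsed(self):
        """Seconds between the first and last observation (0.0 if none)."""
        if self.start is None:
            return 0.0
        return self.end - self.start
```

As the issue points out, such a range covers only the interval between the
first and last record reaching this step, not the true start and end of the
read or write.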





[jira] [Resolved] (BEAM-6448) Enable SDK log messages in test log output

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy resolved BEAM-6448.
-
Fix Version/s: Not applicable
   Resolution: Fixed

> Enable SDK log messages in test log output 
> ---
>
> Key: BEAM-6448
> URL: https://issues.apache.org/jira/browse/BEAM-6448
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Major
> Fix For: Not applicable
>
>
> Dataflow test logs on Jenkins do not contain SDK log messages. Enabling them 
> would make debugging easier.   
> See this 
> [suggestion|https://github.com/apache/beam/pull/7497#pullrequestreview-192245735]
>  for reference.





[jira] [Closed] (BEAM-6408) beam_Java_LoadTests_GroupByKey_Dataflow_Small timeouts

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy closed BEAM-6408.
---
Fix Version/s: Not applicable
   Resolution: Resolved

> beam_Java_LoadTests_GroupByKey_Dataflow_Small timeouts
> --
>
> Key: BEAM-6408
> URL: https://issues.apache.org/jira/browse/BEAM-6408
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This job starts a load test that lasts 4 hours on Dataflow (on 10 workers). 
> It fails due to a Jenkins timeout. 





[jira] [Closed] (BEAM-6349) Exceptions (IllegalArgumentException or NoClassDefFoundError) when running tests on Dataflow runner

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy closed BEAM-6349.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> Exceptions (IllegalArgumentException or NoClassDefFoundError) when running 
> tests on Dataflow runner
> ---
>
> Key: BEAM-6349
> URL: https://issues.apache.org/jira/browse/BEAM-6349
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Lukasz Gajowy
>Assignee: Craig Chambers
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Running GroupByKeyLoadTest results in the following error on Dataflow runner:
>  
> {code:java}
> java.lang.ExceptionInInitializerError
>   at 
> org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$2.typedApply(IntrinsicMapTaskExecutorFactory.java:344)
>   at 
> org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$2.typedApply(IntrinsicMapTaskExecutorFactory.java:338)
>   at 
> org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
>   at 
> org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
>   at 
> org.apache.beam.runners.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
>   at 
> org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.create(IntrinsicMapTaskExecutorFactory.java:120)
>   at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:337)
>   at 
> org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:291)
>   at 
> org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:135)
>   at 
> org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:115)
>   at 
> org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:102)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: Multiple entries with same 
> key: 
> kind:varint=org.apache.beam.runners.dataflow.util.CloudObjectTranslators$8@39b69c48
>  and 
> kind:varint=org.apache.beam.runners.dataflow.worker.RunnerHarnessCoderCloudObjectTranslatorRegistrar$1@7966f294
>   at 
> org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:136)
>   at 
> org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:100)
>   at 
> org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:86)
>   at 
> org.apache.beam.repackaged.beam_runners_google_cloud_dataflow_java.com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:300)
>   at 
> org.apache.beam.runners.dataflow.util.CloudObjects.populateCloudObjectTranslators(CloudObjects.java:60)
>   at 
> org.apache.beam.runners.dataflow.util.CloudObjects.<clinit>(CloudObjects.java:39)
>   ... 15 more
> {code}
>  
> Example command to run the tests (FWIW, it also runs the "clean" task, 
> although I don't know whether that is necessary):
> {code:java}
> ./gradlew clean :beam-sdks-java-load-tests:run --info 
> -PloadTest.mainClass=org.apache.beam.sdk.loadtests.GroupByKeyLoadTest 
> -Prunner=:beam-runners-google-cloud-dataflow-java 
> '-PloadTest.args=--sourceOptions={"numRecords":1000,"splitPointFrequencyRecords":1,"keySizeBytes":1,"valueSizeBytes":9,"numHotKeys":0,"hotKeyFraction":0,"seed":123456,"bundleSizeDistribution":{"type":"const","const":42},"forceNumInitialBundles":100,"progressShape":"LINEAR","initializeDelayDistribution":{"type":"const","const":42}}
>  
> --stepOptions={"outputRecordsPerInputRecord":1,"preservesInputKeyDistribution":true,"perBundleDelay":1,"perBundleDelayType":"MIXED","cpuUtilizationInMixedDelay":0.5}
>  --fanout=1 --iterations=1 --runner=DataflowRunner'{code}
>  
> After reverting commit bac909b8e237ef8a2ab7e17ac986e5cc90143e5b ([PR: 
> 7351|https://github.com/apache/beam/pull/7351]) I can no longer reproduce 
> this issue.
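
The root cause in the trace above is a strict map builder that rejects two entries registered under the same key ("kind:varint"). The following is a minimal sketch of that failure mode in plain Python, not Beam's or Guava's actual code; `build_strict_map` and the two translator lists are hypothetical names used only for illustration.

```python
def build_strict_map(entries):
    """Build a dict from (key, value) pairs, rejecting duplicate keys."""
    result = {}
    for key, value in entries:
        if key in result:
            # Mirrors the wording of the error message in the stack trace.
            raise ValueError(
                f"Multiple entries with same key: "
                f"{key}={result[key]} and {key}={value}"
            )
        result[key] = value
    return result


# Hypothetical registrars: both register a translator for "kind:varint".
sdk_translators = [("kind:varint", "sdk_translator"), ("kind:pair", "sdk_pair")]
worker_translators = [("kind:varint", "worker_translator")]

try:
    build_strict_map(sdk_translators + worker_translators)
    conflict = None
except ValueError as err:
    conflict = str(err)
```

If two registrars are discovered on the same classpath and both claim a key, a strict builder of this kind fails at initialization time, which would be consistent with the ExceptionInInitializerError reported here.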



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (BEAM-4367) Phrase-triggered job (IOIT) never stopped running.

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy closed BEAM-4367.
---
Fix Version/s: Not applicable
   Resolution: Cannot Reproduce

> Phrase-triggered job (IOIT) never stopped running.
> --
>
> Key: BEAM-4367
> URL: https://issues.apache.org/jira/browse/BEAM-4367
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Major
> Fix For: Not applicable
>
>
> +Steps to reproduce:+
> 1. Define a new Jenkins job and enable both phrase triggering and timer 
> triggering (cron job) for it.
> 2. Type "Run seed job" in a pull request comment to trigger the Jenkins seed 
> job on the PR.
> 3. Type the phrase to trigger the job (e.g. "Run Java ParquetIO Performance 
> Test").
> 4. Run the seed job again (from the master branch).
> +Expected result:+
> The job gets triggered only a few times (once would be best). It never 
> triggers from cron after the seed job from the master branch is run, because 
> there's no job definition on master.
> +Actual result:+
> The job never stops triggering, even though the seed job from the master 
> branch was run many times.
> This happened while developing ParquetIO. The ParquetIO IT kept running 15 
> more times (as of the time of writing this issue) even though the seed job 
> from master should have canceled it. Is it due to the fact that the seed job 
> (from step 2) failed?
> +See:+
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_ParquetIOIT/]
>  
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob/] 
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob/1725/console] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (BEAM-3561) Provide kubernetes cluster instance for IOITs.

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy closed BEAM-3561.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> Provide kubernetes cluster instance for IOITs.
> --
>
> Key: BEAM-3561
> URL: https://issues.apache.org/jira/browse/BEAM-3561
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Major
> Fix For: Not applicable
>
>
> Performance tests that require running Kubernetes scripts currently cannot be 
> run on Jenkins. This is because there is no dedicated Kubernetes cluster for 
> them, so Jenkins jobs cannot set up the needed infrastructure anywhere.
> To allow running such tests we should provide a Kubernetes cluster instance 
> (for example a cluster hosted on GKE) and all the credentials necessary to 
> connect to it from Jenkins executors (a proper kubeconfig file on all 
> Jenkins executors). 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-4041) Performance tests fail due to kubernetes load balancer problems

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy resolved BEAM-4041.
-
Fix Version/s: Not applicable
   Resolution: Fixed

> Performance tests fail due to kubernetes load balancer problems
> ---
>
> Key: BEAM-4041
> URL: https://issues.apache.org/jira/browse/BEAM-4041
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Recently, as we added more IOITs to be run on Jenkins using Kubernetes, some 
> of them started to fail randomly because they couldn't retrieve the 
> LoadBalancer address. Normally, obtaining the address took about one minute. 
> Perfkit waits for the address (actively checking for it) for 3 minutes. This 
> should be enough to get the address, yet it recently started to exceed the 
> 3-minute limit. I also noticed that this error didn't happen when there were 
> fewer tests.
> Example logs:
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PerformanceTests_Compressed_TextIOIT_HDFS/31/console
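
The waiting behaviour described above (actively checking for the address until a 3-minute limit is reached) can be sketched as a poll-with-deadline loop. This is an illustrative sketch only, not Perfkit's implementation; `wait_for_address`, `fetch_address`, and the 5-second interval are assumptions, and only the 180-second limit comes from the issue text.

```python
import time


def wait_for_address(fetch_address, timeout_s=180.0, interval_s=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_address() until it returns a value or the deadline passes.

    fetch_address is a hypothetical callable that returns the LoadBalancer
    address, or None while the address is not yet assigned.
    """
    deadline = clock() + timeout_s
    while True:
        address = fetch_address()
        if address:
            return address
        if clock() >= deadline:
            raise TimeoutError(
                f"LoadBalancer address not available after {timeout_s} seconds"
            )
        sleep(interval_s)
```

With a loop like this, raising `timeout_s` is the simplest mitigation when address assignment slows down under load, at the cost of slower failure when the address never arrives.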



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8940) Load dataflow jobs using java 11 in Java 11 Dataflow tests

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy reassigned BEAM-8940:
---

Assignee: Michal Walenia

> Load dataflow jobs using java 11 in Java 11 Dataflow tests
> --
>
> Key: BEAM-8940
> URL: https://issues.apache.org/jira/browse/BEAM-8940
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Assignee: Michal Walenia
>Priority: Major
>
> Currently, Java 11 tests use only the java11 Docker worker image for 
> verifying Java 11 compatibility. Everything else (artifact staging and job 
> startup) is done using Java 8. These steps should be done with Java 11 as 
> well, since this is how users will do it. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-4420) Add KafkaIO Integration Tests

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy reassigned BEAM-4420:
---

Assignee: (was: Lukasz Gajowy)

> Add KafkaIO Integration Tests
> -
>
> Key: BEAM-4420
> URL: https://issues.apache.org/jira/browse/BEAM-4420
> Project: Beam
>  Issue Type: Test
>  Components: io-java-kafka, testing
>Reporter: Ismaël Mejía
>Priority: Minor
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> It is a good idea to have ITs for KafkaIO.
> There are two possible issues:
> 1. The tests should probably invert the pattern to be readThenWrite given 
> that Unbounded IOs block on Read and ...
> 2. Until we have a way to do PAsserts on Unbounded sources we can rely on 
> withMaxNumRecords to ensure this test ends.
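
The idea in point 2, capping the number of records so that a read from an unbounded source terminates, can be illustrated with a plain Python analogy. This sketch does not use KafkaIO or any Beam API; `unbounded_source` and `read_with_max_num_records` are hypothetical names, and only the notion of a maximum record count is taken from the issue text.

```python
import itertools


def unbounded_source():
    """Yield records forever, like a stream that never signals completion."""
    offset = 0
    while True:
        yield {"offset": offset, "value": f"record-{offset}"}
        offset += 1


def read_with_max_num_records(source, max_num_records):
    """Stop reading after max_num_records so the caller can assert on results."""
    return list(itertools.islice(source, max_num_records))


# Without the cap, list(unbounded_source()) would never return.
records = read_with_max_num_records(unbounded_source(), 5)
```

Once the read is bounded this way, a test can compare the collected records against the expected ones and finish deterministically.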



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8319) Errorprone 0.0.13 fails during JDK11 build

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy reassigned BEAM-8319:
---

Assignee: (was: Lukasz Gajowy)

> Errorprone 0.0.13 fails during JDK11 build
> --
>
> Key: BEAM-8319
> URL: https://issues.apache.org/jira/browse/BEAM-8319
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Lukasz Gajowy
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I'm using openjdk 1.11.02. After switching the version to:
> {code:java}
> javaVersion = 11 {code}
> in BeamModule Plugin and running
> {code:java}
> ./gradlew clean build -p sdks/java/code -xtest {code}
> the build fails. I was able to run errorprone after upgrading it but had 
> problems with a conflicting Guava version. See more here: 
> https://issues.apache.org/jira/browse/BEAM-5085
>  
> Stacktrace:
> {code:java}
> org.gradle.api.tasks.TaskExecutionException: Execution failed for task 
> ':model:pipeline:compileJava'.
> at 
> org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter$2.accept(ExecuteActionsTaskExecuter.java:121)
> at 
> org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter$2.accept(ExecuteActionsTaskExecuter.java:117)
> at org.gradle.internal.Try$Failure.ifSuccessfulOrElse(Try.java:184)
> at 
> org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.execute(ExecuteActionsTaskExecuter.java:110)
> at 
> org.gradle.api.internal.tasks.execution.ResolveIncrementalChangesTaskExecuter.execute(ResolveIncrementalChangesTaskExecuter.java:84)
> at 
> org.gradle.api.internal.tasks.execution.ResolveTaskOutputCachingStateExecuter.execute(ResolveTaskOutputCachingStateExecuter.java:91)
> at 
> org.gradle.api.internal.tasks.execution.FinishSnapshotTaskInputsBuildOperationTaskExecuter.execute(FinishSnapshotTaskInputsBuildOperationTaskExecuter.java:51)
> at 
> org.gradle.api.internal.tasks.execution.ResolveBuildCacheKeyExecuter.execute(ResolveBuildCacheKeyExecuter.java:102)
> at 
> org.gradle.api.internal.tasks.execution.ResolveBeforeExecutionStateTaskExecuter.execute(ResolveBeforeExecutionStateTaskExecuter.java:74)
> at 
> org.gradle.api.internal.tasks.execution.ValidatingTaskExecuter.execute(ValidatingTaskExecuter.java:58)
> at 
> org.gradle.api.internal.tasks.execution.SkipEmptySourceFilesTaskExecuter.execute(SkipEmptySourceFilesTaskExecuter.java:109)
> at 
> org.gradle.api.internal.tasks.execution.ResolveBeforeExecutionOutputsTaskExecuter.execute(ResolveBeforeExecutionOutputsTaskExecuter.java:67)
> at 
> org.gradle.api.internal.tasks.execution.StartSnapshotTaskInputsBuildOperationTaskExecuter.execute(StartSnapshotTaskInputsBuildOperationTaskExecuter.java:52)
> at 
> org.gradle.api.internal.tasks.execution.ResolveAfterPreviousExecutionStateTaskExecuter.execute(ResolveAfterPreviousExecutionStateTaskExecuter.java:46)
> at 
> org.gradle.api.internal.tasks.execution.CleanupStaleOutputsExecuter.execute(CleanupStaleOutputsExecuter.java:93)
> at 
> org.gradle.api.internal.tasks.execution.FinalizePropertiesTaskExecuter.execute(FinalizePropertiesTaskExecuter.java:45)
> at 
> org.gradle.api.internal.tasks.execution.ResolveTaskExecutionModeExecuter.execute(ResolveTaskExecutionModeExecuter.java:94)
> at 
> org.gradle.api.internal.tasks.execution.SkipTaskWithNoActionsExecuter.execute(SkipTaskWithNoActionsExecuter.java:57)
> at 
> org.gradle.api.internal.tasks.execution.SkipOnlyIfTaskExecuter.execute(SkipOnlyIfTaskExecuter.java:56)
> at 
> org.gradle.api.internal.tasks.execution.CatchExceptionTaskExecuter.execute(CatchExceptionTaskExecuter.java:36)
> at 
> org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter$1.executeTask(EventFiringTaskExecuter.java:63)
> at 
> org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter$1.call(EventFiringTaskExecuter.java:49)
> at 
> org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter$1.call(EventFiringTaskExecuter.java:46)
> at 
> org.gradle.internal.operations.DefaultBuildOperationExecutor$CallableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:416)
> at 
> org.gradle.internal.operations.DefaultBuildOperationExecutor$CallableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:406)
> at 
> org.gradle.internal.operations.DefaultBuildOperationExecutor$1.execute(DefaultBuildOperationExecutor.java:165)
> at 
> org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:250)
> at 
> org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:158)
> 

[jira] [Assigned] (BEAM-8940) Load dataflow jobs using java 11 in Java 11 Dataflow tests

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy reassigned BEAM-8940:
---

Assignee: (was: Lukasz Gajowy)

> Load dataflow jobs using java 11 in Java 11 Dataflow tests
> --
>
> Key: BEAM-8940
> URL: https://issues.apache.org/jira/browse/BEAM-8940
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Major
>
> Currently, Java 11 tests use only the java11 Docker worker image for 
> verifying Java 11 compatibility. Everything else (artifact staging and job 
> startup) is done using Java 8. These steps should be done with Java 11 as 
> well, since this is how users will do it. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8559) Run Dataflow Nexmark suites with Java 11

2019-12-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy reassigned BEAM-8559:
---

Assignee: Michal Walenia  (was: Lukasz Gajowy)

> Run Dataflow Nexmark suites with Java 11
> 
>
> Key: BEAM-8559
> URL: https://issues.apache.org/jira/browse/BEAM-8559
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing-nexmark
>Reporter: Lukasz Gajowy
>Assignee: Michal Walenia
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> This task is similar to https://issues.apache.org/jira/browse/BEAM-6936.
> The goal is to run Nexmark suites with Java 11 but compile them with Java 8 
> to verify compatibility. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

