[jira] [Assigned] (BEAM-4042) Get rid of deprecated gradle API
[ https://issues.apache.org/jira/browse/BEAM-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-4042: -- Assignee: Romain Manni-Bucau (was: Davor Bonaci) > Get rid of deprecated gradle API > > > Key: BEAM-4042 > URL: https://issues.apache.org/jira/browse/BEAM-4042 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau >Priority: Minor > > {code} > > Task :beam-model-pipeline:shadowJar > The SimpleWorkResult type has been deprecated and is scheduled to be removed > in Gradle 5.0. Please use WorkResults.didWork() instead. > at > org.gradle.api.internal.tasks.SimpleWorkResult.<init>(SimpleWorkResult.java:34) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.codehaus.groovy.reflection.CachedConstructor.invoke(CachedConstructor.java:83) > at > org.codehaus.groovy.runtime.callsite.ConstructorSite$ConstructorSiteNoUnwrapNoCoerce.callConstructor(ConstructorSite.java:105) > at > org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallConstructor(CallSiteArray.java:60) > at > org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:235) > at > org.codehaus.groovy.runtime.callsite.AbstractCallSite.callConstructor(AbstractCallSite.java:247) > at > com.github.jengelman.gradle.plugins.shadow.tasks.ShadowCopyAction.execute(ShadowCopyAction.groovy:99) > {code} > To keep the build output as close to expected as possible (no exception in the build process when the build is "green"), this kind of stack trace should be fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3852) Migrate existing users to new channel ASF slack channel
[ https://issues.apache.org/jira/browse/BEAM-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405503#comment-16405503 ] Davor Bonaci commented on BEAM-3852: It's fine to leave it assigned until the formal move has happened. We'll likely have to push folks a bit, such as locking channels or other things before everyone actually moves. > Migrate existing users to new channel ASF slack channel > --- > > Key: BEAM-3852 > URL: https://issues.apache.org/jira/browse/BEAM-3852 > Project: Beam > Issue Type: New Feature > Components: website >Reporter: Luke Cwik >Assignee: Romain Manni-Bucau >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > Channel is https://the-asf.slack.com/messages/C9H0YNP3P/ > Created short link for #beam channel directly: > [https://s.apache.org/beam-slack-channel] > Created short link for self-enrollment: https://s.apache.org/slack-invite -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3852) Migrate existing users to new channel ASF slack channel
[ https://issues.apache.org/jira/browse/BEAM-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3852: -- Assignee: Romain Manni-Bucau (was: Davor Bonaci) > Migrate existing users to new channel ASF slack channel > --- > > Key: BEAM-3852 > URL: https://issues.apache.org/jira/browse/BEAM-3852 > Project: Beam > Issue Type: New Feature > Components: website >Reporter: Luke Cwik >Assignee: Romain Manni-Bucau >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > Channel is https://the-asf.slack.com/messages/C9H0YNP3P/ > Created short link for #beam channel directly: > [https://s.apache.org/beam-slack-channel] > Created short link for self-enrollment: https://s.apache.org/slack-invite -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-2339) Jenkins cross JDK version test on Windows
[ https://issues.apache.org/jira/browse/BEAM-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2339: -- Assignee: (was: Davor Bonaci) > Jenkins cross JDK version test on Windows > - > > Key: BEAM-2339 > URL: https://issues.apache.org/jira/browse/BEAM-2339 > Project: Beam > Issue Type: Task > Components: build-system, testing >Reporter: Mark Liu >Priority: Major > > We can set the OS variant to choose Windows for the Jenkins test, which can be > combined with the JDK version test, so that we have cross-OS / cross-JDK > version tests. > This discussion came from > https://github.com/apache/beam/pull/3184#pullrequestreview-39303400 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3852) Migrate existing users to new channel ASF slack channel
[ https://issues.apache.org/jira/browse/BEAM-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3852: -- Assignee: Romain Manni-Bucau (was: Davor Bonaci) > Migrate existing users to new channel ASF slack channel > --- > > Key: BEAM-3852 > URL: https://issues.apache.org/jira/browse/BEAM-3852 > Project: Beam > Issue Type: New Feature > Components: website >Reporter: Luke Cwik >Assignee: Romain Manni-Bucau >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > Channel is https://the-asf.slack.com/messages/C9H0YNP3P/ > Created short link for #beam channel directly: > [https://s.apache.org/beam-slack-channel] > Created short link for self-enrollment: https://s.apache.org/slack-invite -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3845) Avoid calling Class#newInstance
[ https://issues.apache.org/jira/browse/BEAM-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3845: -- Assignee: Ted Yu (was: Davor Bonaci) > Avoid calling Class#newInstance > --- > > Key: BEAM-3845 > URL: https://issues.apache.org/jira/browse/BEAM-3845 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Ted Yu >Assignee: Ted Yu >Priority: Minor > > Class#newInstance is deprecated starting in Java 9 - > https://bugs.openjdk.java.net/browse/JDK-6850612 - because it may throw > undeclared checked exceptions. > The suggested replacement is getDeclaredConstructor().newInstance(), which > wraps the checked exceptions in InvocationTargetException. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
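The replacement described in BEAM-3845 can be sketched as follows. This is a minimal illustration using only the JDK, not code from the Beam repository; the class names NewInstanceExample, Working, and Failing and the helper causeMessageOrNull are hypothetical. It shows that a checked exception thrown by a constructor reaches the caller wrapped in java.lang.reflect.InvocationTargetException when the instance is created through getDeclaredConstructor().newInstance().

```java
import java.lang.reflect.InvocationTargetException;

public class NewInstanceExample {
    /** Hypothetical class whose no-arg constructor throws a checked exception. */
    public static class Failing {
        public Failing() throws Exception {
            throw new Exception("constructor failed");
        }
    }

    /** Hypothetical class that constructs normally. */
    public static class Working {
        public String hello() {
            return "hello";
        }
    }

    /**
     * Instantiates clazz through its no-arg constructor without calling
     * Class#newInstance. Returns null on success, or the message of the
     * exception thrown by the constructor.
     */
    public static String causeMessageOrNull(Class<?> clazz) {
        try {
            clazz.getDeclaredConstructor().newInstance();
            return null;
        } catch (InvocationTargetException e) {
            // The constructor's own exception is available as the cause.
            return e.getCause().getMessage();
        } catch (ReflectiveOperationException e) {
            // No such constructor, abstract class, or inaccessible constructor.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(causeMessageOrNull(Working.class));
        System.out.println(causeMessageOrNull(Failing.class));
    }
}
```

With Class#newInstance, the exception from Failing's constructor would propagate directly even though the method does not declare it; with the replacement, callers handle one declared wrapper type instead.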
[jira] [Updated] (BEAM-3845) Avoid calling Class#newInstance
[ https://issues.apache.org/jira/browse/BEAM-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci updated BEAM-3845: --- Component/s: (was: project-management) sdk-java-core > Avoid calling Class#newInstance > --- > > Key: BEAM-3845 > URL: https://issues.apache.org/jira/browse/BEAM-3845 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Ted Yu >Assignee: Davor Bonaci >Priority: Minor > > Class#newInstance is deprecated starting in Java 9 - > https://bugs.openjdk.java.net/browse/JDK-6850612 - because it may throw > undeclared checked exceptions. > The suggested replacement is getDeclaredConstructor().newInstance(), which > wraps the checked exceptions in InvocationTargetException. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3791) build_rules.gradle version (2.4.0-SNAPSHOT) should match pom.xml (2.5.0-SNAPSHOT)
[ https://issues.apache.org/jira/browse/BEAM-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3791: -- Assignee: Alan Myrvold (was: Davor Bonaci) > build_rules.gradle version (2.4.0-SNAPSHOT) should match pom.xml > (2.5.0-SNAPSHOT) > - > > Key: BEAM-3791 > URL: https://issues.apache.org/jira/browse/BEAM-3791 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Alan Myrvold >Assignee: Alan Myrvold >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > build_rules.gradle has: > version = "2.4.0-SNAPSHOT" > > it should be 2.5.0-SNAPSHOT -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-1754) Will Dataflow ever support Node.js with an SDK similar to Java or Python?
[ https://issues.apache.org/jira/browse/BEAM-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16379755#comment-16379755 ] Davor Bonaci commented on BEAM-1754: You are more than welcome to do so! I'd suggest using the dev@ mailing list for any questions or help that you might need. Unfortunately, we don't have a guide for writing SDKs, but the folks on the mailing list will be very happy to help. One of the first decisions you'll have to make is how to interface with runners and whether to use the new portability framework API. This is also something best discussed on the mailing list. In terms of an example, the Python SDK and Go SDK would be great to follow. Good luck! > Will Dataflow ever support Node.js with an SDK similar to Java or Python? > - > > Key: BEAM-1754 > URL: https://issues.apache.org/jira/browse/BEAM-1754 > Project: Beam > Issue Type: New Feature > Components: sdk-ideas >Reporter: Diego Zuluaga >Priority: Critical > Labels: node.js > > I like the philosophy behind Dataflow and found the Java and Python samples > highly comprehensible. However, I have to admit that for most Node.js > developers who have little background in typed languages and are used to getting > up to speed with frameworks incredibly fast, Dataflow might involve a > learning curve that they/we're not used to. So, I wonder if at any point > in time Dataflow will provide a Node.js SDK. Maybe this is out of the > question, but I wanted to run it by the team as it would be awesome to have > something along these lines! > Thanks, > Diego > Question originally posted on SO: > http://stackoverflow.com/questions/42893436/will-dataflow-ever-support-node-js-with-and-sdk-similar-to-java-or-python -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3697) Add errorprone to maven and gradle builds
[ https://issues.apache.org/jira/browse/BEAM-3697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3697: -- Assignee: (was: Davor Bonaci) > Add errorprone to maven and gradle builds > - > > Key: BEAM-3697 > URL: https://issues.apache.org/jira/browse/BEAM-3697 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Eugene Kirpichov >Priority: Major > > [http://errorprone.info/] is a good static checker that covers a number of > bugs not covered by FindBugs or Checkstyle. We use it internally at Google > and, when run on the Beam codebase, it occasionally uncovers issues missed > during PR review process. > > It has Maven and Gradle plugins: > [http://errorprone.info/docs/installation] > [https://github.com/tbroyer/gradle-errorprone-plugin] > > It would be good to integrate it into our Maven and Gradle builds. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3266) MergeBot bug when regenerating website
[ https://issues.apache.org/jira/browse/BEAM-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3266: -- Assignee: Jason Kuster (was: Davor Bonaci) > MergeBot bug when regenerating website > -- > > Key: BEAM-3266 > URL: https://issues.apache.org/jira/browse/BEAM-3266 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Etienne Chauchot >Assignee: Jason Kuster > > MergeBot seems to regenerate the website incorrectly when a page has moved. For example, > see MergeBot commit 446586c68c1d244d240fe18ee48e69aba4462949: the page > documentation/sdk/nexmark/index.html (old URL) was deleted, but the page > documentation/sdk/java/nexmark/index.html (new URL) was not added, leading to > an HTTP 404. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-998) Consider asking Apache to register Apache Beam trademark
[ https://issues.apache.org/jira/browse/BEAM-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255499#comment-16255499 ] Davor Bonaci commented on BEAM-998: --- US application submitted. > Consider asking Apache to register Apache Beam trademark > > > Key: BEAM-998 > URL: https://issues.apache.org/jira/browse/BEAM-998 > Project: Beam > Issue Type: Task > Components: project-management >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Davor Bonaci > > "Registered Trademarks: If a PMC would like to request legal registration of > their project's trademarks, please follow the > REGREQUEST instructions." > http://www.apache.org/foundation/marks/pmcs#other > The link to REGREQUEST: > http://www.apache.org/foundation/marks/register#register -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-3185) Build blocks on parsing long as int from github status json
[ https://issues.apache.org/jira/browse/BEAM-3185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3185: -- Assignee: Jason Kuster (was: Davor Bonaci) > Build blocks on parsing long as int from github status json > --- > > Key: BEAM-3185 > URL: https://issues.apache.org/jira/browse/BEAM-3185 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: holdenk >Assignee: Jason Kuster >Priority: Blocker > > (e.g. see > https://builds.apache.org/job/beam_PreCommit_Python_MavenInstall/818/console ) > `Caused by: com.fasterxml.jackson.databind.JsonMappingException: Numeric > value (4313677368) out of range of int` > Assuming IDs are monotonically increasing this might impact all new PRs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
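The range problem reported in BEAM-3185 can be reproduced with the JDK alone. The sketch below does not use Jackson and is not the code of the failing build plugin, which the issue does not show; the class name GithubIdRange and the helper fitsInInt are hypothetical. It only illustrates why the value 4313677368 from the error message cannot be held in a Java int and why a long is presumably the appropriate type for such IDs.

```java
public class GithubIdRange {
    /** Returns true if the decimal string fits in a 32-bit signed int. */
    public static boolean fitsInInt(String decimal) {
        long value = Long.parseLong(decimal);
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Value taken from the error message quoted in the issue.
        String id = "4313677368";

        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("fits in int: " + fitsInInt(id));

        try {
            Integer.parseInt(id);
        } catch (NumberFormatException e) {
            // Parsing as int fails because the value exceeds Integer.MAX_VALUE.
            System.out.println("int parse failed: " + e.getMessage());
        }

        // Parsing as long succeeds; Long.MAX_VALUE is about 9.2e18.
        long parsed = Long.parseLong(id);
        System.out.println("long parse ok: " + parsed);
    }
}
```

Once an ID counter passes Integer.MAX_VALUE (2147483647), every later ID is out of range for an int, which matches the concern in the issue that all new PRs might be affected.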
[jira] [Assigned] (BEAM-3164) Capture stderr logs during gen proto
[ https://issues.apache.org/jira/browse/BEAM-3164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3164: -- Assignee: Holden Karau (was: Davor Bonaci) > Capture stderr logs during gen proto > > > Key: BEAM-3164 > URL: https://issues.apache.org/jira/browse/BEAM-3164 > Project: Beam > Issue Type: Bug > Components: build-system, sdk-py-core >Reporter: holdenk >Assignee: Holden Karau > > Currently python PRs are failing with gen-proto failures, but these are > difficult to debug because we don't capture the information (see > https://builds.apache.org/job/beam_PreCommit_Python_MavenInstall/727/console > ). > cc [~altay] [~robertwb] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-3106) Consider not pinning all python dependencies, or moving them to requirements.txt
[ https://issues.apache.org/jira/browse/BEAM-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3106: -- Assignee: Ahmet Altay (was: Davor Bonaci) > Consider not pinning all python dependencies, or moving them to > requirements.txt > > > Key: BEAM-3106 > URL: https://issues.apache.org/jira/browse/BEAM-3106 > Project: Beam > Issue Type: Wish > Components: build-system >Affects Versions: 2.1.0 > Environment: python >Reporter: Maximilian Roos >Assignee: Ahmet Altay > > Currently all python dependencies are [pinned or > capped|https://github.com/apache/beam/blob/master/sdks/python/setup.py#L97]. > While there's a good argument for supplying a `requirements.txt` with well > tested dependencies, having them specified in `setup.py` forces them to an > exact state on each install of Beam. This makes using Beam in any environment > with other libraries nigh on impossible. > This is particularly severe for the `gcp` dependencies, where we have > libraries that won't work with an older version (but Beam _does_ work with a > newer version). We have to do a bunch of gymnastics to get the correct > versions installed because of this. Unfortunately, airflow repeats this > practice and conflicts on a number of dependencies, adding further > complication (but, again, there is no real conflict). > I haven't seen this practice outside of the Apache & Google ecosystem - for > example no libraries in numerical python do this. Here's a [discussion on > SO|https://stackoverflow.com/questions/28509481/should-i-pin-my-python-dependencies-versions] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (BEAM-3086) build failed
[ https://issues.apache.org/jira/browse/BEAM-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci updated BEAM-3086: --- Component/s: (was: build-system) sdk-py-core > build failed > - > > Key: BEAM-3086 > URL: https://issues.apache.org/jira/browse/BEAM-3086 > Project: Beam > Issue Type: Bug > Components: sdk-py-core > Environment: $ python --version > Python 2.7.13 :: Anaconda custom (64-bit) > $ cython --version > Cython version 0.23.4 > $ pip --version > pip 9.0.1 from /home/cui/anaconda2/lib/python2.7/site-packages (python 2.7) > $ javac -version > javac 1.8.0_131 > $ uname -a > Linux hp 4.10.0-37-generic #41-Ubuntu SMP Fri Oct 6 20:20:37 UTC 2017 x86_64 > x86_64 x86_64 GNU/Linux > {code} > $ git log > commit a29e0ad61fc1c3acfc62c31820b137a2beccd313 >Reporter: Qi Cui >Assignee: Ahmet Altay > Labels: build > > Compiling apache_beam/utils/windowed_value.py because it changed. > [ 1/11] Cythonizing apache_beam/coders/coder_impl.py > [ 2/11] Cythonizing apache_beam/coders/stream.pyx > [ 3/11] Cythonizing apache_beam/metrics/execution.py > [ 4/11] Cythonizing apache_beam/runners/common.py > [ 5/11] Cythonizing apache_beam/runners/worker/logger.py > [ 6/11] Cythonizing apache_beam/runneTraceback (most recent call last): > File "setup.py", line 173, in > rs/worker/opcounters.py > [ 7/11] Cythonizing apache_beam/runners/worker/operations.py > 'apache_beam/utils/windowed_value.py', > File > "/home/cui/anaconda2/lib/python2.7/site-packages/Cython/Build/Dependencies.py", > line 877, in cythonize > cythonize_one(*args) > File > "/home/cui/anaconda2/lib/python2.7/site-packages/Cython/Build/Dependencies.py", > line 997, in cythonize_one > raise CompileError(None, pyx_file) > Cython.Compiler.Errors.CompileError: apache_beam/runners/worker/operations.py > [ERROR] Command execution failed. 
> org.apache.commons.exec.ExecuteException: Process exited with an error: 1 > (Exit value: 1) > at > org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404) > at > org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166) > at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:764) > at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:711) > at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:289) > at > org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) > at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116) > at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80) > at > org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51) > at > org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128) > at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307) > at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193) > at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106) > at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863) > at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288) > at org.apache.maven.cli.MavenCli.main(MavenCli.java:199) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289) > at > org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229) > at > org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415) > at > org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356) > [INFO] > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-3086) build failed
[ https://issues.apache.org/jira/browse/BEAM-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-3086: -- Assignee: Ahmet Altay (was: Davor Bonaci) > build failed > - > > Key: BEAM-3086 > URL: https://issues.apache.org/jira/browse/BEAM-3086 > Project: Beam > Issue Type: Bug > Components: sdk-py-core > Environment: $ python --version > Python 2.7.13 :: Anaconda custom (64-bit) > $ cython --version > Cython version 0.23.4 > $ pip --version > pip 9.0.1 from /home/cui/anaconda2/lib/python2.7/site-packages (python 2.7) > $ javac -version > javac 1.8.0_131 > $ uname -a > Linux hp 4.10.0-37-generic #41-Ubuntu SMP Fri Oct 6 20:20:37 UTC 2017 x86_64 > x86_64 x86_64 GNU/Linux > {code} > $ git log > commit a29e0ad61fc1c3acfc62c31820b137a2beccd313 >Reporter: Qi Cui >Assignee: Ahmet Altay > Labels: build > > Compiling apache_beam/utils/windowed_value.py because it changed. > [ 1/11] Cythonizing apache_beam/coders/coder_impl.py > [ 2/11] Cythonizing apache_beam/coders/stream.pyx > [ 3/11] Cythonizing apache_beam/metrics/execution.py > [ 4/11] Cythonizing apache_beam/runners/common.py > [ 5/11] Cythonizing apache_beam/runners/worker/logger.py > [ 6/11] Cythonizing apache_beam/runneTraceback (most recent call last): > File "setup.py", line 173, in > rs/worker/opcounters.py > [ 7/11] Cythonizing apache_beam/runners/worker/operations.py > 'apache_beam/utils/windowed_value.py', > File > "/home/cui/anaconda2/lib/python2.7/site-packages/Cython/Build/Dependencies.py", > line 877, in cythonize > cythonize_one(*args) > File > "/home/cui/anaconda2/lib/python2.7/site-packages/Cython/Build/Dependencies.py", > line 997, in cythonize_one > raise CompileError(None, pyx_file) > Cython.Compiler.Errors.CompileError: apache_beam/runners/worker/operations.py > [ERROR] Command execution failed. 
> org.apache.commons.exec.ExecuteException: Process exited with an error: 1 > (Exit value: 1) > at > org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404) > at > org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166) > at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:764) > at org.codehaus.mojo.exec.ExecMojo.executeCommandLine(ExecMojo.java:711) > at org.codehaus.mojo.exec.ExecMojo.execute(ExecMojo.java:289) > at > org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) > at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) > at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116) > at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80) > at > org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51) > at > org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128) > at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307) > at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193) > at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106) > at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863) > at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288) > at org.apache.maven.cli.MavenCli.main(MavenCli.java:199) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289) > at > org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229) > at > org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415) > at > org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356) > [INFO] > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2802) TextIO should allow specifying a custom delimiter
[ https://issues.apache.org/jira/browse/BEAM-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183062#comment-16183062 ] Davor Bonaci commented on BEAM-2802: [~ryanskraba] and [~echauchot] -- thank you both for making this happen and pushing it through! > TextIO should allow specifying a custom delimiter > - > > Key: BEAM-2802 > URL: https://issues.apache.org/jira/browse/BEAM-2802 > Project: Beam > Issue Type: New Feature > Components: sdk-java-extensions >Reporter: Etienne Chauchot >Assignee: Etienne Chauchot >Priority: Minor > Fix For: 2.2.0 > > > Currently TextIO uses {{\r}}, {{\n}}, {{\r\n}}, or a mix of them to split a > text file into PCollection elements. It might happen that a record is spread > across more than one line. In that case we should be able to specify a custom > record delimiter to be used in place of the default ones. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
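The idea behind BEAM-2802 can be illustrated without Beam. The sketch below is plain Java and is not the TextIO implementation or its API; the class name DelimitedRecords and the method split are hypothetical. It shows how a custom record delimiter lets one record span several lines, which splitting on line endings cannot do.

```java
import java.util.ArrayList;
import java.util.List;

public class DelimitedRecords {
    /**
     * Splits text into records on a custom delimiter. Line breaks inside a
     * record are preserved; a trailing delimiter does not produce an empty
     * final record.
     */
    public static List<String> split(String text, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> records = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = text.indexOf(delimiter, start)) >= 0) {
            records.add(text.substring(start, idx));
            start = idx + delimiter.length();
        }
        if (start < text.length()) {
            // Remaining text after the last delimiter is the final record.
            records.add(text.substring(start));
        }
        return records;
    }

    public static void main(String[] args) {
        // Two records, each spread across two lines, separated by "|".
        String text = "a\nb|c\nd|";
        for (String record : split(text, "|")) {
            System.out.println("[" + record.replace("\n", "\\n") + "]");
        }
    }
}
```

Splitting the same input on line endings would yield the pieces "a", "b|c", and "d|", cutting each record in the wrong place.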
[jira] [Commented] (BEAM-998) Consider asking Apache to register Apache Beam trademark
[ https://issues.apache.org/jira/browse/BEAM-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140717#comment-16140717 ] Davor Bonaci commented on BEAM-998: --- Re-pinged. > Consider asking Apache to register Apache Beam trademark > > > Key: BEAM-998 > URL: https://issues.apache.org/jira/browse/BEAM-998 > Project: Beam > Issue Type: Task > Components: project-management >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Davor Bonaci > > "Registered Trademarks: If a PMC would like to request legal registration of > their project's trademarks, please follow the > REGREQUEST instructions." > http://www.apache.org/foundation/marks/pmcs#other > The link to REGREQUEST: > http://www.apache.org/foundation/marks/register#register -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1457) Enable rat plugin and findbugs plugin in default build
[ https://issues.apache.org/jira/browse/BEAM-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1457: -- Assignee: (was: Davor Bonaci) > Enable rat plugin and findbugs plugin in default build > -- > > Key: BEAM-1457 > URL: https://issues.apache.org/jira/browse/BEAM-1457 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Aviem Zur > > Today, the Maven rat plugin and findbugs plugin only run when the `release` profile > is specified. > Since these plugins do not add a large amount of time compared to the normal > build, and their checks are required to pass to approve pull requests - let's > enable them by default. > [Original dev list > discussion|https://lists.apache.org/thread.html/e1f80e54b44b4a39630d978abe79fb6a6cecf71d9821ee1881b47afb@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1975) Documentation around logging in different runners
[ https://issues.apache.org/jira/browse/BEAM-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1975: -- Assignee: (was: Davor Bonaci) > Documentation around logging in different runners > - > > Key: BEAM-1975 > URL: https://issues.apache.org/jira/browse/BEAM-1975 > Project: Beam > Issue Type: Sub-task > Components: website >Reporter: Aviem Zur > > Add documentation on how to configure logging in different runners, relate to > SLF4J and bindings, and which binding is used in which runner. > Add helpful links to the different logging configuration guides for the > bindings used in each runner. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1189) Add guide for release verifiers in the release guide
[ https://issues.apache.org/jira/browse/BEAM-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1189: -- Assignee: (was: Davor Bonaci) > Add guide for release verifiers in the release guide > > > Key: BEAM-1189 > URL: https://issues.apache.org/jira/browse/BEAM-1189 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Kenneth Knowles > > This came up during the 0.4.0-incubating release discussion. > There is this checklist: > http://incubator.apache.org/guides/releasemanagement.html#check-list > And we could point to that but make more detailed Beam-specific instructions > on > http://beam.apache.org/contribute/release-guide/#vote-on-the-release-candidate > And the template for the vote email should include a link to suggested > verification steps. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1974: -- Assignee: (was: Davor Bonaci) > Metrics documentation > - > > Key: BEAM-1974 > URL: https://issues.apache.org/jira/browse/BEAM-1974 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Aviem Zur > > Document metrics API and uses (make sure to remark that it is still > experimental). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1929) I/O Authoring Overview should discuss when to use source/Pardo/IOChannelFactory
[ https://issues.apache.org/jira/browse/BEAM-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1929: -- Assignee: Chamikara Jayalath (was: Davor Bonaci) > I/O Authoring Overview should discuss when to use > source/Pardo/IOChannelFactory > --- > > Key: BEAM-1929 > URL: https://issues.apache.org/jira/browse/BEAM-1929 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Stephen Sisk >Assignee: Chamikara Jayalath > > In a recent discussion on the mailing list [1] it came up that it's not > always clear when reading from files with various file formats what exactly > is the right way to do so. > Key quote: "To contribute a new IO connector, how can I determine whether it > should be implemented as a source transform or as a scheme for the TextIO?" > cc [~jbonofre] [~dhalp...@google.com] > [1] > https://lists.apache.org/thread.html/16188ab68e738846c1620552075ff18b90fa391a7210871e9a04778d@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1961) Standard IO metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1961: -- Assignee: (was: Davor Bonaci) > Standard IO metrics documentation > - > > Key: BEAM-1961 > URL: https://issues.apache.org/jira/browse/BEAM-1961 > Project: Beam > Issue Type: Sub-task > Components: website >Reporter: Aviem Zur > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2082) I/O Authoring overview - emphasize reading the PTransform style guide
[ https://issues.apache.org/jira/browse/BEAM-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2082: -- Assignee: Chamikara Jayalath (was: Davor Bonaci) > I/O Authoring overview - emphasize reading the PTransform style guide > - > > Key: BEAM-2082 > URL: https://issues.apache.org/jira/browse/BEAM-2082 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Stephen Sisk >Assignee: Chamikara Jayalath >Priority: Minor > > Currently, the I/O Authoring style guide mentions the PTransform style guide, > but I think it underplays the value - I'd like to emphasize it a bit more > (probably make it its own section or at least make it the first bullet point :) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1957) Missing DoFn annotations documentation
[ https://issues.apache.org/jira/browse/BEAM-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1957: -- Assignee: Kenneth Knowles (was: Davor Bonaci) > Missing DoFn annotations documentation > -- > > Key: BEAM-1957 > URL: https://issues.apache.org/jira/browse/BEAM-1957 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Aviem Zur >Assignee: Kenneth Knowles > > Not all {{DoFn}} annotations are covered by the programming guide. > Only {{@ProcessElement}} is currently covered. > We should have documentation for the other (non-experimental at least) > annotations: > {code} > public @interface Setup > public @interface StartBundle > public @interface FinishBundle > public @interface Teardown > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2634) Add better documentation on testing unbounded I/O scenarios
[ https://issues.apache.org/jira/browse/BEAM-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2634: -- Assignee: (was: Davor Bonaci) > Add better documentation on testing unbounded I/O scenarios > --- > > Key: BEAM-2634 > URL: https://issues.apache.org/jira/browse/BEAM-2634 > Project: Beam > Issue Type: Sub-task > Components: website >Reporter: Stephen Sisk > > The currently planned unit test & integration test docs will mostly cover > bounded I/O transforms - we'll need to add documentation on testing unbounded > I/O -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2358) "/test-your-pipeline" example code results in an exception
[ https://issues.apache.org/jira/browse/BEAM-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2358: -- Assignee: Jason Kuster (was: Davor Bonaci) > "/test-your-pipeline" example code results in an exception > -- > > Key: BEAM-2358 > URL: https://issues.apache.org/jira/browse/BEAM-2358 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Nicholas Ursa >Assignee: Jason Kuster > Labels: documentation, easyfix > Original Estimate: 2h > Remaining Estimate: 2h > > https://beam.apache.org/documentation/pipelines/test-your-pipeline/ has > {code} > public void testCountWords() throws Exception { > Pipeline p = TestPipeline.create(); > {code} > but this results in > {code} > Exception in thread "main" java.lang.IllegalStateException: Is your > TestPipeline declaration missing a @Rule annotation? Usage: @Rule public > final transient TestPipeline pipeline = TestPipeline.Create(); > at > org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:444) > at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:299) > at BasicPipelineTest.run(BasicPipelineTest.java:42) > at Main.main(Main.java:25) > {code} > In the [github > example|https://github.com/apache/beam/blob/master/examples/java8/src/test/java/org/apache/beam/examples/MinimalWordCountJava8Test.java#L56] > it's written as: > {code} > public TestPipeline p = > TestPipeline.create().enableAbandonedNodeEnforcement(false); > {code} > I'm using 2.0.0 from the maven repo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-686) Integration tests should create/destroy resources required for running
[ https://issues.apache.org/jira/browse/BEAM-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-686: - Assignee: (was: Davor Bonaci) > Integration tests should create/destroy resources required for running > -- > > Key: BEAM-686 > URL: https://issues.apache.org/jira/browse/BEAM-686 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Luke Cwik >Priority: Minor > > By not creating/tearing down resources in integration tests, the tests are > not portable or easily runnable by a user who doesn't have access to all the > pre-created test artifacts. > For example BigtableReadIT assumes that it is executing in a project which > has a specifically named Bigtable instance preloaded with testdata. > Also, KinesisReaderIT assumes an already created Kinesis topic exists. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-601) Enable Kinesis integration tests
[ https://issues.apache.org/jira/browse/BEAM-601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-601: - Assignee: Chamikara Jayalath (was: Davor Bonaci) > Enable Kinesis integration tests > > > Key: BEAM-601 > URL: https://issues.apache.org/jira/browse/BEAM-601 > Project: Beam > Issue Type: Improvement > Components: testing >Affects Versions: 0.3.0-incubating >Reporter: Przemyslaw Pastuszka >Assignee: Chamikara Jayalath > > There's an integration test for KinesisIO called KinesisReaderIT, but it is > currently ignored, because it needs real Kinesis instance setup. > As part of this task please: > * setup real Kinesis environment on AWS for testing purposes > * enable KinesisReaderIT test > * setup jenkins, so that it passes all KinesisTestOptions when running > integration tests > This is a follow up to BEAM-461 requested by [~dhalp...@google.com] in > https://github.com/apache/incubator-beam/pull/687/ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1605) Add support for Apex cluster metrics to PerfKit Benchmarker
[ https://issues.apache.org/jira/browse/BEAM-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1605: -- Assignee: Jason Kuster (was: Davor Bonaci) > Add support for Apex cluster metrics to PerfKit Benchmarker > --- > > Key: BEAM-1605 > URL: https://issues.apache.org/jira/browse/BEAM-1605 > Project: Beam > Issue Type: Bug > Components: runner-apex, testing >Reporter: Jason Kuster >Assignee: Jason Kuster > > See > https://docs.google.com/document/d/1PsjGPSN6FuorEEPrKEP3u3m16tyOzph5FnL2DhaRDz0/edit?ts=58a78e73#heading=h.exn0s6jsm24q > for more details on what this entails. > Blocked on BEAM-1599, adding support for Apex to PKB -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-910) Always explicitly specify --output for ITs
[ https://issues.apache.org/jira/browse/BEAM-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-910: - Assignee: (was: Davor Bonaci) > Always explicitly specify --output for ITs > -- > > Key: BEAM-910 > URL: https://issues.apache.org/jira/browse/BEAM-910 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Kenneth Knowles > > With a revamp of the examples in > [#1275|https://github.com/apache/incubator-beam/pull/1275] the {{--output}} > parameter becomes mandatory rather than defaulting to the temp location (this > was weird anyhow). > The ITs all fail because they leave this value implicit, as in > https://builds.apache.org/job/beam_PreCommit_MavenVerify/4635/org.apache.beam$beam-examples-java/testReport/junit/org.apache.beam.examples/WindowedWordCountIT/testWindowedWordCountInBatch/ > I believe this requires Jenkins access. I could be wrong. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2650) migrate beam_it_args -> beam_it_options
[ https://issues.apache.org/jira/browse/BEAM-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2650: -- Assignee: Chamikara Jayalath > migrate beam_it_args -> beam_it_options > --- > > Key: BEAM-2650 > URL: https://issues.apache.org/jira/browse/BEAM-2650 > Project: Beam > Issue Type: Task > Components: testing >Reporter: Stephen Sisk >Assignee: Chamikara Jayalath > > When adding the mvn -> pkb -> mvn integration for the IO IT's usage of PKB, I > noticed that beam_it_args had two problems: > 1) args had a format for passing options that was sub-optimal > 2) the thing we are working with is options, not args, so it's mis-named. > It was important to solve #1, but you can't just change the name since it'd > break with the currently checked in jenkins job. So I needed to migrate away > from it, and #2 presented an easy opportunity to do so, so I added > beam_it_options as the new option. > We should remove usages of beam_it_args and migrate over to only > beam_it_options, then remove beam_it_args from pkb -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1654) Tests that UnboundedSources are executed correctly
[ https://issues.apache.org/jira/browse/BEAM-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1654: -- Assignee: (was: Davor Bonaci) > Tests that UnboundedSources are executed correctly > -- > > Key: BEAM-1654 > URL: https://issues.apache.org/jira/browse/BEAM-1654 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Ben Chambers > > Specifically, develop a set of RunnableOnService tests that validate runner > behavior when executing an Unbounded Source. Validations should include > behaviors such as finalizeCheckpoint being called at most once, etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2650) migrate beam_it_args -> beam_it_options
[ https://issues.apache.org/jira/browse/BEAM-2650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2650: -- Assignee: (was: Davor Bonaci) > migrate beam_it_args -> beam_it_options > --- > > Key: BEAM-2650 > URL: https://issues.apache.org/jira/browse/BEAM-2650 > Project: Beam > Issue Type: Task > Components: testing >Reporter: Stephen Sisk > > When adding the mvn -> pkb -> mvn integration for the IO IT's usage of PKB, I > noticed that beam_it_args had two problems: > 1) args had a format for passing options that was sub-optimal > 2) the thing we are working with is options, not args, so it's mis-named. > It was important to solve #1, but you can't just change the name since it'd > break with the currently checked in jenkins job. So I needed to migrate away > from it, and #2 presented an easy opportunity to do so, so I added > beam_it_options as the new option. > We should remove usages of beam_it_args and migrate over to only > beam_it_options, then remove beam_it_args from pkb -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2121) Tests on MacOS
[ https://issues.apache.org/jira/browse/BEAM-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2121: -- Assignee: (was: Davor Bonaci) > Tests on MacOS > -- > > Key: BEAM-2121 > URL: https://issues.apache.org/jira/browse/BEAM-2121 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Ahmet Altay > > After removal of Travis testing, we lost the ability to test on Macs. I am > wondering if this is possible on Jenkins. A simple web search for "cloud > macos" shows many promising results. > cc: [~davor] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2220) Move org.apache.beam.sdk.util within google-cloud-platform-core to org.apache.beam.runners.dataflow
[ https://issues.apache.org/jira/browse/BEAM-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2220: -- Assignee: Luke Cwik (was: Davor Bonaci) > Move org.apache.beam.sdk.util within google-cloud-platform-core to > org.apache.beam.runners.dataflow > --- > > Key: BEAM-2220 > URL: https://issues.apache.org/jira/browse/BEAM-2220 > Project: Beam > Issue Type: Improvement > Components: sdk-java-gcp >Reporter: Luke Cwik >Assignee: Luke Cwik > > Move org.apache.beam.sdk.util within google-cloud-platform-core to underneath > org.apache.beam.runners.dataflow -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1609) Add support for Beam Metrics API to PKB
[ https://issues.apache.org/jira/browse/BEAM-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1609: -- Assignee: Jason Kuster (was: Davor Bonaci) > Add support for Beam Metrics API to PKB > --- > > Key: BEAM-1609 > URL: https://issues.apache.org/jira/browse/BEAM-1609 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Jason Kuster >Assignee: Jason Kuster > > See > https://docs.google.com/document/d/1PsjGPSN6FuorEEPrKEP3u3m16tyOzph5FnL2DhaRDz0/edit?ts=58a78e73#heading=h.exn0s6jsm24q > for more details on what this entails. > Blocked on BEAM-147 -- creation of metrics API. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1271) Develop Apache Accumulo IO
[ https://issues.apache.org/jira/browse/BEAM-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1271: -- Assignee: Reuven Lax (was: Davor Bonaci) > Develop Apache Accumulo IO > -- > > Key: BEAM-1271 > URL: https://issues.apache.org/jira/browse/BEAM-1271 > Project: Beam > Issue Type: New Feature > Components: sdk-java-extensions >Affects Versions: Not applicable >Reporter: Wyatt Frelot >Assignee: Reuven Lax >Priority: Minor > Original Estimate: 672h > Remaining Estimate: 672h > > Develop the Apache Accumulo IO for write and read operations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1271) Develop Apache Accumulo IO
[ https://issues.apache.org/jira/browse/BEAM-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1271: -- Assignee: Wyatt Frelot (was: Reuven Lax) > Develop Apache Accumulo IO > -- > > Key: BEAM-1271 > URL: https://issues.apache.org/jira/browse/BEAM-1271 > Project: Beam > Issue Type: New Feature > Components: sdk-java-extensions >Affects Versions: Not applicable >Reporter: Wyatt Frelot >Assignee: Wyatt Frelot >Priority: Minor > Original Estimate: 672h > Remaining Estimate: 672h > > Develop the Apache Accumulo IO for write and read operations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1305) Support additional configuration in BigQueryServices.insertAll()
[ https://issues.apache.org/jira/browse/BEAM-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1305: -- Assignee: (was: Davor Bonaci) > Support additional configuration in BigQueryServices.insertAll() > > > Key: BEAM-1305 > URL: https://issues.apache.org/jira/browse/BEAM-1305 > Project: Beam > Issue Type: Improvement > Components: sdk-java-extensions >Reporter: Pei He > > ignoreUnknownValues is requested in > https://issues.apache.org/jira/browse/BEAM-1267 > There are additional configurations that could be useful. > TableDataInsertAllRequest content = new TableDataInsertAllRequest(); > content.setSkipInvalidRows(); > content.setTemplateSuffix(); > content.setKind(); > > I think we can improve the BigQueryServices interface by defining it as: > void insertAll(TableReference ref, Collection > request); > and provide a static method to prepare requests: > List makeInsertBatches(List rowList, > @Nullable List insertIdList); > Then the client can set additional config in the returned list. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2752) Job fails to checkpoint with kinesis stream as an input for Flink job
[ https://issues.apache.org/jira/browse/BEAM-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2752: -- Assignee: Chamikara Jayalath (was: Davor Bonaci) > Job fails to checkpoint with kinesis stream as an input for Flink job > - > > Key: BEAM-2752 > URL: https://issues.apache.org/jira/browse/BEAM-2752 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Affects Versions: 2.0.0 >Reporter: Pawel Bartoszek >Assignee: Chamikara Jayalath >Priority: Minor > > Our job reads from a Kinesis stream as its input. Quite often, when the > job checkpoints for the first time, the following exception is thrown: > The scenario that produces the exception: > # Upload a new jar file with job logic > # Start new job > # Stop the job with savepoint that is written to s3 > # Upload a new jar file with job logic (in this case the jar contains the same > code - but our pipeline generates new jar file name for every build) > # Start a new job from savepoint > # The first checkpoint fails causing the job to be cancelled > If the job is started without passing savepoint the checkpointing works fine. 
> Other information: > Flink version 1.2.1 > Beam 2.0.0 > Flink Parallelism - 20 slots > Number of task managers - 4 > Number of kinesis shards - 8 > {code:java} > java.lang.Exception: Error while triggering checkpoint 59 for Source: > Read(KinesisSource) -> Flat Map -> ParMultiDo(KinesisExtractor) -> Flat Map > -> ParMultiDo(StringToRecord) -> Flat Map -> ParMultiDo(Anonymous) -> Flat > Map -> ParMultiDo(ToRRecord) -> Flat Map -> ParMultiDo(AddTimestamps) -> Flat > Map -> ..GroupByOneMinuteWindow GROUP RDOTRECORDS BY ONE MINUTE > WINDOWS/Window.Assign.out -> (ParMultiDo(Anonymous) -> Flat Map -> > ParMultiDo(ToSomeKey) -> Flat Map -> ToKeyedWorkItem, > ParMultiDo(ToCompositeKey) -> Flat Map -> ParMultiDo(Anonymous) -> Flat Map > -> ToKeyedWorkItem, ParMultiDo(Anonymous) -> Flat Map -> > ParMultiDo(ApplyShardingKey) -> Flat Map -> ToKeyedWorkItem) (1/20) > at org.apache.flink.runtime.taskmanager.Task$3.run(Task.java:1136) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.Exception: Could not perform checkpoint 59 for operator > Source: Read(KinesisSource) -> Flat Map -> ParMultiDo(KinesisExtractor) -> > Flat Map -> ParMultiDo(StringToRecord) -> Flat Map -> ParMultiDo(Anonymous) > -> Flat Map -> ParMultiDo(ToRRecord) -> Flat Map -> ParMultiDo(AddTimestamps) > -> Flat Map -> ..GroupByOneMinuteWindow GROUP RDOTRECORDS BY ONE > MINUTE WINDOWS/Window.Assign.out -> (ParMultiDo(Anonymous) -> Flat Map -> > ParMultiDo(ToSomeKey) -> Flat Map -> ToKeyedWorkItem, > ParMultiDo(ToCompositeKey) -> Flat Map -> ParMultiDo(Anonymous) -> Flat Map > -> ToKeyedWorkItem, ParMultiDo(Anonymous) -> Flat Map -> > ParMultiDo(ApplyShardingKey) -> Flat Map -> 
ToKeyedWorkItem) (1/20). > at > org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpoint(StreamTask.java:524) > at org.apache.flink.runtime.taskmanager.Task$3.run(Task.java:1125) > ... 5 more > Caused by: java.lang.Exception: Could not complete snapshot 59 for operator > Source: Read(KinesisSource) -> Flat Map -> ParMultiDo(KinesisExtractor) -> > Flat Map -> ParMultiDo(StringToRecord) -> Flat Map -> ParMultiDo(Anonymous) > -> Flat Map -> ParMultiDo(ToRRecord) -> Flat Map -> ParMultiDo(AddTimestamps) > -> Flat Map -> ..GroupByOneMinuteWindow GROUP RDOTRECORDS BY ONE > MINUTE WINDOWS/Window.Assign.out -> (ParMultiDo(Anonymous) -> Flat Map -> > ParMultiDo(ToSomeKey) -> Flat Map -> ToKeyedWorkItem, > ParMultiDo(ToCompositeKey) -> Flat Map -> ParMultiDo(Anonymous) -> Flat Map > -> ToKeyedWorkItem, ParMultiDo(Anonymous) -> Flat Map -> > ParMultiDo(ApplyShardingKey) -> Flat Map -> ToKeyedWorkItem) (1/20). > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:379) > at > org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1157) > at > org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1090) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:630) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:575)
[jira] [Assigned] (BEAM-1664) Support Kafka0.8.x client in KafkaIO
[ https://issues.apache.org/jira/browse/BEAM-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1664: -- Assignee: Raghu Angadi (was: Davor Bonaci) > Support Kafka0.8.x client in KafkaIO > - > > Key: BEAM-1664 > URL: https://issues.apache.org/jira/browse/BEAM-1664 > Project: Beam > Issue Type: Improvement > Components: sdk-java-extensions >Reporter: JiJun Tang >Assignee: Raghu Angadi > > Kafka 0.8 is not supported yet, and there's a big change from 0.8 to 0.9, so we > need to create a specific KafkaIO module for 0.8. After completing this > module, we will consider extracting common code into a kafkaio-common module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2639) Unbounded Source for MongoDB
[ https://issues.apache.org/jira/browse/BEAM-2639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2639: -- Assignee: Jean-Baptiste Onofré (was: Davor Bonaci) > Unbounded Source for MongoDB > > > Key: BEAM-2639 > URL: https://issues.apache.org/jira/browse/BEAM-2639 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Affects Versions: 2.0.0 >Reporter: nevi_me >Assignee: Jean-Baptiste Onofré >Priority: Minor > > The current MongoDB source is bounded, which means that we can't build > streaming pipelines directly from MongoDB. > MongoDB publishes changes in each collection through the oplog. Would it be > possible to create a connector that reads the oplog to create an unbounded > source? > As an oplog is only available through replication, this creates that > dependency. We would need to also consider whether a polling method (using > the ObjectId) could be an appropriate fallback. > Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1879) PTransform style guide should discuss display data
[ https://issues.apache.org/jira/browse/BEAM-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1879: -- Assignee: Eugene Kirpichov (was: Davor Bonaci) > PTransform style guide should discuss display data > -- > > Key: BEAM-1879 > URL: https://issues.apache.org/jira/browse/BEAM-1879 > Project: Beam > Issue Type: Improvement > Components: sdk-java-extensions >Reporter: Stephen Sisk >Assignee: Eugene Kirpichov > > Currently, the PTransform style guide > (https://beam.apache.org/contribute/ptransform-style-guide/) does not discuss > display data at all. > We should make sure to discuss testing display data - specifically that > using DisplayDataEvaluator is a best practice for testing since without it, > you cannot tell whether or not the display data will actually be displayed. > cc [~swegner] [~bchambers] [~jkff] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2582) KinesisIO incorrectly handles closed shards
[ https://issues.apache.org/jira/browse/BEAM-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2582: -- Assignee: Chamikara Jayalath (was: Davor Bonaci) > KinesisIO incorrectly handles closed shards > --- > > Key: BEAM-2582 > URL: https://issues.apache.org/jira/browse/BEAM-2582 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Reporter: adam gray >Assignee: Chamikara Jayalath > > The KinesisIO throws an exception when consuming closed Kinesis shards, which > return null from `GetShardIterator`, as it tries to call `GetRecords` with > the null `shardIterator` value instead of abandoning the closed shard. > This means KinesisIO fails after re-sharding a stream with an exception like > the following: > {noformat} > Exception in thread "main" java.lang.RuntimeException: Kinesis client side > failure > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:151) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getRecords(SimplifiedKinesisClient.java:115) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getRecords(SimplifiedKinesisClient.java:102) > at > org.apache.beam.sdk.io.kinesis.ShardRecordsIterator.readMoreIfNecessary(ShardRecordsIterator.java:79) > at > org.apache.beam.sdk.io.kinesis.ShardRecordsIterator.next(ShardRecordsIterator.java:64) > at > org.apache.beam.sdk.io.kinesis.KinesisReader.advance(KinesisReader.java:86) > at > org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.startReader(UnboundedReadEvaluatorFactory.java:190) > at > org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:128) > at > org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139) > at > org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107) > at > 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: com.amazonaws.AmazonServiceException: 1 validation error detected: > Value null at 'shardIterator' failed to satisfy constraint: Member must not > be null (Service: AmazonKinesis; Status Code: 400; Error Code: > ValidationException; Request ID: d764e747-9616-5db3-86ba-08a0bc44cb39) > at > com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1378) > at > com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:924) > at > com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:702) > at > com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:454) > at > com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:416) > at > com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:365) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:2016) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:1986) > at > com.amazonaws.services.kinesis.AmazonKinesisClient.getRecords(AmazonKinesisClient.java:985) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient$3.call(SimplifiedKinesisClient.java:118) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient$3.call(SimplifiedKinesisClient.java:115) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:140) > ... 14 more > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
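The handling the BEAM-2582 report asks for could look roughly like the sketch below. It is not the KinesisIO code: `ShardClient` and `readShard` are hypothetical stand-ins for the `GetShardIterator` and `GetRecords` calls named in the report, and the only assumption carried over is the report's statement that a closed shard yields a null iterator.

```java
import java.util.Collections;
import java.util.List;

class ClosedShardSketch {
    /** Hypothetical stand-in for the two Kinesis calls named in the report. */
    interface ShardClient {
        // Per the report, returns null when the shard is closed.
        String getShardIterator(String shardId);

        // Rejects a null iterator, as the service does in the stack trace above.
        List<String> getRecords(String shardIterator);
    }

    /** Reads one batch from a shard, abandoning the shard if it is closed. */
    static List<String> readShard(ShardClient client, String shardId) {
        String shardIterator = client.getShardIterator(shardId);
        if (shardIterator == null) {
            // Closed shard: skip GetRecords instead of passing null through.
            return Collections.emptyList();
        }
        return client.getRecords(shardIterator);
    }
}
```

The point of the guard is only that the null check happens before the `GetRecords` call; what the reader should do next with a closed shard (for example, move on to child shards after re-sharding) is outside this sketch.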
[jira] [Assigned] (BEAM-2703) KafkaIO: watermark outside the bounds of BoundedWindow
[ https://issues.apache.org/jira/browse/BEAM-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2703: -- Assignee: Raghu Angadi (was: Davor Bonaci) > KafkaIO: watermark outside the bounds of BoundedWindow > -- > > Key: BEAM-2703 > URL: https://issues.apache.org/jira/browse/BEAM-2703 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Reporter: Chris Pettitt >Assignee: Raghu Angadi > > KafkaIO appears to use an incorrect lower bound for its initial watermark > with respect to BoundedWindow.TIMESTAMP_MIN_VALUE. > KafkaIO's initial watermark: > new Instant(Long.MIN_VALUE) -> -9223372036854775808 > BoundedWindow.TIMESTAMP_MIN_VALUE: > new Instant(TimeUnit.MICROSECONDS.toMillis(Long.MIN_VALUE)) -> > -9223372036854775 > The difference is that the last three digits have been truncated due to the > micro to millis conversion. > This difference can cause errors in runners that assert that the input > watermark can never regress as KafkaIO gives a value below the lower bound > when no messages have been received yet. For consistency it would probably be > best for it to use BoundedWindow.TIMESTAMP_MIN_VALUE. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
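The arithmetic in the BEAM-2703 report can be reproduced with the JDK alone. The sketch below uses plain `long` millisecond values rather than Joda `Instant` objects, and the class and method names (`WatermarkBounds`, `clampToWindowMin`) are made up for illustration; only the two expressions quoted in the report are taken from the source.

```java
import java.util.concurrent.TimeUnit;

class WatermarkBounds {
    // Millisecond value behind KafkaIO's initial watermark, per the report.
    static long kafkaInitialWatermarkMillis() {
        return Long.MIN_VALUE;
    }

    // Same expression the report quotes for BoundedWindow.TIMESTAMP_MIN_VALUE.
    static long boundedWindowMinMillis() {
        return TimeUnit.MICROSECONDS.toMillis(Long.MIN_VALUE);
    }

    // Hypothetical helper: never report a watermark below the window minimum.
    static long clampToWindowMin(long watermarkMillis) {
        return Math.max(watermarkMillis, boundedWindowMinMillis());
    }

    public static void main(String[] args) {
        System.out.println(kafkaInitialWatermarkMillis()); // -9223372036854775808
        System.out.println(boundedWindowMinMillis());      // -9223372036854775
        System.out.println(kafkaInitialWatermarkMillis() < boundedWindowMinMillis()); // true
    }
}
```

The micro-to-millis conversion divides by 1000 and drops the remainder, which is why the second value is shorter by three digits and strictly greater than `Long.MIN_VALUE`.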
[jira] [Assigned] (BEAM-2704) KafkaIO: NPE without key serializer set
[ https://issues.apache.org/jira/browse/BEAM-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2704: -- Assignee: Raghu Angadi (was: Davor Bonaci) > KafkaIO: NPE without key serializer set > --- > > Key: BEAM-2704 > URL: https://issues.apache.org/jira/browse/BEAM-2704 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Reporter: Chris Pettitt >Assignee: Raghu Angadi > > The KafkaIO javadoc implies that you do not need to set a Serializer if you > only want to emit values: > {code} > * Often you might want to write just values without any keys to Kafka. > Use {@code values()} to > * write records with default empty(null) key: > * > * {@code > * PCollection strings = ...; > * strings.apply(KafkaIO.write() > * .withBootstrapServers("broker_1:9092,broker_2:9092") > * .withTopic("results") > * .withValueSerializer(new StringSerializer()) // just need serializer > for value > * .values() > *); > * } > {code} > However, if you don't set the key serializer then Kafka blows up when trying > to instantiate the key serializer (in Kafka 0.10.1, at least). It would be > more convenient if KafkaIO worked as documented and assigned a null > serializer if values() is used. > Relevant stack trace: > {code} > Caused by: java.lang.NullPointerException > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:230) > at > org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:163) > at > org.apache.beam.sdk.io.kafka.KafkaIO$KafkaWriter.setup(KafkaIO.java:1582) > at > org.apache.beam.sdk.io.kafka.KafkaIO$KafkaWriter$DoFnInvoker.invokeSetup(Unknown > Source) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2605) Exception raised when using MongoDBIO.Write in streaming mode
[ https://issues.apache.org/jira/browse/BEAM-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2605: -- Assignee: Jean-Baptiste Onofré (was: Davor Bonaci) > Exception raised when using MongoDBIO.Write in streaming mode > - > > Key: BEAM-2605 > URL: https://issues.apache.org/jira/browse/BEAM-2605 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Affects Versions: 2.0.0 >Reporter: Pascal Castéran >Assignee: Jean-Baptiste Onofré > > In org.apache.beam.sdk.io.mongodb.MongoDbIO.Write.WriteFn#flush(), no check > is done on the size of the batch list of documents before executing the > _*insertMany*_ operation. > In streaming mode, when processing an empty pane, an empty list of documents > can be passed to the MongoDB client which results in the following exception: > {quote}java.lang.IllegalArgumentException: state should be: writes is not an > empty list > at com.mongodb.assertions.Assertions.isTrueArgument(Assertions.java:99) > at > com.mongodb.operation.MixedBulkWriteOperation.(MixedBulkWriteOperation.java:95) > at > com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:323) > at > com.mongodb.MongoCollectionImpl.insertMany(MongoCollectionImpl.java:311) > at > org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.flush(MongoDbIO.java:513) > at > org.apache.beam.sdk.io.mongodb.MongoDbIO$Write$WriteFn.finishBundle(MongoDbIO.java:506){quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
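A guard of the kind the BEAM-2605 report implies is missing might look like the following. This is a sketch under stated assumptions, not the MongoDbIO source: `BatchingWriter` is a hypothetical class, and the `Consumer<List<T>>` sink stands in for the driver's `insertMany` call, which per the stack trace above rejects an empty list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchingWriter<T> {
    // Hypothetical stand-in for the call that performs the bulk insert.
    private final Consumer<List<T>> sink;
    private final int batchSize;
    private final List<T> batch = new ArrayList<>();

    BatchingWriter(Consumer<List<T>> sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Buffers one document and flushes when the batch is full. */
    void add(T document) {
        batch.add(document);
        if (batch.size() >= batchSize) {
            flush();
        }
    }

    /** Writes the buffered documents; does nothing for an empty batch. */
    void flush() {
        if (batch.isEmpty()) {
            // Empty pane or empty bundle: never hand an empty list to the sink.
            return;
        }
        sink.accept(new ArrayList<>(batch));
        batch.clear();
    }
}
```

With this shape, calling `flush()` at the end of a bundle that saw no elements is a no-op, which is the streaming case the report describes.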
[jira] [Assigned] (BEAM-2737) Test SSL/TLS and authentication in Elasticsearch integration tests
[ https://issues.apache.org/jira/browse/BEAM-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2737: -- Assignee: Etienne Chauchot (was: Davor Bonaci) > Test SSL/TLS and authentication in Elasticsearch integration tests > -- > > Key: BEAM-2737 > URL: https://issues.apache.org/jira/browse/BEAM-2737 > Project: Beam > Issue Type: Test > Components: sdk-java-extensions >Reporter: Etienne Chauchot >Assignee: Etienne Chauchot > > Testing authentication and SSL/TLS communication requires setting up Shield and > certificates. This is not doable in the embedded Elasticsearch used for > unit tests, so these tests should be run as integration tests. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2738) Setup shield and certificates in Elasticseach ITests backend server
[ https://issues.apache.org/jira/browse/BEAM-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2738: -- Assignee: Etienne Chauchot (was: Davor Bonaci) > Setup shield and certificates in Elasticseach ITests backend server > --- > > Key: BEAM-2738 > URL: https://issues.apache.org/jira/browse/BEAM-2738 > Project: Beam > Issue Type: Sub-task > Components: sdk-java-extensions >Reporter: Etienne Chauchot >Assignee: Etienne Chauchot > > Links to official documentation: > https://www.elastic.co/guide/en/shield/current/ssl-tls.html > https://www.elastic.co/guide/en/shield/current/native-realm.html#managing-native-users -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (BEAM-2721) Augment BeamRecordType to do slicing and concatenation.
[ https://issues.apache.org/jira/browse/BEAM-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci updated BEAM-2721: --- Component/s: (was: sdk-java-extensions) dsl-sql > Augment BeamRecordType to do slicing and concatenation. > --- > > Key: BEAM-2721 > URL: https://issues.apache.org/jira/browse/BEAM-2721 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Robert Bradshaw >Assignee: Xu Mingmin > > Currently in several places we cast to BeamSqlRecordType, extract the field > type ints, do the slicing, and then reconstruct a new BeamSqlRecordType. If > BeamRecordType had polymorphic methods to slice/concat this would be cleaner > and more flexible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2577) IO tests should exercise Runtime Values where supported
[ https://issues.apache.org/jira/browse/BEAM-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2577: -- Assignee: (was: Davor Bonaci) > IO tests should exercise Runtime Values where supported > --- > > Key: BEAM-2577 > URL: https://issues.apache.org/jira/browse/BEAM-2577 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions, testing >Reporter: Ben Chambers > > The only tests I have found for `ValueProvider` parameterized methods is > that they are not evaluated during pipeline construction time. This is > missing out on several important pieces: > 1. > https://stackoverflow.com/questions/44967898/notify-when-textio-is-done-writing-a-file > seems to be a problem with an AvroIO write using a RuntimeValueProvider > being non-serializable (current theory is because of an anonymous inner class > capturing the enclosing AvroIO.Write instance which has non-serializable > fields). > 2. Testing that the code paths that actually read the file do so correctly > when parameterized. > We should update the developer documentation to describe what the > requirements are for a parameterized IO and provide guidance on what tests > are needed and how to write them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2721) Augment BeamRecordType to do slicing and concatenation.
[ https://issues.apache.org/jira/browse/BEAM-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2721: -- Assignee: Xu Mingmin (was: Davor Bonaci) > Augment BeamRecordType to do slicing and concatenation. > --- > > Key: BEAM-2721 > URL: https://issues.apache.org/jira/browse/BEAM-2721 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Robert Bradshaw >Assignee: Xu Mingmin > > Currently in several places we cast to BeamSqlRecordType, extract the field > type ints, do the slicing, and then reconstruct a new BeamSqlRecordType. If > BeamRecordType had polymorphic methods to slice/concat this would be cleaner > and more flexible. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-449) Support PCollectionList in PAssert
[ https://issues.apache.org/jira/browse/BEAM-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-449: - Assignee: (was: Davor Bonaci) > Support PCollectionList in PAssert > -- > > Key: BEAM-449 > URL: https://issues.apache.org/jira/browse/BEAM-449 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Thomas Groh >Priority: Minor > > The assertion takes an input PCollectionList and takes a list of matchers of > the same size, and applies each matcher to the identical index of the > PCollectionList > e.g. PAssert.that(PCollectionList[0]).satisfies(matchers[0]) > Potentially also worthwhile is a "PAssert.thatFlattened(PCollectionList)" > static constructor, that runs an assertion on the flattened contents of the > list. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-448) Print Properties on validation failures in PipelineOptionsValidator
[ https://issues.apache.org/jira/browse/BEAM-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-448: - Assignee: (was: Davor Bonaci) > Print Properties on validation failures in PipelineOptionsValidator > --- > > Key: BEAM-448 > URL: https://issues.apache.org/jira/browse/BEAM-448 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Thomas Groh >Priority: Minor > > If Pipeline Validation fails in the pipeline options Validator, the methods > that failed validation are printed. Instead, the property names (passed on > the command line) should be printed for consistency with the > PipelineOptionsFactory. > Currently methods are printed at > https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsValidator.java#L72 > PipelineOptionsReflector currently extracts the property -> method mapping > https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsReflector.java#L95 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
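One way to report property names instead of method names, as BEAM-448 asks, is to derive the name from the getter. The helper below is a hypothetical illustration of that mapping only; it does not use PipelineOptionsReflector, which per the issue already extracts the property -> method mapping.

```java
/** Hypothetical helper: derives a property name from a getter-style method name. */
final class PropertyNames {
  private PropertyNames() {}

  /** Maps "getTempLocation" to "tempLocation" and "isStreaming" to "streaming". */
  static String fromGetter(String methodName) {
    String stem;
    if (methodName.startsWith("get") && methodName.length() > 3) {
      stem = methodName.substring(3);
    } else if (methodName.startsWith("is") && methodName.length() > 2) {
      stem = methodName.substring(2);
    } else {
      return methodName; // not a getter; report the name unchanged
    }
    return Character.toLowerCase(stem.charAt(0)) + stem.substring(1);
  }
}
```

A validation message could then name `--tempLocation` rather than `getTempLocation()`, matching what the user typed on the command line.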
[jira] [Assigned] (BEAM-1408) outputWithTimestamp() accepts timestamps that will fail preconditions
[ https://issues.apache.org/jira/browse/BEAM-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1408: -- Assignee: (was: Davor Bonaci) > outputWithTimestamp() accepts timestamps that will fail preconditions > - > > Key: BEAM-1408 > URL: https://issues.apache.org/jira/browse/BEAM-1408 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Andy Xu >Priority: Minor > > We have accidentally created events with *wrong* timestamps in the future > which are accepted by > outputWithTimestamp(), but will fail at a later step: > java.lang.IllegalStateException: Timer 472976-06-15T20:09:57.269Z is beyond > end-of-time > at Preconditions.checkState(Preconditions.java:199) > at > ReduceFnRunner.scheduleEndOfWindowOrGarbageCollectionTimer(ReduceFnRunner.java:1050) > [...] > Would it make sense to implement a check already at outputWithTimestamp() > level to fail early? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
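The fail-early check proposed in BEAM-1408 might be sketched as follows. `TimestampChecks` and `MAX_TIMESTAMP` are hypothetical; the bound is chosen only for illustration and is not Beam's actual end-of-time value, and the sketch uses `java.time.Instant` rather than the SDK's own timestamp type.

```java
import java.time.Instant;

/** Hypothetical fail-early validation for output timestamps. */
final class TimestampChecks {
  // Illustrative upper bound only; a real check would use the SDK's own maximum timestamp.
  static final Instant MAX_TIMESTAMP = Instant.parse("9999-12-31T23:59:59Z");

  private TimestampChecks() {}

  /** Returns the timestamp unchanged, or throws if it lies beyond the allowed maximum. */
  static Instant checkOutputTimestamp(Instant timestamp) {
    if (timestamp.isAfter(MAX_TIMESTAMP)) {
      throw new IllegalArgumentException(
          "Timestamp " + timestamp + " is beyond the maximum allowed timestamp " + MAX_TIMESTAMP);
    }
    return timestamp;
  }
}
```

Rejecting the value where it is produced points the error at the faulty element rather than at a timer scheduled several steps later.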
[jira] [Assigned] (BEAM-1639) Catch bad FractionConsumed values in the beam SDK.
[ https://issues.apache.org/jira/browse/BEAM-1639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1639: -- Assignee: (was: Davor Bonaci) > Catch bad FractionConsumed values in the beam SDK. > -- > > Key: BEAM-1639 > URL: https://issues.apache.org/jira/browse/BEAM-1639 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Stephen Sisk >Priority: Minor > > getFractionConsumed in Sources are expected to return values between 0.0 and > 1.0 > I recently encountered a bug where a bad value of fractionConsumed was sent > to a runner. Looking through the beam source, I couldn't find anywhere that > we validate the value before we send it (it could be there and I didn't see > it, but given that I saw this error with a real source, I don't believe it > is.) > I think it'd be useful if the beam SDK caught this error at the time the > value is returned and at least generate a useful error in the logs (thus > allowing the user to more easily debug the issue) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
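The validation requested in BEAM-1639 could be as small as the sketch below. `FractionChecks` is a hypothetical helper, not an existing SDK class; it throws, although the issue would also be satisfied by logging a useful error instead.

```java
/** Hypothetical range check for values returned by getFractionConsumed. */
final class FractionChecks {
  private FractionChecks() {}

  /** Returns the fraction unchanged if it lies in [0.0, 1.0]; otherwise throws. */
  static double checkFractionConsumed(double fraction) {
    // NaN fails every comparison, so it has to be tested for explicitly.
    if (Double.isNaN(fraction) || fraction < 0.0 || fraction > 1.0) {
      throw new IllegalStateException(
          "getFractionConsumed returned " + fraction + "; expected a value in [0.0, 1.0]");
    }
    return fraction;
  }
}
```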
[jira] [Assigned] (BEAM-728) Javadoc should clearly separate facts from runner requirements
[ https://issues.apache.org/jira/browse/BEAM-728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-728: - Assignee: (was: Davor Bonaci) > Javadoc should clearly separate facts from runner requirements > -- > > Key: BEAM-728 > URL: https://issues.apache.org/jira/browse/BEAM-728 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Frances Perry > > The javadoc for View.asMap() says the map needs to fit in memory. That's not > true in all runners. (For example, Dataflow has distributed map support.) > https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/View.java > This is likely just one specific case of a more general issue -- different > runners will have common constraints on the scalability of portions of the > model. Currently these are documented in the capability matrix on the > website, but for usability we should consider surfacing these constraints on > particularly relevant methods. But keeping things in sync in multiple > locations is hard... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1662) Re. BEAM-974 - PubSub.read/write() .withCoder requirement should raise a more informative error in Dataflow
[ https://issues.apache.org/jira/browse/BEAM-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1662: -- Assignee: (was: Davor Bonaci) > Re. BEAM-974 - PubSub.read/write() .withCoder requirement should raise a more > informative error in Dataflow > --- > > Key: BEAM-1662 > URL: https://issues.apache.org/jira/browse/BEAM-1662 > Project: Beam > Issue Type: Wish > Components: sdk-java-core > Environment: google cloud / dataflow >Reporter: G Money >Priority: Minor > > Hello, > I'm a Google Cloud Support Specialist who recently owned a case from a > platinum customer who was reporting "validation of workflow failed" internal > error when attempting to use new beta SDK. Eventually, it was determined > that the problem was that they weren't using .withCoder, which became > required after BEAM-974[1]. They would like to request that a more > informative error be thrown, as the current one is far too vague to be able > to derive any useful information from. Thank you. > Regards, > Garrett Anderson > Cloud Support Specialist > Google Cloud Platform Support > [1] https://issues.apache.org/jira/browse/BEAM-974 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-1461) duplication with StartBundle and prepareForProcessing in DoFn
[ https://issues.apache.org/jira/browse/BEAM-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140695#comment-16140695 ] Davor Bonaci commented on BEAM-1461: Seems complete; I'd resolve. Please reactivate if there's more work to do. > duplication with StartBundle and prepareForProcessing in DoFn > - > > Key: BEAM-1461 > URL: https://issues.apache.org/jira/browse/BEAM-1461 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Xu Mingmin >Assignee: Davor Bonaci > Fix For: Not applicable > > > There is one annotation, `StartBundle`, and one public function, > `prepareForProcessing`, in DoFn, and both are called before > `ProcessElement`. It is unclear which one should be implemented in a subclass. > The call sequence seems to be: > prepareForProcessing -> StartBundle -> processElement -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (BEAM-1461) duplication with StartBundle and prepareForProcessing in DoFn
[ https://issues.apache.org/jira/browse/BEAM-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci resolved BEAM-1461. Resolution: Fixed Fix Version/s: Not applicable > duplication with StartBundle and prepareForProcessing in DoFn > - > > Key: BEAM-1461 > URL: https://issues.apache.org/jira/browse/BEAM-1461 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Xu Mingmin >Assignee: Davor Bonaci > Fix For: Not applicable > > > There is one annotation, `StartBundle`, and one public function, > `prepareForProcessing`, in DoFn, and both are called before > `ProcessElement`. It is unclear which one should be implemented in a subclass. > The call sequence seems to be: > prepareForProcessing -> StartBundle -> processElement -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (BEAM-1691) Dynamic properties supported in PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci resolved BEAM-1691. Resolution: Not A Problem Fix Version/s: Not applicable > Dynamic properties supported in PipelineOptions > --- > > Key: BEAM-1691 > URL: https://issues.apache.org/jira/browse/BEAM-1691 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Xu Mingmin >Assignee: Davor Bonaci > Fix For: Not applicable > > > Usually the two lines to create a new Beam pipeline are: > {code} > Options options = > PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class); > Pipeline pipeline = Pipeline.create(options); > {code} > As each runner has its own PipelineOptions, one piece of code can hardly > run on different runners without a code change; at least Options needs to be > updated. > Dynamic properties could be an option, similar to > {code} > -D property1=value1 -D property2=value2 ... > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-1691) Dynamic properties supported in PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140693#comment-16140693 ] Davor Bonaci commented on BEAM-1691: I'd resolve for now; if we want to change the behavior here, perhaps there should be a dev@ discussion first. > Dynamic properties supported in PipelineOptions > --- > > Key: BEAM-1691 > URL: https://issues.apache.org/jira/browse/BEAM-1691 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Xu Mingmin >Assignee: Davor Bonaci > Fix For: Not applicable > > > Usually the two lines to create a new Beam pipeline are: > {code} > Options options = > PipelineOptionsFactory.fromArgs(args).withValidation().as(Options.class); > Pipeline pipeline = Pipeline.create(options); > {code} > As each runner has its own PipelineOptions, one piece of code can hardly > run on different runners without a code change; at least Options needs to be > updated. > Dynamic properties could be an option, similar to > {code} > -D property1=value1 -D property2=value2 ... > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1898) Need a SerializableThrowable for PAssert to capture the point of an assertion
[ https://issues.apache.org/jira/browse/BEAM-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1898: -- Assignee: (was: Davor Bonaci) > Need a SerializableThrowable for PAssert to capture the point of an assertion > - > > Key: BEAM-1898 > URL: https://issues.apache.org/jira/browse/BEAM-1898 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Pablo Estrada > > For the regular Error class in Java, its stack trace is a transient > attribute, so it's not serialized by any framework. We need a class that > allows us to serialize these AssertionErrors to have them in PCollections. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1448) Coder encode/decode context documentation is lacking
[ https://issues.apache.org/jira/browse/BEAM-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1448: -- Assignee: (was: Davor Bonaci) > Coder encode/decode context documentation is lacking > > > Key: BEAM-1448 > URL: https://issues.apache.org/jira/browse/BEAM-1448 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Aviem Zur > Labels: documentation > > Coder encode/decode context documentation is lacking. > * Documentation of {{Coder}} methods {{encode}} and {{decode}} should include > description of {{context}} argument and explain how to relate to it when > implementing. > * Consider renaming the static {{Context}} values {{NESTED}} and {{OUTER}} to > more accurate names. > * Emphasize the use of CoderProperties as the best way to test a coder. > [Original dev list > discussion|https://lists.apache.org/thread.html/fbd2d6b869ac2b0225ec39461b14158a03f304a930782d39ac9a60a6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2303) Add SpecificData to AvroCoder
[ https://issues.apache.org/jira/browse/BEAM-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2303: -- Assignee: (was: Davor Bonaci) > Add SpecificData to AvroCoder > - > > Key: BEAM-2303 > URL: https://issues.apache.org/jira/browse/BEAM-2303 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Affects Versions: 2.1.0 >Reporter: Arvid Heise > > The AvroCoder currently supports GenericData and ReflectData, but not > SpecificData. > It should be relatively easy to incorporate it by expanding the logic while > constructing the Reader and Writer by also checking if the type implements > the SpecificRecord interface. It would greatly speed up (de-)serialization of > Avro-generated java classes. > {code} > return myCoder.getType().equals(GenericRecord.class) > ? new GenericDatumReader(myCoder.getSchema()) > : new ReflectDatumReader( > myCoder.getSchema(), myCoder.getSchema(), > myCoder.reflectData.get()); > {code} > should be > {code} > if (myCoder.getType().equals(GenericRecord.class)) { > return new > GenericDatumReader(myCoder.getSchema()); > } > if > (SpecificRecord.class.isAssignableFrom(myCoder.getType())) { > return new > SpecificDatumReader(myCoder.getType()); > } > return new ReflectDatumReader( > myCoder.getSchema(), myCoder.getSchema(), > myCoder.reflectData.get()); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2069) Remove ResourceId.getCurrentDirectory()?
[ https://issues.apache.org/jira/browse/BEAM-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2069: -- Assignee: (was: Davor Bonaci) > Remove ResourceId.getCurrentDirectory()? > > > Key: BEAM-2069 > URL: https://issues.apache.org/jira/browse/BEAM-2069 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.0.0 >Reporter: Stephen Sisk > Labels: backward-incompatible > > Beam ResourceId currently has a getCurrentDirectory method that returns the > current resource id if it's a directory, or the parent directory if it's a > file. > To implement this you need to know whether or not a particular path is a > directory. > I'm trying to implement the Hadoop ResourceId implementation, and it's not > clear if it's possible. Hadoop's Paths do not end in a / if they are a directory > (they are stripped), nor do hadoop paths tell you if something is a > directory, so it's not possible to determine if a given path is a file that > does not have a suffix, or a directory. > It's not clear to me that all file systems can determine whether a path is a > directory and thus I don't believe it can be implemented reliably. > The only usages of getCurrentDirectory that I could find are in tests so it's > not clear we actually need this. > I propose that we remove this method. > cc [~davor] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2302) WriteFiles with runner-determined sharding and large numbers of windows causes OOM errors
[ https://issues.apache.org/jira/browse/BEAM-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2302: -- Assignee: Reuven Lax (was: Davor Bonaci) > WriteFiles with runner-determined sharding and large numbers of windows > causes OOM errors > - > > Key: BEAM-2302 > URL: https://issues.apache.org/jira/browse/BEAM-2302 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax > > This is because the WriteWindowedBundles transform will create many file > writers, and the sheer number of file buffers (which defaults to 64mb per > writer) uses up all memory. The fix is the same as was done in BigQueryIO - > if too many writers are opened, spill into a shuffle, and write the files > after the shuffle -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2302) WriteFiles with runner-determined sharding and large numbers of windows causes OOM errors
[ https://issues.apache.org/jira/browse/BEAM-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140689#comment-16140689 ] Davor Bonaci commented on BEAM-2302: Fixed? > WriteFiles with runner-determined sharding and large numbers of windows > causes OOM errors > - > > Key: BEAM-2302 > URL: https://issues.apache.org/jira/browse/BEAM-2302 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Davor Bonaci > > This is because the WriteWindowedBundles transform will create many file > writers, and the sheer number of file buffers (which defaults to 64mb per > writer) uses up all memory. The fix is the same as was done in BigQueryIO - > if too many writers are opened, spill into a shuffle, and write the files > after the shuffle -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2615) Add ViewTests with SlidingWindows
[ https://issues.apache.org/jira/browse/BEAM-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2615: -- Assignee: Thomas Groh (was: Davor Bonaci) > Add ViewTests with SlidingWindows > - > > Key: BEAM-2615 > URL: https://issues.apache.org/jira/browse/BEAM-2615 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Thomas Groh >Assignee: Thomas Groh > > For both reading and writing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2601) FileBasedSink produces incorrect shards when writing to multiple destinations
[ https://issues.apache.org/jira/browse/BEAM-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140688#comment-16140688 ] Davor Bonaci commented on BEAM-2601: Fixed? > FileBasedSink produces incorrect shards when writing to multiple destinations > - > > Key: BEAM-2601 > URL: https://issues.apache.org/jira/browse/BEAM-2601 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax > Fix For: 2.2.0 > > > FileBasedSink now supports multiple dynamic destinations, however it > finalizes all files in a bundle without paying attention to destination. This > means that the shard counts will be incorrect across these destinations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2601) FileBasedSink produces incorrect shards when writing to multiple destinations
[ https://issues.apache.org/jira/browse/BEAM-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2601: -- Assignee: Reuven Lax (was: Davor Bonaci) > FileBasedSink produces incorrect shards when writing to multiple destinations > - > > Key: BEAM-2601 > URL: https://issues.apache.org/jira/browse/BEAM-2601 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax > Fix For: 2.2.0 > > > FileBasedSink now supports multiple dynamic destinations, however it > finalizes all files in a bundle without paying attention to destination. This > means that the shard counts will be incorrect across these destinations. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2624) File-based sinks should produce a PCollection of written filenames
[ https://issues.apache.org/jira/browse/BEAM-2624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2624: -- Assignee: Reuven Lax (was: Davor Bonaci) > File-based sinks should produce a PCollection of written filenames > -- > > Key: BEAM-2624 > URL: https://issues.apache.org/jira/browse/BEAM-2624 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Reuven Lax >Assignee: Reuven Lax > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2701) use a custom implementation of java.io.ObjectInputStream
[ https://issues.apache.org/jira/browse/BEAM-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2701: -- Assignee: Luke Cwik (was: Davor Bonaci) > use a custom implementation of java.io.ObjectInputStream > > > Key: BEAM-2701 > URL: https://issues.apache.org/jira/browse/BEAM-2701 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Luke Cwik > > java.io.ObjectInputStream should override resolve[Proxy]Class using the TCCL > to support any classloader and not fallback into some JVM pitfall using > another classloader (default). This will enable beam to use any classloader > instead of requiring to run in the JVM using java serialization. > {code} > @Override > protected Class resolveClass(final ObjectStreamClass classDesc) throws > IOException, ClassNotFoundException { > final String n = classDesc.getName(); > final ClassLoader classloader = getClassloader(); > try { > return Class.forName(n, false, classloader); > } catch (ClassNotFoundException e) { > if (n.equals("boolean")) { > return boolean.class; > } > if (n.equals("byte")) { > return byte.class; > } > if (n.equals("char")) { > return char.class; > } > if (n.equals("short")) { > return short.class; > } > if (n.equals("int")) { > return int.class; > } > if (n.equals("long")) { > return long.class; > } > if (n.equals("float")) { > return float.class; > } > if (n.equals("double")) { > return double.class; > } > //Last try - Let runtime try and find it. 
> return Class.forName(n, false, null); > } > } > @Override > protected Class resolveProxyClass(final String[] interfaces) throws > IOException, ClassNotFoundException { > final Class[] cinterfaces = new Class[interfaces.length]; > for (int i = 0; i < interfaces.length; i++) { > cinterfaces[i] = getClassloader().loadClass(interfaces[i]); > } > try { > return Proxy.getProxyClass(getClassloader(), cinterfaces); > } catch (IllegalArgumentException e) { > throw new ClassNotFoundException(null, e); > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
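For reference, the approach proposed in BEAM-2701 can be condensed into a self-contained sketch. The class name `ContextClassLoaderObjectInputStream` and the `roundTrip` helper are hypothetical, and unlike the snippet in the issue this version delegates to `super.resolveClass` for primitives and other fallbacks rather than listing them by hand. Proxy classes are left to the default implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

/** Hypothetical ObjectInputStream that resolves classes via the thread context classloader. */
final class ContextClassLoaderObjectInputStream extends ObjectInputStream {
  ContextClassLoaderObjectInputStream(InputStream in) throws IOException {
    super(in);
  }

  @Override
  protected Class<?> resolveClass(ObjectStreamClass desc)
      throws IOException, ClassNotFoundException {
    ClassLoader tccl = Thread.currentThread().getContextClassLoader();
    if (tccl != null) {
      try {
        return Class.forName(desc.getName(), false, tccl);
      } catch (ClassNotFoundException e) {
        // Fall through to the default lookup below.
      }
    }
    // The default implementation also handles primitive type names such as "int".
    return super.resolveClass(desc);
  }

  /** Serializes a value and reads it back through this stream; for demonstration only. */
  static Object roundTrip(Object value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ContextClassLoaderObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("round trip failed", e);
    }
  }
}
```

Falling back to the default lookup keeps behavior unchanged for classes the context classloader cannot see.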
[jira] [Assigned] (BEAM-2705) DoFnTester supports for StateParameter
[ https://issues.apache.org/jira/browse/BEAM-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2705: -- Assignee: Kenneth Knowles (was: Davor Bonaci) > DoFnTester supports for StateParameter > -- > > Key: BEAM-2705 > URL: https://issues.apache.org/jira/browse/BEAM-2705 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Affects Versions: 2.0.0 >Reporter: Yihua Eric Fang >Assignee: Kenneth Knowles > > Today DoFnTester does not support StateParameters such as ValueState. I > didn't see an issue being created on JIRA, so filing this one. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2751) Write PCollection elements to individual files
[ https://issues.apache.org/jira/browse/BEAM-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2751: -- Assignee: Eugene Kirpichov (was: Davor Bonaci) > Write PCollection elements to individual files > -- > > Key: BEAM-2751 > URL: https://issues.apache.org/jira/browse/BEAM-2751 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Christopher Hebert >Assignee: Eugene Kirpichov > > I'd like to write elements as individual files. > Rather than smashing thousands of outputs into a handful of files as TextIO > does (output-0-of-5, output-1-of-5,...), I want to write each > element into unique files. > So if I used WholeFileIO from [BEAM-2750] to read in three files (hi.txt, > what.txt, and yes.txt) then I'd like to write the processed files out to > individual files with user or data-defined filenames (like hi-modified.txt, > what-modified.txt, and yes-modified.txt). > With a WholeFileIO, this would look like: > {code:java} > PCollection> fileNamesAndBytes = p.apply("Read", > WholeFileIO.read().from("/path/to/input/dir/*")); > ... > // Do stuff that change contents and file names > PCollection> modifedFileNamesAndBytes = ... > ... > modifedFileNamesAndBytes.apply("Write", > WholeFileIO.write().to("/path/to/output/dir/")); > {code} > This ticket complements [BEAM-2750]. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2644) Make it easier to test runtime-accessible ValueProvider's
[ https://issues.apache.org/jira/browse/BEAM-2644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2644: -- Assignee: Eugene Kirpichov (was: Davor Bonaci) > Make it easier to test runtime-accessible ValueProvider's > - > > Key: BEAM-2644 > URL: https://issues.apache.org/jira/browse/BEAM-2644 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Eugene Kirpichov >Assignee: Eugene Kirpichov > > Many transforms that take ValueProvider's have different codepaths for when > the provider is accessible or not. However, as far as I can tell, there is no > good way to construct a transform with an inaccessible ValueProvider, and > then test how it runs with an actual value supplied. > The only way I could come up with is mimicking > https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java#L202 > , which is very ugly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2646) Problem to join in slack channel
[ https://issues.apache.org/jira/browse/BEAM-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140680#comment-16140680 ] Davor Bonaci commented on BEAM-2646: Same thing -- in addition to emailing dev-subscr...@beam.apache.org, please email user-subscr...@beam.apache.org. > Problem to join in slack channel > > > Key: BEAM-2646 > URL: https://issues.apache.org/jira/browse/BEAM-2646 > Project: Beam > Issue Type: Wish > Components: project-management >Reporter: Kwang-in (Dennis) JUNG >Assignee: Davor Bonaci >Priority: Trivial > Fix For: Not applicable > > > Hello. > While following up the guide in main page, I faced on few problem so tried to > join in slack to ask. But it keeps sending failure mail after I request slack > invitation through mail. > -- > Hi. This is the qmail-send program at apache.org. > I'm afraid I wasn't able to deliver your message to the following addresses. > This is a permanent error; I've given up. Sorry it didn't work out. > : > Must be sent from an @apache.org address or a subscriber address or an > address in LDAP. > --- Below this line is a copy of the message. > Return-Path: > Received: (qmail 28881 invoked by uid 99); 20 Jul 2017 09:45:11 - > Received: from pnap-us-west-generic-nat.apache.org (HELO > spamd1-us-west.apache.org) (209.188.14.142) > by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 20 Jul 2017 09:45:11 + > Received: from localhost (localhost [127.0.0.1]) > by spamd1-us-west.apache.org (ASF Mail Server at > spamd1-us-west.apache.org) with ESMTP id DA3E0C33A9 > for ; Thu, 20 Jul 2017 09:45:10 + (UTC) > X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org > X-Spam-Flag: NO > X-Spam-Score: 2.629 > X-Spam-Level: ** > ... > -- > Thanks! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (BEAM-2646) Problem to join in slack channel
[ https://issues.apache.org/jira/browse/BEAM-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci resolved BEAM-2646. Resolution: Information Provided Fix Version/s: Not applicable > Problem to join in slack channel > > > Key: BEAM-2646 > URL: https://issues.apache.org/jira/browse/BEAM-2646 > Project: Beam > Issue Type: Wish > Components: project-management >Reporter: Kwang-in (Dennis) JUNG >Assignee: Davor Bonaci >Priority: Trivial > Fix For: Not applicable > > > Hello. > While following the guide on the main page, I ran into a few problems, so I tried to > join Slack to ask. But I keep receiving a failure mail after I request a Slack > invitation by mail. > -- > Hi. This is the qmail-send program at apache.org. > I'm afraid I wasn't able to deliver your message to the following addresses. > This is a permanent error; I've given up. Sorry it didn't work out. > : > Must be sent from an @apache.org address or a subscriber address or an > address in LDAP. > --- Below this line is a copy of the message. > Return-Path: > Received: (qmail 28881 invoked by uid 99); 20 Jul 2017 09:45:11 - > Received: from pnap-us-west-generic-nat.apache.org (HELO > spamd1-us-west.apache.org) (209.188.14.142) > by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 20 Jul 2017 09:45:11 + > Received: from localhost (localhost [127.0.0.1]) > by spamd1-us-west.apache.org (ASF Mail Server at > spamd1-us-west.apache.org) with ESMTP id DA3E0C33A9 > for ; Thu, 20 Jul 2017 09:45:10 + (UTC) > X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org > X-Spam-Flag: NO > X-Spam-Score: 2.629 > X-Spam-Level: ** > ... > -- > Thanks! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1449) Maven should be able to build and run tests across multiple languages
[ https://issues.apache.org/jira/browse/BEAM-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-1449: -- Assignee: Ahmet Altay (was: Davor Bonaci) > Maven should be able to build and run tests across multiple languages > - > > Key: BEAM-1449 > URL: https://issues.apache.org/jira/browse/BEAM-1449 > Project: Beam > Issue Type: New Feature > Components: build-system >Affects Versions: Not applicable >Reporter: Sourabh Bajaj >Assignee: Ahmet Altay > > We are working on splitting python sdk into multiple packages and it'll be a > good time to invest in getting a central build tool to work across languages > so that we can share common test specs, common proto etc. > PS: Please revert the spec duplication > https://github.com/apache/beam/pull/1964 once it can be shared. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (BEAM-1809) Fix one error in TravisCI build
[ https://issues.apache.org/jira/browse/BEAM-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci resolved BEAM-1809. Resolution: Won't Fix Fix Version/s: Not applicable Travis-CI has been fully replaced with Jenkins. > Fix one error in TravisCI build > --- > > Key: BEAM-1809 > URL: https://issues.apache.org/jira/browse/BEAM-1809 > Project: Beam > Issue Type: Bug > Components: build-system >Affects Versions: 0.6.0 >Reporter: Wesley Tanaka >Assignee: Davor Bonaci > Fix For: Not applicable > > > TravisCI builds are failing on: > 2017-03-25T17:49:22.852 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-checkstyle-plugin:2.17:check (default) on > project beam-sdks-parent: Failed during checkstyle execution: Unable to find > suppressions file at location: beam/suppressions.xml: Could not find resource > 'beam/suppressions.xml'. -> [Help 1] > e.g. https://travis-ci.org/apache/beam/jobs/215023943 > See https://github.com/apache/beam/pull/2326 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2793) Include all versions in top-level pom
[ https://issues.apache.org/jira/browse/BEAM-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2793: -- Assignee: Jason Kuster (was: Davor Bonaci) > Include all versions in top-level pom > - > > Key: BEAM-2793 > URL: https://issues.apache.org/jira/browse/BEAM-2793 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Jason Kuster >Assignee: Jason Kuster > > Top-level pom is missing at least > kafka version > cloudlogging version -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2581) KinesisClientProvider interface needs to be public
[ https://issues.apache.org/jira/browse/BEAM-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2581: -- Assignee: Chamikara Jayalath (was: Davor Bonaci) > KinesisClientProvider interface needs to be public > -- > > Key: BEAM-2581 > URL: https://issues.apache.org/jira/browse/BEAM-2581 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Affects Versions: 2.0.0 >Reporter: Nawaid Shamim >Assignee: Chamikara Jayalath >Priority: Minor > > Using Beam to read from a Kinesis stream. _KinesisIO_ provides two overloaded > methods - _withClientProvider_ to provide AWS credentials or implement an > interface - _KinesisClientProvider_ to pass _AWSKinesisClient_ as described > [here|https://beam.apache.org/documentation/sdks/javadoc/2.0.0/] > There's also the possibility to start reading from an arbitrary point in time - in > this case you need to provide an Instant object: > {code} > p.apply(KinesisIO.read() > .from("streamName", instant) > .withClientProvider(new KinesisClientProvider() { > @Override > public AmazonKinesis get() { > return null; > } > })) > .apply( ... ) // other transformations > {code} > The above code requires the org.apache.beam.sdk.io.kinesis.KinesisClientProvider > interface to be public. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (BEAM-2581) KinesisClientProvider interface needs to be public
[ https://issues.apache.org/jira/browse/BEAM-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci updated BEAM-2581: --- Component/s: (was: sdk-java-core) (was: beam-model) sdk-java-extensions > KinesisClientProvider interface needs to be public > -- > > Key: BEAM-2581 > URL: https://issues.apache.org/jira/browse/BEAM-2581 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Affects Versions: 2.0.0 >Reporter: Nawaid Shamim >Assignee: Chamikara Jayalath >Priority: Minor > > Using Beam to read from a Kinesis stream. _KinesisIO_ provides two overloaded > methods - _withClientProvider_ to provide AWS credentials or implement an > interface - _KinesisClientProvider_ to pass _AWSKinesisClient_ as described > [here|https://beam.apache.org/documentation/sdks/javadoc/2.0.0/] > There's also the possibility to start reading from an arbitrary point in time - in > this case you need to provide an Instant object: > {code} > p.apply(KinesisIO.read() > .from("streamName", instant) > .withClientProvider(new KinesisClientProvider() { > @Override > public AmazonKinesis get() { > return null; > } > })) > .apply( ... ) // other transformations > {code} > The above code requires the org.apache.beam.sdk.io.kinesis.KinesisClientProvider > interface to be public. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2750) Read whole files as one PCollection element each
[ https://issues.apache.org/jira/browse/BEAM-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2750: -- Assignee: Christopher Hebert (was: Davor Bonaci) > Read whole files as one PCollection element each > > > Key: BEAM-2750 > URL: https://issues.apache.org/jira/browse/BEAM-2750 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Christopher Hebert >Assignee: Christopher Hebert > > I'd like to read whole files as one element each. > If my input files are hi.txt, what.txt, and yes.txt, then the whole contents > of hi.txt are an element of the returned PCollection, the whole contents of > what.txt are the next element, etc., giving me a PCollection with three > elements. > This contrasts with TextIO which reads a new element for every line of text > in the input files. > This read (I'll call it WholeFileIO for now) would work like so: > {code:java} > PCollection> fileNamesAndBytes = p.apply("Read", > WholeFileIO.read().from("/path/to/input/dir/*")); > {code} > The above example passes the raw file contents and the filename. > Alternatively, we could pass a PCollection of some sort of FileWrapper around > an InputStream to support lazy loading. > This ticket complements [BEAM-2751]. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
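To make the requested semantics in BEAM-2750 concrete, here is a minimal sketch in plain Java (no Beam dependency) of "one element per file": each file in a directory becomes a single (file name, bytes) entry instead of one entry per line. The class and method names (WholeFileReadSketch, readWholeFiles, demoElementCount) are hypothetical and only illustrate the idea; this is not the proposed WholeFileIO implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: models "whole file = one element" with a Map.
final class WholeFileReadSketch {

    // Reads every regular file in dir and returns file name -> full contents.
    static Map<String, byte[]> readWholeFiles(Path dir) {
        Map<String, byte[]> result = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    result.put(file.getFileName().toString(), Files.readAllBytes(file));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    // Writes three two-line files to a temp directory and counts the elements read back.
    static int demoElementCount() {
        try {
            Path dir = Files.createTempDirectory("whole-file-sketch");
            for (String name : new String[] {"hi.txt", "what.txt", "yes.txt"}) {
                Files.write(dir.resolve(name),
                        "line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));
            }
            return readWholeFiles(dir).size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Three files give three elements, regardless of how many lines each file has.
        System.out.println(demoElementCount());
    }
}
```

A line-oriented read of the same directory would yield six elements (two lines per file), which is the contrast with TextIO that the ticket describes.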
[jira] [Assigned] (BEAM-2757) Link to Jenkins view is dead
[ https://issues.apache.org/jira/browse/BEAM-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2757: -- Assignee: Jason Kuster (was: Davor Bonaci) > Link to Jenkins view is dead > > > Key: BEAM-2757 > URL: https://issues.apache.org/jira/browse/BEAM-2757 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Kenneth Knowles >Assignee: Jason Kuster > > Discovered since the dead link check is failing on all website PRs -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2709) Add TezRunner
[ https://issues.apache.org/jira/browse/BEAM-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113370#comment-16113370 ] Davor Bonaci commented on BEAM-2709: Welcome [~bdscheller], and thanks for your contribution. This is a great contribution to the project. CC: [~kenn], who will likely take it from here. > Add TezRunner > - > > Key: BEAM-2709 > URL: https://issues.apache.org/jira/browse/BEAM-2709 > Project: Beam > Issue Type: New Feature > Components: runner-ideas >Reporter: Brandon Scheller >Assignee: Brandon Scheller > > Add a TezRunner to Beam -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2709) Add TezRunner
[ https://issues.apache.org/jira/browse/BEAM-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2709: -- Assignee: Brandon Scheller (was: Davor Bonaci) > Add TezRunner > - > > Key: BEAM-2709 > URL: https://issues.apache.org/jira/browse/BEAM-2709 > Project: Beam > Issue Type: New Feature > Components: runner-ideas >Reporter: Brandon Scheller >Assignee: Brandon Scheller > > Add a TezRunner to Beam -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2341) Make our Python pom files virtualenv-compatible.
[ https://issues.apache.org/jira/browse/BEAM-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2341: -- Assignee: Ahmet Altay (was: Davor Bonaci) > Make our Python pom files virtualenv-compatible. > > > Key: BEAM-2341 > URL: https://issues.apache.org/jira/browse/BEAM-2341 > Project: Beam > Issue Type: Bug > Components: build-system >Reporter: Robert Bradshaw >Assignee: Ahmet Altay > > The importance of this may depend on whether we recommend or discourage using > mvn at all for Python, see also > https://issues.apache.org/jira/browse/BEAM-2340 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2547) FireBase IO
[ https://issues.apache.org/jira/browse/BEAM-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2547: -- Assignee: Chamikara Jayalath (was: Davor Bonaci) > FireBase IO > --- > > Key: BEAM-2547 > URL: https://issues.apache.org/jira/browse/BEAM-2547 > Project: Beam > Issue Type: New Feature > Components: sdk-java-extensions, sdk-py >Reporter: Patrick Reames >Assignee: Chamikara Jayalath > > Implement IO Source for Java and Python SDKs > Work on this had previously been done for the Google Cloud DataFlow Java SDK > but was later removed. [old > code|https://github.com/GoogleCloudPlatform/DataflowJavaSDK/tree/e0e56e0911e18ad08c5f9ed245c76849503ab7c7/contrib/firebaseio/src/main/java/com/google/cloud/dataflow/contrib/firebase], > related [pull > request|https://github.com/GoogleCloudPlatform/DataflowJavaSDK/pull/69] > CC: [~altay] , [~chamikara] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2608) Unclosed BoundedReader in TextIO#ReadTextFn#process()
[ https://issues.apache.org/jira/browse/BEAM-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2608: -- Assignee: Eugene Kirpichov (was: Davor Bonaci) > Unclosed BoundedReader in TextIO#ReadTextFn#process() > - > > Key: BEAM-2608 > URL: https://issues.apache.org/jira/browse/BEAM-2608 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Ted Yu >Assignee: Eugene Kirpichov >Priority: Minor > > {code} > BoundedSource.BoundedReader reader = > source > .createForSubrangeOfFile(metadata, range.getFrom(), > range.getTo()) > .createReader(c.getPipelineOptions()); > {code} > The reader should be closed upon return. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
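One common way to guarantee that a reader like the one in BEAM-2608 is closed on every return path is try-with-resources. The sketch below is plain Java and assumes only that the reader type is closeable; TrackingReader is a hypothetical stand-in, not Beam's BoundedReader, and the actual fix in TextIO may look different.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of closing a reader with try-with-resources.
final class ReaderCloseSketch {

    // Stand-in reader that yields three records and remembers whether close() ran.
    static final class TrackingReader implements Closeable {
        final AtomicBoolean closed = new AtomicBoolean(false);
        private int remaining = 3;

        boolean advance() {
            return remaining-- > 0;
        }

        @Override
        public void close() {
            closed.set(true);
        }
    }

    // Consumes the reader; close() is invoked even if the loop throws.
    static int readAll(TrackingReader reader) {
        int count = 0;
        try (TrackingReader r = reader) {
            while (r.advance()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        TrackingReader reader = new TrackingReader();
        System.out.println(readAll(reader) + " records, closed=" + reader.closed.get());
    }
}
```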
[jira] [Assigned] (BEAM-2554) failIfNoTests: causes issues when trying to run integration tests
[ https://issues.apache.org/jira/browse/BEAM-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2554: -- Assignee: (was: Davor Bonaci) > failIfNoTests: causes issues when trying to run integration tests > - > > Key: BEAM-2554 > URL: https://issues.apache.org/jira/browse/BEAM-2554 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Stephen Sisk > > Summary: > We have a couple of different Maven projects in Beam that override the > failIfNoTests property in ways that easily affect folks that are trying to > run tests and end up having to build those projects but don't want to run > tests in those projects. > > I think we should remove these overrides since I think they do more harm than > good. The fact that they don't allow overriding by the user is particularly > tricky to get around. > Details: > Projects overriding the failIfNoTests property in intrusive ways are: > runners/direct/java/pom.xml (not profile protected at all) > runners/google-cloud-dataflow-java/pom.xml (when run with dataflow-runner > profile) > runners/apex/pom.xml (not profile protected) > This shows up in things like perfkitbenchmarker, where if you try to run the > default pkb command for Beam, it fails: (see below for repro recreating > what this does) > python pkb.py --benchmarks=beam_integration_benchmark > --beam_it_args=--tempRoot=gs://[bucket]/staging --beam_sdk=java > To repro: > mvn -e verify -Dit.test=org.apache.beam.examples.WordCountIT -DskipITs=false > -Pdataflow-runner > -DintegrationTestPipelineOptions=["--tempRoot=gs://sisk-test/staging","--runner=TestDataflowRunner"] > This is a very reasonable command line that should work (and pkb expects it to > work). > However, this includes a specific test (-Dit.test=..), which means it will > fail when it encounters the google-cloud-dataflow-java project. > cc [~davor] [~jasonkuster] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2544) AvroIOTest is flaky
[ https://issues.apache.org/jira/browse/BEAM-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2544: -- Assignee: Eugene Kirpichov (was: Davor Bonaci) > AvroIOTest is flaky > --- > > Key: BEAM-2544 > URL: https://issues.apache.org/jira/browse/BEAM-2544 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Alex Filatov >Assignee: Eugene Kirpichov >Priority: Minor > > "Write then read" tests randomly fail. > Steps to reproduce: > cd /runners/direct-java > mvn clean compile > mvn surefire:test@validates-runner-tests -Dtest=AvroIOTest > Repeat last step until a failure (on my machine failure rate is approx 1/3). > Example: > [ERROR] > testAvroIOWriteAndReadSchemaUpgrade(org.apache.beam.sdk.io.AvroIOTest) Time > elapsed: 0.198 s <<< ERROR! > java.lang.RuntimeException: java.io.FileNotFoundException: > /var/folders/1c/sl733g5s1g7_4mq61_qmbjx4gn/T/junit3332447750239941326/output.avro > (No such file or directory) > at > org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:340) > at > org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302) > at > org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:201) > at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:283) > at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:340) > at > org.apache.beam.sdk.io.AvroIOTest.testAvroIOWriteAndReadSchemaUpgrade(AvroIOTest.java:275) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239) > at > org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:321) > at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) > at > org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:321) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runners.Suite.runChild(Suite.java:128) > at org.junit.runners.Suite.runChild(Suite.java:27) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55) > 
at > org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137) > at > org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107) > at > org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83) > at > org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75) > at > org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:157) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBoo
[jira] [Assigned] (BEAM-2506) Consider bundling multiple ValidatesRunner tests into one pipeline
[ https://issues.apache.org/jira/browse/BEAM-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2506: -- Assignee: (was: Davor Bonaci) > Consider bundling multiple ValidatesRunner tests into one pipeline > -- > > Key: BEAM-2506 > URL: https://issues.apache.org/jira/browse/BEAM-2506 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Eugene Kirpichov > > Currently ValidatesRunner test suites run 1 pipeline per unit test. That's a > lot of small pipelines, and consumes a lot of resources, especially in the case of > a pretty heavyweight runner like Dataflow, so tests take a long time and > can't be run in parallel due to quota issues, etc. > [~jasonkuster] says he and [~davor] discussed that we could execute multiple > unit tests in a single TestPipeline. > This JIRA is to track that idea. > To further develop it: in the case of Java, we could create a custom JUnit Runner > http://junit.org/junit4/javadoc/4.12/org/junit/runner/Runner.html that would > apply all the transforms and PAsserts in unit tests to a single instance of > TestPipeline (per class, rather than per method), and run the whole thing at > the end. PAssert captures the source location of its application, so we could > still report which particular test failed. > This obviously provides less isolation between unit test methods, because they > effectively run in parallel instead of in sequence, so things like per-method > setup and teardown will no longer be applicable. There'll probably be other > issues. > Anyway, this seems doable and high-impact. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2493) TestStream.Builder.addElements() should return the same builder
[ https://issues.apache.org/jira/browse/BEAM-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2493: -- Assignee: Thomas Groh (was: Davor Bonaci) > TestStream.Builder.addElements() should return the same builder > --- > > Key: BEAM-2493 > URL: https://issues.apache.org/jira/browse/BEAM-2493 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.0.0 >Reporter: Keith Berkoben >Assignee: Thomas Groh > > When writing tests for pipelines, it is commonly the case that a TestStream > must be built in steps, e.g.: > TestStream.Builder tsb = > TestStream.create().advanceWatermarkTo(new Instant(0)); > if(){ > tsb.addElements(); > } > TestStream stream = tsb.advanceWatermarkToInfinity(); > The above code does not work, however, because addElements() is creating a > NEW builder rather than augmenting the existing one. This is atypical for a > builder pattern and requires the user to do > tsb = tsb.addElements() > which is more verbose and counterintuitive if one is expecting a builder. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
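The pitfall reported in BEAM-2493 can be reproduced without Beam. The sketch below uses a hypothetical ImmutableStreamBuilder whose addElement() returns a new builder, which is the behaviour the ticket describes for TestStream.Builder.addElements(); all class and method names here are illustrative assumptions, not Beam API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of ignoring vs. reassigning an immutable builder's result.
final class BuilderSketch {

    // The return value of addElement() is ignored, so the element is silently lost.
    static int sizeWhenResultIgnored(boolean condition) {
        ImmutableStreamBuilder builder = new ImmutableStreamBuilder();
        if (condition) {
            builder.addElement("a");
        }
        return builder.build().size();
    }

    // Reassigning the variable keeps the element.
    static int sizeWhenResultReassigned(boolean condition) {
        ImmutableStreamBuilder builder = new ImmutableStreamBuilder();
        if (condition) {
            builder = builder.addElement("a");
        }
        return builder.build().size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWhenResultIgnored(true));    // prints 0
        System.out.println(sizeWhenResultReassigned(true)); // prints 1
    }
}

// Stand-in builder: every addElement() call produces a fresh builder instance.
final class ImmutableStreamBuilder {
    private final List<String> elements;

    ImmutableStreamBuilder() {
        this(new ArrayList<String>());
    }

    private ImmutableStreamBuilder(List<String> elements) {
        this.elements = elements;
    }

    // Returns a copy with the element appended; the receiver is left unchanged.
    ImmutableStreamBuilder addElement(String element) {
        List<String> copy = new ArrayList<>(elements);
        copy.add(element);
        return new ImmutableStreamBuilder(copy);
    }

    List<String> build() {
        return Collections.unmodifiableList(elements);
    }
}
```

Both designs are defensible: an immutable builder is safe to share between tests, while a mutating builder that returns itself matches what most users expect from the pattern, which is what the ticket asks for.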
[jira] [Assigned] (BEAM-2467) KinesisIO watermark based on approximateArrivalTimestamp
[ https://issues.apache.org/jira/browse/BEAM-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2467: -- Assignee: Paweł Kaczmarczyk > KinesisIO watermark based on approximateArrivalTimestamp > > > Key: BEAM-2467 > URL: https://issues.apache.org/jira/browse/BEAM-2467 > Project: Beam > Issue Type: Improvement > Components: sdk-java-extensions >Reporter: Paweł Kaczmarczyk >Assignee: Paweł Kaczmarczyk > > In Kinesis we can start reading the stream at some point in the past during > the retention period (up to 7 days). With the current approach for setting a > record's timestamp and watermark (both are always set to current time, i.e. > Instant.now()), we can't observe the actual position in the stream. > So the idea is to change this behaviour and set the record timestamp based on > the > [ApproximateArrivalTimestamp|http://docs.aws.amazon.com/kinesis/latest/APIReference/API_Record.html#Streams-Type-Record-ApproximateArrivalTimestamp]. > The watermark will be set according to the last read record's timestamp. > ApproximateArrivalTimestamp is still an approximation and may result in > having records with out-of-order timestamps, which in turn may result in some > events being marked as late. This however should not be a frequent issue and even > if it happens it should be a matter of milliseconds or seconds, so it can be > handled even with a tiny allowedLateness setting -- This message was sent by Atlassian JIRA (v6.4.14#64029)
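The interaction between arrival-time watermarks and allowedLateness described in BEAM-2467 can be sketched in plain Java. The class below is hypothetical and simplified: it assumes the watermark only moves forward (the maximum arrival timestamp seen so far), which is one possible reading of "the last read record's timestamp", and it treats a record as too late when it falls behind the watermark by more than the allowed lateness. It is not KinesisIO's implementation.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration: watermark derived from record arrival timestamps.
final class ArrivalWatermarkSketch {
    // Assumed starting point; a real reader would start from its initial position.
    private Instant watermark = Instant.EPOCH;

    // Returns true when the record is older than (watermark - allowedLateness),
    // then advances the watermark if the record is newer than anything seen so far.
    boolean observe(Instant approximateArrival, Duration allowedLateness) {
        boolean tooLate = approximateArrival.isBefore(watermark.minus(allowedLateness));
        if (approximateArrival.isAfter(watermark)) {
            watermark = approximateArrival;
        }
        return tooLate;
    }

    Instant watermark() {
        return watermark;
    }

    public static void main(String[] args) {
        ArrivalWatermarkSketch sketch = new ArrivalWatermarkSketch();
        Instant base = Instant.parse("2017-07-20T09:45:10Z");
        Duration lateness = Duration.ofSeconds(1);

        sketch.observe(base, lateness);
        sketch.observe(base.plusMillis(500), lateness);
        // Arrives 300 ms behind the watermark, well within one second of lateness.
        System.out.println(sketch.observe(base.plusMillis(200), lateness)); // prints false
    }
}
```

With an allowed lateness of 100 ms instead of one second, the same out-of-order record would be reported as too late, which is why the ticket expects a small but non-zero allowedLateness to be sufficient.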
[jira] [Assigned] (BEAM-2468) Reading Kinesis records in the background
[ https://issues.apache.org/jira/browse/BEAM-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2468: -- Assignee: Paweł Kaczmarczyk > Reading Kinesis records in the background > - > > Key: BEAM-2468 > URL: https://issues.apache.org/jira/browse/BEAM-2468 > Project: Beam > Issue Type: Improvement > Components: sdk-java-extensions >Reporter: Paweł Kaczmarczyk >Assignee: Paweł Kaczmarczyk > > Currently Kinesis records are read on demand in a runner's thread. We may > instead read the records in the background with separate threads and store them > in a buffer, which will result in a major performance improvement. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
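The background-read idea in BEAM-2468 is essentially a producer/consumer arrangement. The sketch below is plain Java with a bounded BlockingQueue as the buffer; the "shard" is just an in-memory list, and the names (BackgroundReadSketch, readAll) are hypothetical. It shows the shape of the technique only and makes no claim about how KinesisIO implements it or about the performance gain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration: a background thread fills a bounded buffer
// while the calling ("runner") thread drains it.
final class BackgroundReadSketch {

    static List<String> readAll(List<String> shardRecords, int bufferSize) {
        // Optional.empty() marks the end of the simulated shard.
        BlockingQueue<Optional<String>> buffer = new ArrayBlockingQueue<>(bufferSize);

        Thread fetcher = new Thread(() -> {
            try {
                for (String record : shardRecords) {
                    buffer.put(Optional.of(record)); // blocks while the buffer is full
                }
                buffer.put(Optional.empty());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.setDaemon(true);
        fetcher.start();

        List<String> consumed = new ArrayList<>();
        try {
            while (true) {
                Optional<String> next = buffer.take(); // blocks until a record is buffered
                if (!next.isPresent()) {
                    break;
                }
                consumed.add(next.get());
            }
            fetcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading", e);
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(readAll(Arrays.asList("r1", "r2", "r3"), 2));
    }
}
```

The bounded queue gives back-pressure: if the consumer falls behind, the fetcher blocks instead of buffering without limit.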
[jira] [Assigned] (BEAM-2469) Handling Kinesis shards splits and merges
[ https://issues.apache.org/jira/browse/BEAM-2469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davor Bonaci reassigned BEAM-2469: -- Assignee: Paweł Kaczmarczyk > Handling Kinesis shards splits and merges > - > > Key: BEAM-2469 > URL: https://issues.apache.org/jira/browse/BEAM-2469 > Project: Beam > Issue Type: Improvement > Components: sdk-java-extensions >Reporter: Paweł Kaczmarczyk >Assignee: Paweł Kaczmarczyk > > A Kinesis stream consists of > [shards|http://docs.aws.amazon.com/streams/latest/dev/key-concepts.html#shard] > that allow for capacity scaling. In order to increase/decrease the capacity, > shards have to be split/merged. Such operations are currently not > handled properly and will end with errors. -- This message was sent by Atlassian JIRA (v6.4.14#64029)