[jira] [Assigned] (BEAM-5061) Invisible parameter type exception in JDK 10

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5061:
-

Assignee: (was: Kenneth Knowles)

> Invisible parameter type exception in JDK 10
> 
>
> Key: BEAM-5061
> URL: https://issues.apache.org/jira/browse/BEAM-5061
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Affects Versions: 2.5.0
>Reporter: Mike Pedersen
>Priority: Major
>
> When using JDK 10, using a ParDo after a CoGroupByKey seems to create the 
> following exception when executed on local runner:
> {noformat}
> Exception in thread "main" 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalStateException: Invisible parameter type of Main$1 arg0 for 
> public Main$1$DoFnInvoker(Main$1)
> at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2214)
> at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.cache.LocalCache.get(LocalCache.java:4053)
> at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4057)
> ...
> Caused by: java.lang.IllegalStateException: Invisible parameter type of 
> Main$1 arg0 for public Main$1$DoFnInvoker(Main$1)
> at 
> org.apache.beam.repackaged.beam_sdks_java_core.net.bytebuddy.dynamic.scaffold.InstrumentedType$Default.validated(InstrumentedType.java:925)
> at 
> org.apache.beam.repackaged.beam_sdks_java_core.net.bytebuddy.dynamic.scaffold.MethodRegistry$Default.prepare(MethodRegistry.java:465)
> at 
> org.apache.beam.repackaged.beam_sdks_java_core.net.bytebuddy.dynamic.scaffold.subclass.SubclassDynamicTypeBuilder.make(SubclassDynamicTypeBuilder.java:170)
> ...
> {noformat}
> This error disappears completely when using JDK 8. Here is a minimal example 
> to reproduce it:
> {code:java}
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.options.PipelineOptions;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> import org.apache.beam.sdk.transforms.Create;
> import org.apache.beam.sdk.transforms.DoFn;
> import org.apache.beam.sdk.transforms.ParDo;
> import org.apache.beam.sdk.transforms.join.CoGbkResult;
> import org.apache.beam.sdk.transforms.join.CoGroupByKey;
> import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
> import org.apache.beam.sdk.values.KV;
> import org.apache.beam.sdk.values.PCollection;
> import org.apache.beam.sdk.values.TupleTag;
> import java.util.Arrays;
> import java.util.List;
> public class Main {
>     public static void main(String[] args) {
>     PipelineOptions options = PipelineOptionsFactory.create();
>     Pipeline p = Pipeline.create(options);
>     final TupleTag emailsTag = new TupleTag<>();
>     final TupleTag phonesTag = new TupleTag<>();
>     final List> emailsList =
>     Arrays.asList(
>     KV.of("amy", "a...@example.com"),
>     KV.of("carl", "c...@example.com"),
>     KV.of("julia", "ju...@example.com"),
>     KV.of("carl", "c...@email.com"));
>     final List> phonesList =
>     Arrays.asList(
>     KV.of("amy", "111-222-"),
>     KV.of("james", "222-333-"),
>     KV.of("amy", "333-444-"),
>     KV.of("carl", "444-555-"));
>     PCollection> emails = p.apply("CreateEmails", 
> Create.of(emailsList));
>     PCollection> phones = p.apply("CreatePhones", 
> Create.of(phonesList));
>     PCollection> results =
>     KeyedPCollectionTuple.of(emailsTag, emails)
>     .and(phonesTag, phones)
>     .apply(CoGroupByKey.create());
>     PCollection contactLines =
>     results.apply(
>     ParDo.of(
>     new DoFn, String>() {
>     @ProcessElement
>     public void processElement(ProcessContext 
> c) {
>     KV e = 
> c.element();
>     String name = e.getKey();
>     Iterable emailsIter = 
> e.getValue().getAll(emailsTag);
>     Iterable phonesIter = 
> e.getValue().getAll(phonesTag);
>     String formattedResult = "";
>     c.output(formattedResult);
>     }
>     }));
>     

[jira] [Assigned] (BEAM-5635) FR: Enable Transactional writes with DatastoreIO

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5635:
-

Assignee: Chamikara Jayalath  (was: Kenneth Knowles)

> FR: Enable Transactional writes with DatastoreIO
> 
>
> Key: BEAM-5635
> URL: https://issues.apache.org/jira/browse/BEAM-5635
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Alex Amato
>Assignee: Chamikara Jayalath
>Priority: Major
>
> I have seen a user who would like to use Datastore Transactions to rollback a 
> set of records if one of them fails to write. Let's consider this use case 
> for DatastoreIO



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5635) FR: Enable Transactional writes with DatastoreIO

2018-10-30 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669545#comment-16669545
 ] 

Kenneth Knowles commented on BEAM-5635:
---

Thanks for the report. Sending to component owner for prioritization and triage.

> FR: Enable Transactional writes with DatastoreIO
> 
>
> Key: BEAM-5635
> URL: https://issues.apache.org/jira/browse/BEAM-5635
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Alex Amato
>Assignee: Chamikara Jayalath
>Priority: Major
>
> I have seen a user who would like to use Datastore Transactions to rollback a 
> set of records if one of them fails to write. Let's consider this use case 
> for DatastoreIO



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5664) A canceled pipeline should not return a done status in the jobserver.

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5664:
--
Component/s: (was: runner-core)
 runner-flink

> A canceled pipeline should not return a done status in the jobserver.
> -
>
> Key: BEAM-5664
> URL: https://issues.apache.org/jira/browse/BEAM-5664
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Robert Bradshaw
>Assignee: Kenneth Knowles
>Priority: Major
>  Labels: portability-flink
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5664) A canceled pipeline should not return a done status in the jobserver.

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5664:
-

Assignee: Maximilian Michels  (was: Kenneth Knowles)

> A canceled pipeline should not return a done status in the jobserver.
> -
>
> Key: BEAM-5664
> URL: https://issues.apache.org/jira/browse/BEAM-5664
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Robert Bradshaw
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability-flink
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5434) Issue with BigQueryIO in Template

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5434:
-

Assignee: Chamikara Jayalath  (was: Kenneth Knowles)

> Issue with BigQueryIO in Template
> -
>
> Key: BEAM-5434
> URL: https://issues.apache.org/jira/browse/BEAM-5434
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.5.0
>Reporter: Amarendra Kumar
>Assignee: Chamikara Jayalath
>Priority: Blocker
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I am trying to build a google Dataflow template to be run from a cloud 
> function.
> The issue is with BigQueryIO trying execute a SQL.
> The opening step for my Dataflow Template is
> {code:java}
> BigQueryIO.readTableRows().withQueryLocation("US").withoutValidation().fromQuery(options.getSql()).usingStandardSql()
> {code}
> When the template is triggered for the first time its running fine.
> But when its triggered for the second time, it fails with the following error.
> {code}
> // Some comments here
> java.io.FileNotFoundException: No files matched spec: 
> gs://test-notification/temp/Notification/BigQueryExtractTemp/34d42a122600416c9ea748a6e325f87a/.avro
>   at 
> org.apache.beam.sdk.io.FileSystems.maybeAdjustEmptyMatchResult(FileSystems.java:172)
>   at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:158)
>   at 
> org.apache.beam.sdk.io.FileBasedSource.createReader(FileBasedSource.java:329)
>   at 
> com.google.cloud.dataflow.worker.WorkerCustomSources$1.iterator(WorkerCustomSources.java:360)
>   at 
> com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:177)
>   at 
> com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
>   at 
> com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
>   at 
> com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:391)
>   at 
> com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:360)
>   at 
> com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:288)
>   at 
> com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
>   at 
> com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
>   at 
> com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> In the second run, why is the process expecting a file in the GCS location?
> This file does get created while the job is running at the first run, but it 
> also gets deleted after the job is complete. 
> How are the two jobs related?
>  Could you please let me know if I am missing something or this is a bug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5434) Issue with BigQueryIO in Template

2018-10-30 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669558#comment-16669558
 ] 

Kenneth Knowles commented on BEAM-5434:
---

Sorry for the delay. I have assigned this to the component owner for 
BigQueryIO. The BigQueryIO transform first issues a BigQuery export job to Avro 
and then reads the resulting files. So that is the file that cannot be found.

> Issue with BigQueryIO in Template
> -
>
> Key: BEAM-5434
> URL: https://issues.apache.org/jira/browse/BEAM-5434
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.5.0
>Reporter: Amarendra Kumar
>Assignee: Chamikara Jayalath
>Priority: Blocker
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I am trying to build a google Dataflow template to be run from a cloud 
> function.
> The issue is with BigQueryIO trying execute a SQL.
> The opening step for my Dataflow Template is
> {code:java}
> BigQueryIO.readTableRows().withQueryLocation("US").withoutValidation().fromQuery(options.getSql()).usingStandardSql()
> {code}
> When the template is triggered for the first time its running fine.
> But when its triggered for the second time, it fails with the following error.
> {code}
> // Some comments here
> java.io.FileNotFoundException: No files matched spec: 
> gs://test-notification/temp/Notification/BigQueryExtractTemp/34d42a122600416c9ea748a6e325f87a/.avro
>   at 
> org.apache.beam.sdk.io.FileSystems.maybeAdjustEmptyMatchResult(FileSystems.java:172)
>   at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:158)
>   at 
> org.apache.beam.sdk.io.FileBasedSource.createReader(FileBasedSource.java:329)
>   at 
> com.google.cloud.dataflow.worker.WorkerCustomSources$1.iterator(WorkerCustomSources.java:360)
>   at 
> com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:177)
>   at 
> com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:158)
>   at 
> com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:75)
>   at 
> com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:391)
>   at 
> com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:360)
>   at 
> com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:288)
>   at 
> com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
>   at 
> com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
>   at 
> com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> In the second run, why is the process expecting a file in the GCS location?
> This file does get created while the job is running at the first run, but it 
> also gets deleted after the job is complete. 
> How are the two jobs related?
>  Could you please let me know if I am missing something or this is a bug?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5823) Vendor commons

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5823:
-

Assignee: (was: Kenneth Knowles)

> Vendor commons
> --
>
> Key: BEAM-5823
> URL: https://issues.apache.org/jira/browse/BEAM-5823
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5825) Vendor kryo

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5825:
-

Assignee: (was: Kenneth Knowles)

> Vendor kryo
> ---
>
> Key: BEAM-5825
> URL: https://issues.apache.org/jira/browse/BEAM-5825
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5810) Move Jenkins notifications to bui...@beam.apache.org

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-5810.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Move Jenkins notifications to bui...@beam.apache.org
> 
>
> Key: BEAM-5810
> URL: https://issues.apache.org/jira/browse/BEAM-5810
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> On list there is consensus to move them off commits@ and dev@ to a new list, 
> builds@.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5824) Vendor protobuf-java

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5824:
-

Assignee: (was: Kenneth Knowles)

> Vendor protobuf-java
> 
>
> Key: BEAM-5824
> URL: https://issues.apache.org/jira/browse/BEAM-5824
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3608) Vendor Guava

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-3608:
-

Assignee: Kenneth Knowles

> Vendor Guava
> 
>
> Key: BEAM-3608
> URL: https://issues.apache.org/jira/browse/BEAM-3608
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Instead of shading as part of our build, we can shade before build so that it 
> is apparent when reading code, and in IDEs, that a particular class resides 
> in a hidden namespace.
> {{import com.google.common.reflect.TypeToken}}
> becomes something like
> {{import org.apache.beam.private.guava21.com.google.common.reflect.TypeToken}}
> So we can very trivially ban `org.apache.beam.private` from public APIs 
> unless they are annotated {{@Internal}}, and it makes sharing between our own 
> modules never get broken by shading again.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5673) View.asMap on non-KV PCollection fails at runtime, not construction/submission time

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5673:
-

Assignee: (was: Kenneth Knowles)

> View.asMap on non-KV PCollection fails at runtime, not 
> construction/submission time
> ---
>
> Key: BEAM-5673
> URL: https://issues.apache.org/jira/browse/BEAM-5673
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0, 2.7.0
>Reporter: Przemyslaw Pastuszka
>Priority: Major
>
> I'm trying to write a ParDo, which will use both Timer and Side Input, but it 
> crashes when I try to run it with {{beam-runners-direct-java}} with 
> {{IllegalArgumentException}} on a line 
> [https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/QuiescenceDriver.java#L167],
>  because there are actually two inputs to ParDo (main PCollection and side 
> input), while only one is expected. It looks like a bug in an implementation.
>  
> Here's the code that reproduces the issue:
> {code:java}
> public class TestCrashesForTimerAndSideInput {
> @Rule
> public final transient TestPipeline p = TestPipeline.create();
> private static class DoFnWithTimer extends DoFn, 
> String> {
> @TimerId("t")
> private final TimerSpec tSpec = 
> TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
> private final PCollectionView> sideInput;
> private DoFnWithTimer(PCollectionView> sideInput) 
> {
> this.sideInput = sideInput;
> }
> @ProcessElement
> public void processElement(ProcessContext c, @TimerId("t") Timer t) {
> KV element = c.element();
> c.output(element.getKey() + c.sideInput(sideInput).get(element));
> t.offset(Duration.standardSeconds(1)).setRelative();
> }
> @OnTimer("t")
> public void onTimerFire(OnTimerContext x) {
> x.output("Timer fired");
> }
> }
> @Test
> public void testCrashesForTimerAndSideInput() {
> ImmutableMap sideData = ImmutableMap. String>builder().
> put("x", "X").
> put("y", "Y").
> build();
> PCollectionView> sideInput =
> p.apply(Create.of(sideData)).apply(View.asMap());
> TestStream testStream = 
> TestStream.create(StringUtf8Coder.of()).
> addElements("x").
> advanceProcessingTime(Duration.standardSeconds(1)).
> addElements("y").
> advanceProcessingTime(Duration.standardSeconds(1)).
> advanceWatermarkToInfinity();
> PCollection result = p.
> apply(testStream).
> apply(MapElements.into(kvs(strings(), strings())).via(v -> 
> KV.of(v, v))).
> apply(ParDo.of(new 
> DoFnWithTimer(sideInput)).withSideInputs(sideInput));
> PAssert.that(result).containsInAnyOrder("xX", "yY", "Timer fired");
> p.run();
> }
> }
> {code}
>  
> and the error is:
> {code}
> java.lang.IllegalArgumentException: expected one element but was: 
>  KeyedWorkItem/ParMultiDo(ToKeyedWorkItem).output [PCollection], 
> View.AsMap/View.VoidKeyToMultimapMaterialization/ParDo(VoidKeyToMultimapMaterialization)/ParMultiDo(VoidKeyToMultimapMaterialization).output
>  [PCollection]>
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:322)
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:294)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.fireTimers(QuiescenceDriver.java:167)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.drive(QuiescenceDriver.java:110)
>   at 
> org.apache.beam.runners.direct.ExecutorServiceParallelExecutor$2.run(ExecutorServiceParallelExecutor.java:170)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5673) Direct java runner crashes when using both timers and side input

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5673:
--
Component/s: (was: beam-model)
 sdk-java-core

> Direct java runner crashes when using both timers and side input
> 
>
> Key: BEAM-5673
> URL: https://issues.apache.org/jira/browse/BEAM-5673
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0, 2.7.0
>Reporter: Przemyslaw Pastuszka
>Assignee: Kenneth Knowles
>Priority: Major
>
> I'm trying to write a ParDo, which will use both Timer and Side Input, but it 
> crashes when I try to run it with {{beam-runners-direct-java}} with 
> {{IllegalArgumentException}} on a line 
> [https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/QuiescenceDriver.java#L167],
>  because there are actually two inputs to ParDo (main PCollection and side 
> input), while only one is expected. It looks like a bug in an implementation.
>  
> Here's the code that reproduces the issue:
> {code:java}
> public class TestCrashesForTimerAndSideInput {
> @Rule
> public final transient TestPipeline p = TestPipeline.create();
> private static class DoFnWithTimer extends DoFn, 
> String> {
> @TimerId("t")
> private final TimerSpec tSpec = 
> TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
> private final PCollectionView> sideInput;
> private DoFnWithTimer(PCollectionView> sideInput) 
> {
> this.sideInput = sideInput;
> }
> @ProcessElement
> public void processElement(ProcessContext c, @TimerId("t") Timer t) {
> KV element = c.element();
> c.output(element.getKey() + c.sideInput(sideInput).get(element));
> t.offset(Duration.standardSeconds(1)).setRelative();
> }
> @OnTimer("t")
> public void onTimerFire(OnTimerContext x) {
> x.output("Timer fired");
> }
> }
> @Test
> public void testCrashesForTimerAndSideInput() {
> ImmutableMap sideData = ImmutableMap. String>builder().
> put("x", "X").
> put("y", "Y").
> build();
> PCollectionView> sideInput =
> p.apply(Create.of(sideData)).apply(View.asMap());
> TestStream testStream = 
> TestStream.create(StringUtf8Coder.of()).
> addElements("x").
> advanceProcessingTime(Duration.standardSeconds(1)).
> addElements("y").
> advanceProcessingTime(Duration.standardSeconds(1)).
> advanceWatermarkToInfinity();
> PCollection result = p.
> apply(testStream).
> apply(MapElements.into(kvs(strings(), strings())).via(v -> 
> KV.of(v, v))).
> apply(ParDo.of(new 
> DoFnWithTimer(sideInput)).withSideInputs(sideInput));
> PAssert.that(result).containsInAnyOrder("xX", "yY", "Timer fired");
> p.run();
> }
> }
> {code}
>  
> and the error is:
> {code}
> java.lang.IllegalArgumentException: expected one element but was: 
>  KeyedWorkItem/ParMultiDo(ToKeyedWorkItem).output [PCollection], 
> View.AsMap/View.VoidKeyToMultimapMaterialization/ParDo(VoidKeyToMultimapMaterialization)/ParMultiDo(VoidKeyToMultimapMaterialization).output
>  [PCollection]>
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:322)
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:294)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.fireTimers(QuiescenceDriver.java:167)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.drive(QuiescenceDriver.java:110)
>   at 
> org.apache.beam.runners.direct.ExecutorServiceParallelExecutor$2.run(ExecutorServiceParallelExecutor.java:170)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5773) Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for the Java Runtime Environment to continue."

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5773:
--
Component/s: (was: build-system)
 test-failures

> Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for 
> the Java Runtime Environment to continue."
> --
>
> Key: BEAM-5773
> URL: https://issues.apache.org/jira/browse/BEAM-5773
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>
> Jenkins failed on the Python Dataflow ValidatesRunner postcommit because it 
> Gradle allocate a thread.
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1402/console]
> Likely transient, but filing this to track if that is the case.
>  {code}
> 15:07:52 [src] $ 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/gradlew
>  --info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
> -Dorg.gradle.jvmargs=-Xmx4g :beam-sdks-python:validatesRunnerBatchTests 
> :beam-sdks-python:validatesRunnerStreamingTests
> 15:07:52 #
> 15:07:52 # There is insufficient memory for the Java Runtime Environment to 
> continue.
> 15:07:52 # Cannot create GC thread. Out of system resources.
> 15:07:52 # An error report file with more information is saved as:
> 15:07:52 # 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/hs_err_pid31336.log
> 15:07:53 Build step 'Invoke Gradle script' changed build result to FAILURE
> 15:07:53 Build step 'Invoke Gradle script' marked build as failure
> 15:07:56 Sending e-mails to: comm...@beam.apache.org
> 15:07:57 No emails were triggered.
> 15:07:57 Finished: FAILURE
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5061) Invisible parameter type exception in JDK 10

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5061:
--
Component/s: (was: beam-model)
 sdk-java-core

> Invisible parameter type exception in JDK 10
> 
>
> Key: BEAM-5061
> URL: https://issues.apache.org/jira/browse/BEAM-5061
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Affects Versions: 2.5.0
>Reporter: Mike Pedersen
>Assignee: Kenneth Knowles
>Priority: Major
>
> When using JDK 10, using a ParDo after a CoGroupByKey seems to create the 
> following exception when executed on local runner:
> {noformat}
> Exception in thread "main" 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.lang.IllegalStateException: Invisible parameter type of Main$1 arg0 for 
> public Main$1$DoFnInvoker(Main$1)
> at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2214)
> at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.cache.LocalCache.get(LocalCache.java:4053)
> at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4057)
> ...
> Caused by: java.lang.IllegalStateException: Invisible parameter type of 
> Main$1 arg0 for public Main$1$DoFnInvoker(Main$1)
> at 
> org.apache.beam.repackaged.beam_sdks_java_core.net.bytebuddy.dynamic.scaffold.InstrumentedType$Default.validated(InstrumentedType.java:925)
> at 
> org.apache.beam.repackaged.beam_sdks_java_core.net.bytebuddy.dynamic.scaffold.MethodRegistry$Default.prepare(MethodRegistry.java:465)
> at 
> org.apache.beam.repackaged.beam_sdks_java_core.net.bytebuddy.dynamic.scaffold.subclass.SubclassDynamicTypeBuilder.make(SubclassDynamicTypeBuilder.java:170)
> ...
> {noformat}
> This error disappears completely when using JDK 8. Here is a minimal example 
> to reproduce it:
> {code:java}
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.options.PipelineOptions;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> import org.apache.beam.sdk.transforms.Create;
> import org.apache.beam.sdk.transforms.DoFn;
> import org.apache.beam.sdk.transforms.ParDo;
> import org.apache.beam.sdk.transforms.join.CoGbkResult;
> import org.apache.beam.sdk.transforms.join.CoGroupByKey;
> import org.apache.beam.sdk.transforms.join.KeyedPCollectionTuple;
> import org.apache.beam.sdk.values.KV;
> import org.apache.beam.sdk.values.PCollection;
> import org.apache.beam.sdk.values.TupleTag;
> import java.util.Arrays;
> import java.util.List;
> public class Main {
>     public static void main(String[] args) {
>     PipelineOptions options = PipelineOptionsFactory.create();
>     Pipeline p = Pipeline.create(options);
>     final TupleTag emailsTag = new TupleTag<>();
>     final TupleTag phonesTag = new TupleTag<>();
>     final List> emailsList =
>     Arrays.asList(
>     KV.of("amy", "a...@example.com"),
>     KV.of("carl", "c...@example.com"),
>     KV.of("julia", "ju...@example.com"),
>     KV.of("carl", "c...@email.com"));
>     final List> phonesList =
>     Arrays.asList(
>     KV.of("amy", "111-222-"),
>     KV.of("james", "222-333-"),
>     KV.of("amy", "333-444-"),
>     KV.of("carl", "444-555-"));
>     PCollection> emails = p.apply("CreateEmails", 
> Create.of(emailsList));
>     PCollection> phones = p.apply("CreatePhones", 
> Create.of(phonesList));
>     PCollection> results =
>     KeyedPCollectionTuple.of(emailsTag, emails)
>     .and(phonesTag, phones)
>     .apply(CoGroupByKey.create());
>     PCollection contactLines =
>     results.apply(
>     ParDo.of(
>     new DoFn, String>() {
>     @ProcessElement
>     public void processElement(ProcessContext 
> c) {
>     KV e = 
> c.element();
>     String name = e.getKey();
>     Iterable emailsIter = 
> e.getValue().getAll(emailsTag);
>     Iterable phonesIter = 
> e.getValue().getAll(phonesTag);
>     String formattedResult = "";
>     c.output(formattedResult);
>     }
>  

[jira] [Resolved] (BEAM-5773) Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for the Java Runtime Environment to continue."

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-5773.
---
   Resolution: Cannot Reproduce
Fix Version/s: Not applicable

> Failure in beam_PostCommit_Py_VR_Dataflow "There is insufficient memory for 
> the Java Runtime Environment to continue."
> --
>
> Key: BEAM-5773
> URL: https://issues.apache.org/jira/browse/BEAM-5773
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>
> Jenkins failed on the Python Dataflow ValidatesRunner postcommit because it 
> Gradle allocate a thread.
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/1402/console]
> Likely transient, but filing this to track if that is the case.
>  {code}
> 15:07:52 [src] $ 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/gradlew
>  --info --continue --max-workers=12 -Dorg.gradle.jvmargs=-Xms2g 
> -Dorg.gradle.jvmargs=-Xmx4g :beam-sdks-python:validatesRunnerBatchTests 
> :beam-sdks-python:validatesRunnerStreamingTests
> 15:07:52 #
> 15:07:52 # There is insufficient memory for the Java Runtime Environment to 
> continue.
> 15:07:52 # Cannot create GC thread. Out of system resources.
> 15:07:52 # An error report file with more information is saved as:
> 15:07:52 # 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_VR_Dataflow/src/hs_err_pid31336.log
> 15:07:53 Build step 'Invoke Gradle script' changed build result to FAILURE
> 15:07:53 Build step 'Invoke Gradle script' marked build as failure
> 15:07:56 Sending e-mails to: comm...@beam.apache.org
> 15:07:57 No emails were triggered.
> 15:07:57 Finished: FAILURE
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5501) Interactive Beam display issue

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5501:
-

Assignee: Robert Bradshaw  (was: Kenneth Knowles)

> Interactive Beam display issue
> --
>
> Key: BEAM-5501
> URL: https://issues.apache.org/jira/browse/BEAM-5501
> Project: Beam
>  Issue Type: Bug
>  Components: runner-ideas
>Reporter: Qinye Li
>Assignee: Robert Bradshaw
>Priority: Trivial
> Attachments: 45650665-e0374200-ba83-11e8-9425-a2b5de9aa455.png
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The number of PTransform executed is wrongly displayed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5605) Support Portable SplittableDoFn for batch

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5605:
-

Assignee: Scott Wegner  (was: Kenneth Knowles)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: portability
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5605) Support Portable SplittableDoFn for batch

2018-10-30 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669546#comment-16669546
 ] 

Kenneth Knowles commented on BEAM-5605:
---

If you are tracking, OK to assign to you?

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Major
>  Labels: portability
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5673) Direct java runner crashes when using both timers and side input

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5673:
-

Assignee: Kenneth Knowles  (was: Daniel Oliveira)

> Direct java runner crashes when using both timers and side input
> 
>
> Key: BEAM-5673
> URL: https://issues.apache.org/jira/browse/BEAM-5673
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0, 2.7.0
>Reporter: Przemyslaw Pastuszka
>Assignee: Kenneth Knowles
>Priority: Major
>
> I'm trying to write a ParDo, which will use both Timer and Side Input, but it 
> crashes when I try to run it with {{beam-runners-direct-java}} with 
> {{IllegalArgumentException}} on a line 
> [https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/QuiescenceDriver.java#L167],
>  because there are actually two inputs to ParDo (main PCollection and side 
> input), while only one is expected. It looks like a bug in an implementation.
>  
> Here's the code that reproduces the issue:
> {code:java}
> public class TestCrashesForTimerAndSideInput {
> @Rule
> public final transient TestPipeline p = TestPipeline.create();
> private static class DoFnWithTimer extends DoFn, 
> String> {
> @TimerId("t")
> private final TimerSpec tSpec = 
> TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
> private final PCollectionView> sideInput;
> private DoFnWithTimer(PCollectionView> sideInput) 
> {
> this.sideInput = sideInput;
> }
> @ProcessElement
> public void processElement(ProcessContext c, @TimerId("t") Timer t) {
> KV element = c.element();
> c.output(element.getKey() + c.sideInput(sideInput).get(element));
> t.offset(Duration.standardSeconds(1)).setRelative();
> }
> @OnTimer("t")
> public void onTimerFire(OnTimerContext x) {
> x.output("Timer fired");
> }
> }
> @Test
> public void testCrashesForTimerAndSideInput() {
> ImmutableMap sideData = ImmutableMap. String>builder().
> put("x", "X").
> put("y", "Y").
> build();
> PCollectionView> sideInput =
> p.apply(Create.of(sideData)).apply(View.asMap());
> TestStream testStream = 
> TestStream.create(StringUtf8Coder.of()).
> addElements("x").
> advanceProcessingTime(Duration.standardSeconds(1)).
> addElements("y").
> advanceProcessingTime(Duration.standardSeconds(1)).
> advanceWatermarkToInfinity();
> PCollection result = p.
> apply(testStream).
> apply(MapElements.into(kvs(strings(), strings())).via(v -> 
> KV.of(v, v))).
> apply(ParDo.of(new 
> DoFnWithTimer(sideInput)).withSideInputs(sideInput));
> PAssert.that(result).containsInAnyOrder("xX", "yY", "Timer fired");
> p.run();
> }
> }
> {code}
>  
> and the error is:
> {code}
> java.lang.IllegalArgumentException: expected one element but was: 
>  KeyedWorkItem/ParMultiDo(ToKeyedWorkItem).output [PCollection], 
> View.AsMap/View.VoidKeyToMultimapMaterialization/ParDo(VoidKeyToMultimapMaterialization)/ParMultiDo(VoidKeyToMultimapMaterialization).output
>  [PCollection]>
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:322)
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:294)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.fireTimers(QuiescenceDriver.java:167)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.drive(QuiescenceDriver.java:110)
>   at 
> org.apache.beam.runners.direct.ExecutorServiceParallelExecutor$2.run(ExecutorServiceParallelExecutor.java:170)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5673) View.asMap on non-KV PCollection fails at runtime, not construction/submission time

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5673:
--
Summary: View.asMap on non-KV PCollection fails at runtime, not 
construction/submission time  (was: Direct java runner crashes when using both 
timers and side input)

> View.asMap on non-KV PCollection fails at runtime, not 
> construction/submission time
> ---
>
> Key: BEAM-5673
> URL: https://issues.apache.org/jira/browse/BEAM-5673
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0, 2.7.0
>Reporter: Przemyslaw Pastuszka
>Assignee: Kenneth Knowles
>Priority: Major
>
> I'm trying to write a ParDo, which will use both Timer and Side Input, but it 
> crashes when I try to run it with {{beam-runners-direct-java}} with 
> {{IllegalArgumentException}} on a line 
> [https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/QuiescenceDriver.java#L167],
>  because there are actually two inputs to ParDo (main PCollection and side 
> input), while only one is expected. It looks like a bug in an implementation.
>  
> Here's the code that reproduces the issue:
> {code:java}
> public class TestCrashesForTimerAndSideInput {
> @Rule
> public final transient TestPipeline p = TestPipeline.create();
> private static class DoFnWithTimer extends DoFn, 
> String> {
> @TimerId("t")
> private final TimerSpec tSpec = 
> TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
> private final PCollectionView> sideInput;
> private DoFnWithTimer(PCollectionView> sideInput) 
> {
> this.sideInput = sideInput;
> }
> @ProcessElement
> public void processElement(ProcessContext c, @TimerId("t") Timer t) {
> KV element = c.element();
> c.output(element.getKey() + c.sideInput(sideInput).get(element));
> t.offset(Duration.standardSeconds(1)).setRelative();
> }
> @OnTimer("t")
> public void onTimerFire(OnTimerContext x) {
> x.output("Timer fired");
> }
> }
> @Test
> public void testCrashesForTimerAndSideInput() {
> ImmutableMap sideData = ImmutableMap. String>builder().
> put("x", "X").
> put("y", "Y").
> build();
> PCollectionView> sideInput =
> p.apply(Create.of(sideData)).apply(View.asMap());
> TestStream testStream = 
> TestStream.create(StringUtf8Coder.of()).
> addElements("x").
> advanceProcessingTime(Duration.standardSeconds(1)).
> addElements("y").
> advanceProcessingTime(Duration.standardSeconds(1)).
> advanceWatermarkToInfinity();
> PCollection result = p.
> apply(testStream).
> apply(MapElements.into(kvs(strings(), strings())).via(v -> 
> KV.of(v, v))).
> apply(ParDo.of(new 
> DoFnWithTimer(sideInput)).withSideInputs(sideInput));
> PAssert.that(result).containsInAnyOrder("xX", "yY", "Timer fired");
> p.run();
> }
> }
> {code}
>  
> and the error is:
> {code}
> java.lang.IllegalArgumentException: expected one element but was: 
>  KeyedWorkItem/ParMultiDo(ToKeyedWorkItem).output [PCollection], 
> View.AsMap/View.VoidKeyToMultimapMaterialization/ParDo(VoidKeyToMultimapMaterialization)/ParMultiDo(VoidKeyToMultimapMaterialization).output
>  [PCollection]>
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:322)
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:294)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.fireTimers(QuiescenceDriver.java:167)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.drive(QuiescenceDriver.java:110)
>   at 
> org.apache.beam.runners.direct.ExecutorServiceParallelExecutor$2.run(ExecutorServiceParallelExecutor.java:170)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5845) Precommit time (1h15m) includes 20m of Dataflow integration tests

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-5845.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Precommit time (1h15m) includes 20m of Dataflow integration tests
> -
>
> Key: BEAM-5845
> URL: https://issues.apache.org/jira/browse/BEAM-5845
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5635) FR: Enable Transactional writes with DatastoreIO

2018-10-30 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5635:
--
Component/s: (was: sdk-java-core)
 io-java-gcp

> FR: Enable Transactional writes with DatastoreIO
> 
>
> Key: BEAM-5635
> URL: https://issues.apache.org/jira/browse/BEAM-5635
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Alex Amato
>Assignee: Kenneth Knowles
>Priority: Major
>
> I have seen a user who would like to use Datastore Transactions to rollback a 
> set of records if one of them fails to write. Let's consider this use case 
> for DatastoreIO



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5673) Direct java runner crashes when using both timers and side input

2018-10-30 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669543#comment-16669543
 ] 

Kenneth Knowles commented on BEAM-5673:
---

Thanks for the extremely detailed reproduction case! This is stellar. Because 
you have done so much work, I can immediately identify a problem -

{code:java}
ImmutableMap sideData = ImmutableMap.builder().
put("x", "X").
put("y", "Y").
build();

PCollectionView> sideInput =
  p.apply(Create.of(sideData)).apply(View.asMap());
{code}

This should have failed at construction time. It is a bug that our validation 
did not give a good error message. The expression {{Create.of(sideData)}} has 
type {{PCollection>}}. But to "view a PCollection as a map" 
it should be a PCollection of KV pairs, so the type should be 
{{PCollection>}}. You can either use {{View.asSingleton()}} 
to grab just the one map you put in the collection, or you can change how you 
initialize and do

{code:java}

PCollectionView> sideInput =
  p.apply(Create.of(KV.of("x", "X"), KV.of("y", "Y"))).apply(View.asMap());
{code}



> Direct java runner crashes when using both timers and side input
> 
>
> Key: BEAM-5673
> URL: https://issues.apache.org/jira/browse/BEAM-5673
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.6.0, 2.7.0
>Reporter: Przemyslaw Pastuszka
>Assignee: Kenneth Knowles
>Priority: Major
>
> I'm trying to write a ParDo, which will use both Timer and Side Input, but it 
> crashes when I try to run it with {{beam-runners-direct-java}} with 
> {{IllegalArgumentException}} on a line 
> [https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/QuiescenceDriver.java#L167],
>  because there are actually two inputs to ParDo (main PCollection and side 
> input), while only one is expected. It looks like a bug in an implementation.
>  
> Here's the code that reproduces the issue:
> {code:java}
> public class TestCrashesForTimerAndSideInput {
> @Rule
> public final transient TestPipeline p = TestPipeline.create();
> private static class DoFnWithTimer extends DoFn, 
> String> {
> @TimerId("t")
> private final TimerSpec tSpec = 
> TimerSpecs.timer(TimeDomain.PROCESSING_TIME);
> private final PCollectionView> sideInput;
> private DoFnWithTimer(PCollectionView> sideInput) 
> {
> this.sideInput = sideInput;
> }
> @ProcessElement
> public void processElement(ProcessContext c, @TimerId("t") Timer t) {
> KV element = c.element();
> c.output(element.getKey() + c.sideInput(sideInput).get(element));
> t.offset(Duration.standardSeconds(1)).setRelative();
> }
> @OnTimer("t")
> public void onTimerFire(OnTimerContext x) {
> x.output("Timer fired");
> }
> }
> @Test
> public void testCrashesForTimerAndSideInput() {
> ImmutableMap sideData = ImmutableMap. String>builder().
> put("x", "X").
> put("y", "Y").
> build();
> PCollectionView> sideInput =
> p.apply(Create.of(sideData)).apply(View.asMap());
> TestStream testStream = 
> TestStream.create(StringUtf8Coder.of()).
> addElements("x").
> advanceProcessingTime(Duration.standardSeconds(1)).
> addElements("y").
> advanceProcessingTime(Duration.standardSeconds(1)).
> advanceWatermarkToInfinity();
> PCollection result = p.
> apply(testStream).
> apply(MapElements.into(kvs(strings(), strings())).via(v -> 
> KV.of(v, v))).
> apply(ParDo.of(new 
> DoFnWithTimer(sideInput)).withSideInputs(sideInput));
> PAssert.that(result).containsInAnyOrder("xX", "yY", "Timer fired");
> p.run();
> }
> }
> {code}
>  
> and the error is:
> {code}
> java.lang.IllegalArgumentException: expected one element but was: 
>  KeyedWorkItem/ParMultiDo(ToKeyedWorkItem).output [PCollection], 
> View.AsMap/View.VoidKeyToMultimapMaterialization/ParDo(VoidKeyToMultimapMaterialization)/ParMultiDo(VoidKeyToMultimapMaterialization).output
>  [PCollection]>
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:322)
>   at 
> org.apache.beam.repackaged.beam_runners_direct_java.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:294)
>   at 
> org.apache.beam.runners.direct.QuiescenceDriver.fireTimers(QuiescenceDriver.java:167)
>   at 
> 

[jira] [Assigned] (BEAM-5875) Nexmark perf tests fail due to NoClassDefFoundError for Iterables.

2018-10-26 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5875:
-

Assignee: Kenneth Knowles  (was: Lukasz Gajowy)

> Nexmark perf tests fail due to NoClassDefFoundError for Iterables.
> --
>
> Key: BEAM-5875
> URL: https://issues.apache.org/jira/browse/BEAM-5875
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Henning Rohde
>Assignee: Kenneth Knowles
>Priority: Critical
>  Labels: currently-failing
>
> https://scans.gradle.com/s/vjkiys2xc3age/console-log?task=:beam-sdks-java-nexmark:run
> I see:
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/beam/repackaged/beam_sdks_java_test_utils/com/google/common/collect/Iterables
>   at 
> org.apache.beam.sdk.testutils.metrics.MetricsReader.checkIfMetricResultIsUnique(MetricsReader.java:128)
>   at 
> org.apache.beam.sdk.testutils.metrics.MetricsReader.getCounterMetric(MetricsReader.java:65)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.currentPerf(NexmarkLauncher.java:250)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.monitor(NexmarkLauncher.java:435)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.run(NexmarkLauncher.java:1156)
>   at org.apache.beam.sdk.nexmark.Main$Run.call(Main.java:108)
>   at org.apache.beam.sdk.nexmark.Main$Run.call(Main.java:96)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.repackaged.beam_sdks_java_test_utils.com.google.common.collect.Iterables
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 13 more
> PRs for the first red run:
> [BEAM-5716] Move nexmark to "testing" directory in java sdk (commit: 0074138) 
> (detail / githubweb)
> [BEAM-5716] Move load-tests code to "testing" directory in java sdk (commit: 
> 6674c9d) (detail / githubweb)
> [BEAM-5716] Create module for testing utils (commit: 0628951) (detail / 
> githubweb)
> [BEAM-5716] Extract MetricReader class, test it and use in Nexmark code 
> (commit: 69730fc) (detail / githubweb)
> [BEAM-5355] Use MetricsReader in GroupByKeyLoadTest (commit: 7374eb6) (detail 
> / githubweb)
> Łukasz -- would you mind taking a look? Looks like a shading issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5875) Nexmark perf tests fail due to NoClassDefFoundError for Iterables.

2018-10-26 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665835#comment-16665835
 ] 

Kenneth Knowles commented on BEAM-5875:
---

Upon further inspection it looks like 69730fc4c66adba7cc3ea10866b340046dc5f87e, 
the bit that makes Nexmark use the MetricsReader, is more likely.

> Nexmark perf tests fail due to NoClassDefFoundError for Iterables.
> --
>
> Key: BEAM-5875
> URL: https://issues.apache.org/jira/browse/BEAM-5875
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Henning Rohde
>Assignee: Lukasz Gajowy
>Priority: Critical
>  Labels: currently-failing
>
> https://scans.gradle.com/s/vjkiys2xc3age/console-log?task=:beam-sdks-java-nexmark:run
> I see:
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/beam/repackaged/beam_sdks_java_test_utils/com/google/common/collect/Iterables
>   at 
> org.apache.beam.sdk.testutils.metrics.MetricsReader.checkIfMetricResultIsUnique(MetricsReader.java:128)
>   at 
> org.apache.beam.sdk.testutils.metrics.MetricsReader.getCounterMetric(MetricsReader.java:65)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.currentPerf(NexmarkLauncher.java:250)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.monitor(NexmarkLauncher.java:435)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.run(NexmarkLauncher.java:1156)
>   at org.apache.beam.sdk.nexmark.Main$Run.call(Main.java:108)
>   at org.apache.beam.sdk.nexmark.Main$Run.call(Main.java:96)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.repackaged.beam_sdks_java_test_utils.com.google.common.collect.Iterables
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 13 more
> PRs for the first red run:
> [BEAM-5716] Move nexmark to "testing" directory in java sdk (commit: 0074138) 
> (detail / githubweb)
> [BEAM-5716] Move load-tests code to "testing" directory in java sdk (commit: 
> 6674c9d) (detail / githubweb)
> [BEAM-5716] Create module for testing utils (commit: 0628951) (detail / 
> githubweb)
> [BEAM-5716] Extract MetricReader class, test it and use in Nexmark code 
> (commit: 69730fc) (detail / githubweb)
> [BEAM-5355] Use MetricsReader in GroupByKeyLoadTest (commit: 7374eb6) (detail 
> / githubweb)
> Łukasz -- would you mind taking a look? Looks like a shading issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5880) Cannot explicitly run Jenkins job with `origin/pr/1234/head` any more

2018-10-26 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5880:
-

Assignee: Alan Myrvold  (was: Luke Cwik)

> Cannot explicitly run Jenkins job with `origin/pr/1234/head` any more
> -
>
> Key: BEAM-5880
> URL: https://issues.apache.org/jira/browse/BEAM-5880
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kenneth Knowles
>Assignee: Alan Myrvold
>Priority: Major
>
> The fetch spec for our Jenkins jobs has been narrowed so only the phrase 
> triggering PR or existing branches are available. You cannot trigger a job on 
> a PR through the Jenkins UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-5875) Nexmark perf tests fail due to NoClassDefFoundError for Iterables.

2018-10-26 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665830#comment-16665830
 ] 

Kenneth Knowles edited comment on BEAM-5875 at 10/27/18 2:12 AM:
-

FWIW those are commit titles not PRs :-). They are all in 
https://github.com/apache/beam/pull/6725. It seems possible that just 
00741387b65b16986e46005bf83bfb1638aca946 is the problem, as it is the one that 
touched Nexmark.


was (Author: kenn):
You've linked to Jiras, not PRs. In this case, I think 
https://github.com/apache/beam/pull/6725 is the likely culprit. It seems 
possible that just 00741387b65b16986e46005bf83bfb1638aca946 is the problem. 
This should be isolated by commit since they are somewhat independent.

> Nexmark perf tests fail due to NoClassDefFoundError for Iterables.
> --
>
> Key: BEAM-5875
> URL: https://issues.apache.org/jira/browse/BEAM-5875
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Henning Rohde
>Assignee: Lukasz Gajowy
>Priority: Critical
>  Labels: currently-failing
>
> https://scans.gradle.com/s/vjkiys2xc3age/console-log?task=:beam-sdks-java-nexmark:run
> I see:
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/beam/repackaged/beam_sdks_java_test_utils/com/google/common/collect/Iterables
>   at 
> org.apache.beam.sdk.testutils.metrics.MetricsReader.checkIfMetricResultIsUnique(MetricsReader.java:128)
>   at 
> org.apache.beam.sdk.testutils.metrics.MetricsReader.getCounterMetric(MetricsReader.java:65)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.currentPerf(NexmarkLauncher.java:250)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.monitor(NexmarkLauncher.java:435)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.run(NexmarkLauncher.java:1156)
>   at org.apache.beam.sdk.nexmark.Main$Run.call(Main.java:108)
>   at org.apache.beam.sdk.nexmark.Main$Run.call(Main.java:96)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.repackaged.beam_sdks_java_test_utils.com.google.common.collect.Iterables
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 13 more
> PRs for the first red run:
> [BEAM-5716] Move nexmark to "testing" directory in java sdk (commit: 0074138) 
> (detail / githubweb)
> [BEAM-5716] Move load-tests code to "testing" directory in java sdk (commit: 
> 6674c9d) (detail / githubweb)
> [BEAM-5716] Create module for testing utils (commit: 0628951) (detail / 
> githubweb)
> [BEAM-5716] Extract MetricReader class, test it and use in Nexmark code 
> (commit: 69730fc) (detail / githubweb)
> [BEAM-5355] Use MetricsReader in GroupByKeyLoadTest (commit: 7374eb6) (detail 
> / githubweb)
> Łukasz -- would you mind taking a look? Looks like a shading issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5880) Cannot explicitly run Jenkins job with `origin/pr/1234/head` any more

2018-10-26 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5880:
-

 Summary: Cannot explicitly run Jenkins job with 
`origin/pr/1234/head` any more
 Key: BEAM-5880
 URL: https://issues.apache.org/jira/browse/BEAM-5880
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Kenneth Knowles
Assignee: Luke Cwik


The fetch spec for our Jenkins jobs has been narrowed so only the phrase 
triggering PR or existing branches are available. You cannot trigger a job on a 
PR through the Jenkins UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5875) Nexmark perf tests fail due to NoClassDefFoundError for Iterables.

2018-10-26 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665830#comment-16665830
 ] 

Kenneth Knowles commented on BEAM-5875:
---

You've linked to Jiras, not PRs. In this case, I think 
https://github.com/apache/beam/pull/6725 is the likely culprit. It seems 
possible that just 00741387b65b16986e46005bf83bfb1638aca946 is the problem. 
This should be isolated by commit since they are somewhat independent.

> Nexmark perf tests fail due to NoClassDefFoundError for Iterables.
> --
>
> Key: BEAM-5875
> URL: https://issues.apache.org/jira/browse/BEAM-5875
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Henning Rohde
>Assignee: Lukasz Gajowy
>Priority: Critical
>  Labels: currently-failing
>
> https://scans.gradle.com/s/vjkiys2xc3age/console-log?task=:beam-sdks-java-nexmark:run
> I see:
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/beam/repackaged/beam_sdks_java_test_utils/com/google/common/collect/Iterables
>   at 
> org.apache.beam.sdk.testutils.metrics.MetricsReader.checkIfMetricResultIsUnique(MetricsReader.java:128)
>   at 
> org.apache.beam.sdk.testutils.metrics.MetricsReader.getCounterMetric(MetricsReader.java:65)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.currentPerf(NexmarkLauncher.java:250)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.monitor(NexmarkLauncher.java:435)
>   at 
> org.apache.beam.sdk.nexmark.NexmarkLauncher.run(NexmarkLauncher.java:1156)
>   at org.apache.beam.sdk.nexmark.Main$Run.call(Main.java:108)
>   at org.apache.beam.sdk.nexmark.Main$Run.call(Main.java:96)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.beam.repackaged.beam_sdks_java_test_utils.com.google.common.collect.Iterables
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 13 more
> PRs for the first red run:
> [BEAM-5716] Move nexmark to "testing" directory in java sdk (commit: 0074138) 
> (detail / githubweb)
> [BEAM-5716] Move load-tests code to "testing" directory in java sdk (commit: 
> 6674c9d) (detail / githubweb)
> [BEAM-5716] Create module for testing utils (commit: 0628951) (detail / 
> githubweb)
> [BEAM-5716] Extract MetricReader class, test it and use in Nexmark code 
> (commit: 69730fc) (detail / githubweb)
> [BEAM-5355] Use MetricsReader in GroupByKeyLoadTest (commit: 7374eb6) (detail 
> / githubweb)
> Łukasz -- would you mind taking a look? Looks like a shading issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5960) New Nexmark test activated automatically, causing postcommit failures

2018-11-02 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5960:
-

 Summary: New Nexmark test activated automatically, causing 
postcommit failures
 Key: BEAM-5960
 URL: https://issues.apache.org/jira/browse/BEAM-5960
 Project: Beam
  Issue Type: Bug
  Components: examples-nexmark
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5960) New Nexmark test activated automatically, causing postcommit failures

2018-11-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-5960.
---
   Resolution: Fixed
Fix Version/s: Not applicable

Rolled it back

> New Nexmark test activated automatically, causing postcommit failures
> -
>
> Key: BEAM-5960
> URL: https://issues.apache.org/jira/browse/BEAM-5960
> Project: Beam
>  Issue Type: Bug
>  Components: examples-nexmark
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5961) No precommit coverage for Nexmark postcommit main entry point

2018-11-02 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5961:
-

 Summary: No precommit coverage for Nexmark postcommit main entry 
point
 Key: BEAM-5961
 URL: https://issues.apache.org/jira/browse/BEAM-5961
 Project: Beam
  Issue Type: Bug
  Components: examples-nexmark
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


There's a decent amount of logic in Nexmark's {{Main}} class and 
{{NexmarkLauncher}}, neither of which have any tests. It is extremely easy to 
make a change that breaks the Nexmark postcommits with no signal before a PR is 
merged.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5937) Failure in portable Flink tests: org.apache.beam.runners.flink.PortableStateExecutionTest.testExecution

2018-11-01 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5937:
-

 Summary: Failure in portable Flink tests: 
org.apache.beam.runners.flink.PortableStateExecutionTest.testExecution
 Key: BEAM-5937
 URL: https://issues.apache.org/jira/browse/BEAM-5937
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kenneth Knowles


This showed up on an unrelated presubmit run. I don't know yet if it is a 
flake, but it shouldn't have gotten past prior presubmit if it were not, so 
labeling as a flake.

[https://builds.apache.org/job/beam_PreCommit_Java_Commit/2238/]

[https://scans.gradle.com/s/rlvn5kraa73ca/tests/jawkrcfx7mcj4-ly76da5r7qdkk]

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5951) Nexmark tests include asserts of ordering of PCollections, which Beam does not support

2018-11-02 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5951:
-

 Summary: Nexmark tests include asserts of ordering of 
PCollections, which Beam does not support
 Key: BEAM-5951
 URL: https://issues.apache.org/jira/browse/BEAM-5951
 Project: Beam
  Issue Type: Bug
  Components: examples-nexmark
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5035) beam_PostCommit_Java_GradleBuild/1105 :beam-examples-java:compileTestJava FAILED

2018-10-25 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664576#comment-16664576
 ] 

Kenneth Knowles commented on BEAM-5035:
---

Notably, {{shadowTest}} is something that we made up, not part of the shadow 
plugin automatically. Both bugs here are tests depending on other test jars. In 
the ApexRunner case it is sensible (for ValidatesRunner suite) but in this case 
GCPIO should not have its test jar used as a dependency. So we could isolate 
this to the ApexRunner very quickly by fixing the examples / GCPIO relationship.

> beam_PostCommit_Java_GradleBuild/1105 :beam-examples-java:compileTestJava 
> FAILED
> 
>
> Key: BEAM-5035
> URL: https://issues.apache.org/jira/browse/BEAM-5035
> Project: Beam
>  Issue Type: Improvement
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Andrew Pilloud
>Priority: Critical
> Fix For: 2.7.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Compilation failed for
> [https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1105/]
>  > Task :beam-examples-java:compileTestJava FAILED
>  
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_GradleBuild/src/examples/java/src/test/java/org/apache/beam/examples/cookbook/BigQueryTornadoesIT.java:22:
>  error: cannot access BigqueryMatcher
>  import org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher;
>  ^
>  bad class file: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Java_GradleBuild/src/sdks/java/io/google-cloud-platform/build/libs/beam-sdks-java-io-google-cloud-platform-2.7.0-SNAPSHOT-tests.jar(/org/apache/beam/sdk/io/gcp/testing/BigqueryMatcher.class)
>  unable to access file: java.util.zip.ZipException: invalid stored block 
> lengths
>  Please remove or make sure it appears in the correct subdirectory of the 
> classpath.
>  1 error
>  
> https://github.com/apache/beam/blame/328129bf033bc6be16bc8e09af905f37b7516412/examples/java/src/test/java/org/apache/beam/examples/cookbook/BigQueryTornadoesIT.java
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5847) Separate mechanism from policy: PreCommit is policy, the set of tests is a named suite

2018-10-23 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5847:
-

 Summary: Separate mechanism from policy: PreCommit is policy, the 
set of tests is a named suite
 Key: BEAM-5847
 URL: https://issues.apache.org/jira/browse/BEAM-5847
 Project: Beam
  Issue Type: Improvement
  Components: build-system
Reporter: Kenneth Knowles


Throughout our build.gradle files we have sprinkled "preCommit" tasks. It 
obscures what is run sometimes.

One consistent way of managing this is to have modules define test suites and 
to have centralized management of which suites are pre/post commit, defined 
entirely in the root build.gradle. We are almost doing this already, except 
for...

The other way of doing it which is to let modules request which suites should 
be pre/post commits and the root build.gradle is expected to call those tasks.

It isn't really clear what the intent of our tasks are right now, to me anyhow. 
I think they've organically grown and could now be put in order a bit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5833) Java precommit broken by CloseableFnDataReceiver#flush as well as Spotless failures

2018-10-23 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-5833.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Java precommit broken by CloseableFnDataReceiver#flush as well as Spotless 
> failures
> ---
>
> Key: BEAM-5833
> URL: https://issues.apache.org/jira/browse/BEAM-5833
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness, test-failures
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: Not applicable
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> https://builds.apache.org/job/beam_PreCommit_Java_Cron/498/
> https://scans.gradle.com/s/big7mxohgz2ry



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5907) Dataflow legacy worker test suite runs portability tests

2018-10-29 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5907:
-

 Summary: Dataflow legacy worker test suite runs portability tests
 Key: BEAM-5907
 URL: https://issues.apache.org/jira/browse/BEAM-5907
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Kenneth Knowles
Assignee: Henning Rohde


Some tests of portability bits failed in a test suite for the legacy worker. 
Could be a naming problem or a configuration problem. Notably, they failed due 
to changes in unshaded test jars, which no one should be using.

https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1778/#showFailuresLink



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5908) Postcommit failure after fixing unshaded test jar classifier

2018-10-29 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-5908:
-

 Summary: Postcommit failure after fixing unshaded test jar 
classifier
 Key: BEAM-5908
 URL: https://issues.apache.org/jira/browse/BEAM-5908
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1778/#showFailuresLink



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5907) Dataflow legacy worker test suite runs portability tests

2018-10-29 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667515#comment-16667515
 ] 

Kenneth Knowles commented on BEAM-5907:
---

This is not currently failing, as I've re-disabled parallel builds. But it is 
expected that fixing or sickbaying such tests will allow parallel builds.

> Dataflow legacy worker test suite runs portability tests
> 
>
> Key: BEAM-5907
> URL: https://issues.apache.org/jira/browse/BEAM-5907
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Boyuan Zhang
>Priority: Major
>
> Some tests of portability bits failed in a test suite for the legacy worker. 
> Could be a naming problem or a configuration problem. Notably, they failed 
> due to changes in unshaded test jars, which no one should be using.
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1778/#showFailuresLink



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5907) Dataflow legacy worker test suite runs portability tests

2018-10-29 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667417#comment-16667417
 ] 

Kenneth Knowles commented on BEAM-5907:
---

Noting that the problem was in mockito argument captor verification, a practice 
that is explicitly not supported across multiple threads.

> Dataflow legacy worker test suite runs portability tests
> 
>
> Key: BEAM-5907
> URL: https://issues.apache.org/jira/browse/BEAM-5907
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Henning Rohde
>Priority: Major
>
> Some tests of portability bits failed in a test suite for the legacy worker. 
> Could be a naming problem or a configuration problem. Notably, they failed 
> due to changes in unshaded test jars, which no one should be using.
> https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/1778/#showFailuresLink



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3680) Revise code on state & timers blog posts

2018-10-29 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-3680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667458#comment-16667458
 ] 

Kenneth Knowles commented on BEAM-3680:
---

So you say, but I tell you there is still a construction-time error bug :-)

> Revise code on state & timers blog posts
> 
>
> Key: BEAM-3680
> URL: https://issues.apache.org/jira/browse/BEAM-3680
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>
> It looks like the code is either truncated wrong when I moved it to the post, 
> or implementation has skewed. Perhaps it can be integrated with our snippets 
> capabilities.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2018-11-05 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675384#comment-16675384
 ] 

Kenneth Knowles commented on BEAM-5967:
---

I looked here: 
https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/DynamicMessage

It seems like it has all the necessary methods. Do they crash? If a 
DynamicMessage has a proto schema available at runtime this seems like it 
should work.

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Priority: Major
>
> The ProtoCoder does make some assumptions about static messages being 
> available. The DynamicMessage doesn't have some of them, mainly because the 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage or build it 
> into the normal ProtoCoder.
> Here is an example of the assumtion being made in the current Codec:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance = (T) 
> protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser tParser = (Parser) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException | 
> NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5960) New Nexmark test activated automatically, causing postcommit failures

2018-11-05 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675367#comment-16675367
 ] 

Kenneth Knowles commented on BEAM-5960:
---

It was postCommit failure because the {{main()}} codepath didn't set up the 
side input. I was going to do that in a separate step, but I didn't realize it 
would automatically start running. I added a new precommit test that will catch 
it.

As for number vs name, maybe let's discuss on the rollforward PR 
https://github.com/apache/beam/pull/6933. Personally, I think the numbers are 
not great since if we actually get a big suite of benchmarks, we are going to 
need to use names to keep track of them. So the numbers are just for 
compatibility with the dashboards and to make the historical perf numbers match 
up.

> New Nexmark test activated automatically, causing postcommit failures
> -
>
> Key: BEAM-5960
> URL: https://issues.apache.org/jira/browse/BEAM-5960
> Project: Beam
>  Issue Type: Bug
>  Components: examples-nexmark
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Critical
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2018-11-05 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5967:
-

Assignee: (was: Kenneth Knowles)

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Priority: Major
>
> The ProtoCoder does make some assumptions about static messages being 
> available. The DynamicMessage doesn't have some of them, mainly because the 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage or build it 
> into the normal ProtoCoder.
> Here is an example of the assumtion being made in the current Codec:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance = (T) 
> protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser tParser = (Parser) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException | 
> NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2018-11-05 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675375#comment-16675375
 ] 

Kenneth Knowles commented on BEAM-5967:
---

[~alexvanboxel] I don't understand the details here. Can the 
`protoMessageClass` still provide a parser? Then it would be fine. Would you 
like to try to add this support?

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Assignee: Kenneth Knowles
>Priority: Major
>
> The ProtoCoder does make some assumptions about static messages being 
> available. The DynamicMessage doesn't have some of them, mainly because the 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage or build it 
> into the normal ProtoCoder.
> Here is an example of the assumtion being made in the current Codec:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance = (T) 
> protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser tParser = (Parser) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException | 
> NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2018-11-05 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675744#comment-16675744
 ] 

Kenneth Knowles commented on BEAM-5967:
---

So the hard question is where are you going to get that argument? If you 
provide it to a manually-instantiated coder then it should be pretty easy. If 
the code ends up with conditionals in every method, then it would make sense to 
have a separate class, but that is just an implementation detail.

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Priority: Major
>
> The ProtoCoder does make some assumptions about static messages being 
> available. The DynamicMessage doesn't have some of them, mainly because the 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage or build it 
> into the normal ProtoCoder.
> Here is an example of the assumtion being made in the current Codec:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance = (T) 
> protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser tParser = (Parser) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException | 
> NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-5967) ProtoCoder doesn't support DynamicMessage

2018-11-05 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675744#comment-16675744
 ] 

Kenneth Knowles edited comment on BEAM-5967 at 11/5/18 8:53 PM:


So the question is where are you going to get that argument? If you provide it 
to a manually-instantiated coder then it should be pretty easy. If the code 
ends up with conditionals in every method, then it would make sense to have a 
separate class, but that is just an implementation detail. If you are going to 
include it in the data then it could be very inefficient (I think we have some 
Avro cases that do this :-/ )


was (Author: kenn):
So the hard question is where are you going to get that argument? If you 
provide it to a manually-instantiated coder then it should be pretty easy. If 
the code ends up with conditionals in every method, then it would make sense to 
have a separate class, but that is just an implementation detail.

> ProtoCoder doesn't support DynamicMessage
> -
>
> Key: BEAM-5967
> URL: https://issues.apache.org/jira/browse/BEAM-5967
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Alex Van Boxel
>Priority: Major
>
> The ProtoCoder does make some assumptions about static messages being 
> available. The DynamicMessage doesn't have some of them, mainly because the 
> proto schema is defined at runtime and not at compile time.
> Does it make sense to make a special coder for DynamicMessage or build it 
> into the normal ProtoCoder.
> Here is an example of the assumtion being made in the current Codec:
> {code:java}
> try {
>   @SuppressWarnings("unchecked")
>   T protoMessageInstance = (T) 
> protoMessageClass.getMethod("getDefaultInstance").invoke(null);
>   @SuppressWarnings("unchecked")
>   Parser tParser = (Parser) protoMessageInstance.getParserForType();
>   memoizedParser = tParser;
> } catch (IllegalAccessException | InvocationTargetException | 
> NoSuchMethodException e) {
>   throw new IllegalArgumentException(e);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-6002:
-

Assignee: Chamikara Jayalath  (was: Lukasz Gajowy)

> Nexmark tests timing out on all runners (crash loop due to metrics?) 
> -
>
> Key: BEAM-6002
> URL: https://issues.apache.org/jira/browse/BEAM-6002
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Chamikara Jayalath
>Priority: Critical
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/
> {code}
> 08:58:26 2018-11-06T16:58:26.035Z RUNNING Query0
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.startTime for namespace Query0.Events
> 08:58:26 2018-11-06T16:58:26.035Z no activity
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.endTime for namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.elements, from namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.bytes, from namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTime for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTime for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTimestamp for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTimestamp for namespace Query0.Results
> 08:58:41 2018-11-06T16:58:41.035Z RUNNING Query0
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:41 2018-11-06T16:58:41.036Z no activity
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.startTime for namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.endTime for namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.elements, from namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.bytes, from namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTime for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTime for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTimestamp for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTimestamp for namespace Query0.Results
> 08:58:56 2018-11-06T16:58:56.036Z RUNNING Query0
> 08:58:56 18/11/06 16:58:56 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:56 2018-11-06T16:58:56.036Z no activity
> 08:58:56 18/11/06 16:58:56 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:56 18/11/06 16:58:56 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.startTime 

[jira] [Commented] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677363#comment-16677363
 ] 

Kenneth Knowles commented on BEAM-6002:
---

[~reuvenlax] the other commits that look in common between a couple have your 
name attached, but I have no clue how they would affect this

> Nexmark tests timing out on all runners (crash loop due to metrics?) 
> -
>
> Key: BEAM-6002
> URL: https://issues.apache.org/jira/browse/BEAM-6002
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Lukasz Gajowy
>Priority: Critical
>
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/
> {code}
> 08:58:26 2018-11-06T16:58:26.035Z RUNNING Query0
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.startTime for namespace Query0.Events
> 08:58:26 2018-11-06T16:58:26.035Z no activity
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.endTime for namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.elements, from namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.bytes, from namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTime for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTime for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTimestamp for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTimestamp for namespace Query0.Results
> 08:58:41 2018-11-06T16:58:41.035Z RUNNING Query0
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:41 2018-11-06T16:58:41.036Z no activity
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.startTime for namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.endTime for namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.elements, from namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.bytes, from namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTime for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTime for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTimestamp for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTimestamp for namespace Query0.Results
> 08:58:56 2018-11-06T16:58:56.036Z RUNNING Query0
> 08:58:56 18/11/06 16:58:56 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:56 2018-11-06T16:58:56.036Z no activity
> 08:58:56 18/11/06 16:58:56 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:56 18/11/06 16:58:56 ERROR 
> 

[jira] [Commented] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677339#comment-16677339
 ] 

Kenneth Knowles commented on BEAM-6002:
---

[~echauchot] too

> Nexmark tests timing out on all runners (crash loop due to metrics?) 
> -
>
> Key: BEAM-6002
> URL: https://issues.apache.org/jira/browse/BEAM-6002
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Lukasz Gajowy
>Priority: Critical
>
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-6036) How to periodically refresh side inputs

2018-11-15 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-6036.
---
   Resolution: Not A Problem
Fix Version/s: Not applicable

> How to periodically refresh side inputs
> ---
>
> Key: BEAM-6036
> URL: https://issues.apache.org/jira/browse/BEAM-6036
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Evgeny
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: Not applicable
>
>
> I have followed the example provided here 
> [https://cloud.google.com/blog/products/gcp/guide-to-common-cloud-dataflow-use-case-patterns-part-1]
>  in the "Pattern: Slowly-changing lookup cache" section. I've converted the 
> pseudo-code from the article into this Java code:
> return p
> .apply("GenerateSequence", GenerateSequence.from(0).withRate(1, 
> Duration.standardHours(1)))
>  .apply("GenerateSequenceWindow",
>  Window.into(new GlobalWindows()).triggering(
>  Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
>  .discardingFiredPanes())
>  .apply("RetrieveKVs",
>  ParDo.of(new RetrieveKVs()))
>  .apply("ToMap", View.asMap());
> RetrieveKVs() queries BigQuery table and outputs KVs. 
> The issue here is that the resulting map mixes up KVs from different periods 
> (i.e. the sequence is generated every 1 hour, the resulting map includes KVs 
> from 2 adjacent hours).
> In an attempt to solve it I tried using View.asSingleton() instead.
> return p
> .apply("GenerateSequence", GenerateSequence.from(0).withRate(1, 
> Duration.standardHours(1)))
> .apply("GenerateSequenceWindow",
> Window.into(new GlobalWindows()).triggering(
> 
> Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
> .discardingFiredPanes())
> .apply("RetrieveMap",
> ParDo.of(new RetrieveMap()))
> .apply("ToMap", View.asSingleton());
> RetrieveMap queries data from BigQuery and outputs the complete map. The 
> issue with this is it not only results in flaky tests with the exception 1 
> times out of 10:
> Caused by: java.lang.IllegalArgumentException: Empty PCollection accessed as 
> a singleton view. Consider setting withDefault to provide a def
> ault value
> but also it doesn't seem to work. In the logs I see the RetrieveMap is called 
> every hour, but the pipeline using the side input gets stale data. 
> Is there a real working example for how to make a side input refresh 
> periodically? 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6036) How to periodically refresh side inputs

2018-11-15 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688739#comment-16688739
 ] 

Kenneth Knowles commented on BEAM-6036:
---

This is a good question for StackOverflow. You have windowed into 
GlobalWindows, so all the KVs are in the same window. That is why they are all 
being returned as part of the View.asMap() side input.

> How to periodically refresh side inputs
> ---
>
> Key: BEAM-6036
> URL: https://issues.apache.org/jira/browse/BEAM-6036
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Evgeny
>Assignee: Kenneth Knowles
>Priority: Blocker
>
> I have followed the example provided here 
> [https://cloud.google.com/blog/products/gcp/guide-to-common-cloud-dataflow-use-case-patterns-part-1]
>  in the "Pattern: Slowly-changing lookup cache" section. I've converted the 
> pseudo-code from the article into this Java code:
> return p
> .apply("GenerateSequence", GenerateSequence.from(0).withRate(1, 
> Duration.standardHours(1)))
>  .apply("GenerateSequenceWindow",
>  Window.into(new GlobalWindows()).triggering(
>  Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
>  .discardingFiredPanes())
>  .apply("RetrieveKVs",
>  ParDo.of(new RetrieveKVs()))
>  .apply("ToMap", View.asMap());
> RetrieveKVs() queries BigQuery table and outputs KVs. 
> The issue here is that the resulting map mixes up KVs from different periods 
> (i.e. the sequence is generated every 1 hour, the resulting map includes KVs 
> from 2 adjacent hours).
> In an attempt to solve it I tried using View.asSingleton() instead.
> return p
> .apply("GenerateSequence", GenerateSequence.from(0).withRate(1, 
> Duration.standardHours(1)))
> .apply("GenerateSequenceWindow",
> Window.into(new GlobalWindows()).triggering(
> 
> Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
> .discardingFiredPanes())
> .apply("RetrieveMap",
> ParDo.of(new RetrieveMap()))
> .apply("ToMap", View.asSingleton());
> RetrieveMap queries data from BigQuery and outputs the complete map. The 
> issue with this is it not only results in flaky tests with the exception 1 
> times out of 10:
> Caused by: java.lang.IllegalArgumentException: Empty PCollection accessed as 
> a singleton view. Consider setting withDefault to provide a def
> ault value
> but also it doesn't seem to work. In the logs I see the RetrieveMap is called 
> every hour, but the pipeline using the side input gets stale data. 
> Is there a real working example for how to make a side input refresh 
> periodically? 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677678#comment-16677678
 ] 

Kenneth Knowles commented on BEAM-6002:
---

Spark Nexmark healed itself (or got lucky).

> Nexmark tests timing out on all runners (crash loop due to metrics?) 
> -
>
> Key: BEAM-6002
> URL: https://issues.apache.org/jira/browse/BEAM-6002
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Chamikara Jayalath
>Priority: Critical
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/
> {code}
> 08:58:26 2018-11-06T16:58:26.035Z RUNNING Query0
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.startTime for namespace Query0.Events
> 08:58:26 2018-11-06T16:58:26.035Z no activity
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.endTime for namespace Query0.Events
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.elements, from namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.bytes, from namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTime for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTime for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTimestamp for namespace Query0.Results
> 08:58:26 18/11/06 16:58:26 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTimestamp for namespace Query0.Results
> 08:58:41 2018-11-06T16:58:41.035Z RUNNING Query0
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:41 2018-11-06T16:58:41.036Z no activity
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.startTime for namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric event.endTime for namespace Query0.Events
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.elements, from namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> result.bytes, from namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTime for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTime for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.startTimestamp for namespace Query0.Results
> 08:58:41 18/11/06 16:58:41 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution metric result.endTimestamp for namespace Query0.Results
> 08:58:56 2018-11-06T16:58:56.036Z RUNNING Query0
> 08:58:56 18/11/06 16:58:56 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.elements, from namespace Query0.Events
> 08:58:56 2018-11-06T16:58:56.036Z no activity
> 08:58:56 18/11/06 16:58:56 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
> event.bytes, from namespace Query0.Events
> 08:58:56 18/11/06 16:58:56 ERROR 
> org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get 
> distribution 

[jira] [Created] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-6002:
-

 Summary: Nexmark tests timing out on all runners (crash loop due 
to metrics?) 
 Key: BEAM-6002
 URL: https://issues.apache.org/jira/browse/BEAM-6002
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kenneth Knowles
Assignee: Lukasz Gajowy


https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677362#comment-16677362
 ] 

Kenneth Knowles commented on BEAM-6002:
---

https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Direct/1041/

Also hanging for hours.

{code}
17:14:20 Configurations:
17:14:20   Conf  Description
17:14:20     query:PASSTHROUGH; exportSummaryToBigQuery:true
17:14:20   0001  query:CURRENCY_CONVERSION; exportSummaryToBigQuery:true
17:14:20   0002  query:SELECTION; exportSummaryToBigQuery:true
17:14:20   0003  query:LOCAL_ITEM_SUGGESTION; exportSummaryToBigQuery:true
17:14:20   0004  query:AVERAGE_PRICE_FOR_CATEGORY; 
exportSummaryToBigQuery:true; numEvents:1
17:14:20   0005  query:HOT_ITEMS; exportSummaryToBigQuery:true
17:14:20   0006  query:AVERAGE_SELLING_PRICE_BY_SELLER; 
exportSummaryToBigQuery:true; numEvents:1
17:14:20   0007  query:HIGHEST_BID; exportSummaryToBigQuery:true
17:14:20   0008  query:MONITOR_NEW_USERS; exportSummaryToBigQuery:true
17:14:20   0009  query:WINNING_BIDS; exportSummaryToBigQuery:true; 
numEvents:1
17:14:20   0010  query:LOG_TO_SHARDED_FILES; exportSummaryToBigQuery:true
17:14:20   0011  query:USER_SESSIONS; exportSummaryToBigQuery:true
17:14:20   0012  query:PROCESSING_TIME_WINDOWS; exportSummaryToBigQuery:true
17:14:20   0013  query:BOUNDED_SIDE_INPUT_JOIN; exportSummaryToBigQuery:true
17:14:20 
17:14:20 Performance:
17:14:20   Conf  Runtime(sec)(Baseline)  Events(/sec)(Baseline)   
Results(Baseline)
17:14:20      6.7 14941.0  
10  
17:14:20   0001   3.6 27948.6   
92000  
17:14:20   0002  *** not run ***
17:14:20   0003  *** not run ***
17:14:20   0004  *** not run ***
17:14:20   0005  *** not run ***
17:14:20   0006  *** not run ***
17:14:20   0007  *** not run ***
17:14:20   0008  *** not run ***
17:14:20   0009  *** not run ***
17:14:20   0010  *** not run ***
17:14:20   0011  *** not run ***
17:14:20   0012  *** not run ***
17:14:20   0013  *** not run ***
17:14:20 
==
17:14:20 
17:14:20 2018-11-06T01:14:20.667Z Generating 10 events in batch mode
17:14:22 2018-11-06T01:14:22.295Z Waiting for main pipeline to 'finish'
17:14:22 2018-11-06T01:14:22.296Z DONE Query2
17:14:22 2018-11-06T01:14:22.296Z Running query:LOCAL_ITEM_SUGGESTION; 
exportSummaryToBigQuery:true
17:14:22 
17:14:22 
==
17:14:22 Run started 2018-11-06T01:14:08.264Z and ran for PT14.032S
17:14:22 
17:14:22 Default configuration:
17:14:22 
{"debug":true,"query":null,"sourceType":"DIRECT","sinkType":"DEVNULL","exportSummaryToBigQuery":false,"pubSubMode":"COMBINED","sideInputType":"DIRECT","sideInputRowCount":500,"sideInputNumShards":3,"sideInputUrl":null,"numEvents":10,"numEventGenerators":100,"rateShape":"SINE","firstEventRate":1,"nextEventRate":1,"rateUnit":"PER_SECOND","ratePeriodSec":600,"preloadSeconds":0,"streamTimeout":240,"isRateLimited":false,"useWallclockEventTime":false,"avgPersonByteSize":200,"avgAuctionByteSize":500,"avgBidByteSize":100,"hotAuctionRatio":2,"hotSellersRatio":4,"hotBiddersRatio":4,"windowSizeSec":10,"windowPeriodSec":5,"watermarkHoldbackSec":0,"numInFlightAuctions":100,"numActivePeople":1000,"coderStrategy":"HAND","cpuDelayMs":0,"diskBusyBytes":0,"auctionSkip":123,"fanout":5,"maxAuctionsWaitingTime":600,"occasionalDelaySec":3,"probDelayedEvent":0.1,"maxLogEvents":10,"usePubsubPublishTime":false,"outOfOrderGroupSize":1}
17:14:22 
17:14:22 Configurations:
17:14:22   Conf  Description
17:14:22     query:PASSTHROUGH; exportSummaryToBigQuery:true
17:14:22   0001  query:CURRENCY_CONVERSION; exportSummaryToBigQuery:true
17:14:22   0002  query:SELECTION; exportSummaryToBigQuery:true
17:14:22   0003  query:LOCAL_ITEM_SUGGESTION; exportSummaryToBigQuery:true
17:14:22   0004  query:AVERAGE_PRICE_FOR_CATEGORY; 
exportSummaryToBigQuery:true; numEvents:1
17:14:22   0005  query:HOT_ITEMS; exportSummaryToBigQuery:true
17:14:22   0006  query:AVERAGE_SELLING_PRICE_BY_SELLER; 
exportSummaryToBigQuery:true; numEvents:1
17:14:22   0007  query:HIGHEST_BID; exportSummaryToBigQuery:true
17:14:22   0008  query:MONITOR_NEW_USERS; exportSummaryToBigQuery:true
17:14:22   0009  query:WINNING_BIDS; exportSummaryToBigQuery:true; 
numEvents:1
17:14:22   0010  query:LOG_TO_SHARDED_FILES; exportSummaryToBigQuery:true
17:14:22   0011  query:USER_SESSIONS; exportSummaryToBigQuery:true
17:14:22   0012  query:PROCESSING_TIME_WINDOWS; exportSummaryToBigQuery:true
17:14:22   0013  query:BOUNDED_SIDE_INPUT_JOIN; exportSummaryToBigQuery:true
17:14:22 
17:14:22 Performance:
17:14:22   Conf  Runtime(sec)(Baseline)  Events(/sec)

[jira] [Commented] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677337#comment-16677337
 ] 

Kenneth Knowles commented on BEAM-6002:
---

I think we rather critically need PR triggering of lightweight Nexmark, without 
polluting the BQ tables. Wasn't that discussed and do we have a plan?

> Nexmark tests timing out on all runners (crash loop due to metrics?) 
> -
>
> Key: BEAM-6002
> URL: https://issues.apache.org/jira/browse/BEAM-6002
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Lukasz Gajowy
>Priority: Critical
>
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677353#comment-16677353
 ] 

Kenneth Knowles edited comment on BEAM-6002 at 11/6/18 10:16 PM:
-

It took a very long time to get a full log from the Flink run at 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Flink/1018/

The causes might actually be different; the history and timing isn't the same.

Build starts and finishes in 7 minutes then seems to hang. I wonder if logs are 
suppressed by configuration?

{code}
13:17:31 Nov 05, 2018 9:17:31 PM 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper
 initializeState
13:17:31 INFO: No restore state for UnbounedSourceWrapper.
13:17:31 Nov 05, 2018 9:17:31 PM 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper
 open
13:17:31 INFO: Unbounded Flink Source 13/16 is reading from sources: 
[UnboundedEventSource(13000, 14000), UnboundedEventSource(29000, 3), 
UnboundedEventSource(45000, 46000), UnboundedEventSource(61000, 62000), 
UnboundedEventSource(77000, 78000), UnboundedEventSource(93000, 94000)]
17:10:45 Build timed out (after 240 minutes). Marking the build as aborted.
17:10:45 Build was aborted
17:10:47 No emails were triggered.
17:10:47 Finished: ABORTED
{code}


was (Author: kenn):
It took a very long time to get a full log from the Flink run at 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Flink/1018/

The causes might actually be different; the history and timing isn't the same.

{code}
13:17:31 Nov 05, 2018 9:17:31 PM 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper
 initializeState
13:17:31 INFO: No restore state for UnbounedSourceWrapper.
13:17:31 Nov 05, 2018 9:17:31 PM 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper
 open
13:17:31 INFO: Unbounded Flink Source 13/16 is reading from sources: 
[UnboundedEventSource(13000, 14000), UnboundedEventSource(29000, 3), 
UnboundedEventSource(45000, 46000), UnboundedEventSource(61000, 62000), 
UnboundedEventSource(77000, 78000), UnboundedEventSource(93000, 94000)]
17:10:45 Build timed out (after 240 minutes). Marking the build as aborted.
17:10:45 Build was aborted
17:10:47 No emails were triggered.
17:10:47 Finished: ABORTED
{code}

> Nexmark tests timing out on all runners (crash loop due to metrics?) 
> -
>
> Key: BEAM-6002
> URL: https://issues.apache.org/jira/browse/BEAM-6002
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Lukasz Gajowy
>Priority: Critical
>
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677353#comment-16677353
 ] 

Kenneth Knowles commented on BEAM-6002:
---

It took a very long time to get a full log from the Flink run at 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Flink/1018/

The causes might actually be different; the history and timing isn't the same.

{code}
13:17:31 Nov 05, 2018 9:17:31 PM 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper
 initializeState
13:17:31 INFO: No restore state for UnbounedSourceWrapper.
13:17:31 Nov 05, 2018 9:17:31 PM 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper
 open
13:17:31 INFO: Unbounded Flink Source 13/16 is reading from sources: 
[UnboundedEventSource(13000, 14000), UnboundedEventSource(29000, 3), 
UnboundedEventSource(45000, 46000), UnboundedEventSource(61000, 62000), 
UnboundedEventSource(77000, 78000), UnboundedEventSource(93000, 94000)]
17:10:45 Build timed out (after 240 minutes). Marking the build as aborted.
17:10:45 Build was aborted
17:10:47 No emails were triggered.
17:10:47 Finished: ABORTED
{code}

> Nexmark tests timing out on all runners (crash loop due to metrics?) 
> -
>
> Key: BEAM-6002
> URL: https://issues.apache.org/jira/browse/BEAM-6002
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Lukasz Gajowy
>Priority: Critical
>
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6002) Nexmark tests timing out on all runners (crash loop due to metrics?)

2018-11-06 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6002:
--
Description: 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_Nexmark_Spark/992/

{code}
08:58:26 2018-11-06T16:58:26.035Z RUNNING Query0
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
event.elements, from namespace Query0.Events
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
event.bytes, from namespace Query0.Events
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric event.startTime for namespace Query0.Events
08:58:26 2018-11-06T16:58:26.035Z no activity
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric event.endTime for namespace Query0.Events
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
result.elements, from namespace Query0.Results
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
result.bytes, from namespace Query0.Results
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.startTime for namespace Query0.Results
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.endTime for namespace Query0.Results
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.startTimestamp for namespace Query0.Results
08:58:26 18/11/06 16:58:26 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.endTimestamp for namespace Query0.Results
08:58:41 2018-11-06T16:58:41.035Z RUNNING Query0
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
event.elements, from namespace Query0.Events
08:58:41 2018-11-06T16:58:41.036Z no activity
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
event.bytes, from namespace Query0.Events
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric event.startTime for namespace Query0.Events
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric event.endTime for namespace Query0.Events
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
result.elements, from namespace Query0.Results
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
result.bytes, from namespace Query0.Results
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.startTime for namespace Query0.Results
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.endTime for namespace Query0.Results
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.startTimestamp for namespace Query0.Results
08:58:41 18/11/06 16:58:41 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.endTimestamp for namespace Query0.Results
08:58:56 2018-11-06T16:58:56.036Z RUNNING Query0
08:58:56 18/11/06 16:58:56 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
event.elements, from namespace Query0.Events
08:58:56 2018-11-06T16:58:56.036Z no activity
08:58:56 18/11/06 16:58:56 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
event.bytes, from namespace Query0.Events
08:58:56 18/11/06 16:58:56 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric event.startTime for namespace Query0.Events
08:58:56 18/11/06 16:58:56 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric event.endTime for namespace Query0.Events
08:58:56 18/11/06 16:58:56 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
result.elements, from namespace Query0.Results
08:58:56 18/11/06 16:58:56 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get metric 
result.bytes, from namespace Query0.Results
08:58:56 18/11/06 16:58:56 ERROR 
org.apache.beam.sdk.testutils.metrics.MetricsReader: Failed to get distribution 
metric result.startTime for namespace Query0.Results
08:58:56 18/11/06 16:58:56 ERROR 

[jira] [Updated] (BEAM-5996) Nexmark postCommits are failing for Dataflow

2018-11-06 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5996:
--
Description: 
Here is the gradle build scan: [https://scans.gradle.com/s/co2uh5xame2pc]

[~apilloud] I took the liberty to assign it to you as you did the nexmark to 
dataflow integration. Feel free to re-assign if needed.

 
{code:java}
java.io.IOException: 
com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone
{
  "code" : 429,
  "errors" : [ {
"domain" : "usageLimits",
"message" : "The total number of changes to the object 
temp-storage-for-perf-tests/nexmark/staging/beam-runners-google-cloud-dataflow-java-legacy-worker-2.9.0-SNAPSHOT-WjJ-Xb9pjar2Y6llqnj5Zg.jar
 exceeds the rate limit. Please reduce the rate of create, update, and delete 
requests.",
"reason" : "rateLimitExceeded"
  } ],
  "message" : "The total number of changes to the object 
temp-storage-for-perf-tests/nexmark/staging/beam-runners-google-cloud-dataflow-java-legacy-worker-2.9.0-SNAPSHOT-WjJ-Xb9pjar2Y6llqnj5Zg.jar
 exceeds the rate limit. Please reduce the rate of create, update, and delete 
requests."
}
at 
com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
at 
com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:287)
at 
org.apache.beam.runners.dataflow.util.PackageUtil.$closeResource(PackageUtil.java:260)
at 
org.apache.beam.runners.dataflow.util.PackageUtil.tryStagePackage(PackageUtil.java:260)
at 
org.apache.beam.runners.dataflow.util.PackageUtil.tryStagePackageWithRetry(PackageUtil.java:203)
at 
org.apache.beam.runners.dataflow.util.PackageUtil.stagePackageSynchronously(PackageUtil.java:187)
at 
org.apache.beam.runners.dataflow.util.PackageUtil.lambda$stagePackage$1(PackageUtil.java:171)
at org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:
104)
at 
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748){code}

  was:
Here is the gradle build scan: https://scans.gradle.com/s/co2uh5xame2pc

[~apilloud] I took the liberty to assign it to you as you did the nexmark to 
dataflow integration. Feel free to re-assign if needed. 



> Nexmark postCommits are failing for Dataflow
> 
>
> Key: BEAM-5996
> URL: https://issues.apache.org/jira/browse/BEAM-5996
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-nexmark
>Reporter: Etienne Chauchot
>Assignee: Andrew Pilloud
>Priority: Major
>
> Here is the gradle build scan: [https://scans.gradle.com/s/co2uh5xame2pc]
> [~apilloud] I took the liberty to assign it to you as you did the nexmark to 
> dataflow integration. Feel free to re-assign if needed.
>  
> {code:java}
> java.io.IOException: 
> com.google.api.client.googleapis.json.GoogleJsonResponseException: 410 Gone
> {
>   "code" : 429,
>   "errors" : [ {
> "domain" : "usageLimits",
> "message" : "The total number of changes to the object 
> temp-storage-for-perf-tests/nexmark/staging/beam-runners-google-cloud-dataflow-java-legacy-worker-2.9.0-SNAPSHOT-WjJ-Xb9pjar2Y6llqnj5Zg.jar
>  exceeds the rate limit. Please reduce the rate of create, update, and delete 
> requests.",
> "reason" : "rateLimitExceeded"
>   } ],
>   "message" : "The total number of changes to the object 
> temp-storage-for-perf-tests/nexmark/staging/beam-runners-google-cloud-dataflow-java-legacy-worker-2.9.0-SNAPSHOT-WjJ-Xb9pjar2Y6llqnj5Zg.jar
>  exceeds the rate limit. Please reduce the rate of create, update, and delete 
> requests."
> }
> at 
> com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.waitForCompletionAndThrowIfUploadFailed(AbstractGoogleAsyncWriteChannel.java:432)
> at 
> com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel.close(AbstractGoogleAsyncWriteChannel.java:287)
> at 
> org.apache.beam.runners.dataflow.util.PackageUtil.$closeResource(PackageUtil.java:260)
> at 
> org.apache.beam.runners.dataflow.util.PackageUtil.tryStagePackage(PackageUtil.java:260)
> at 
> org.apache.beam.runners.dataflow.util.PackageUtil.tryStagePackageWithRetry(PackageUtil.java:203)
> at 
> org.apache.beam.runners.dataflow.util.PackageUtil.stagePackageSynchronously(PackageUtil.java:187)
> at 
> org.apache.beam.runners.dataflow.util.PackageUtil.lambda$stagePackage$1(PackageUtil.java:171)
> at org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:
> 104)
> at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> at 
> 

[jira] [Commented] (BEAM-6086) Flaky tests when using PCollectionView with View.asSingleton()

2018-11-18 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690996#comment-16690996
 ] 

Kenneth Knowles commented on BEAM-6086:
---

>From reading your code, the error message seems accurate. It is possible for 
>the main input element to access the side input when the side input collection 
>is empty. If you provide a default using {{withDefault()}} (in this case 
>perhaps an empty map) then that value will be returned.

> Flaky tests when using PCollectionView with View.asSingleton()
> --
>
> Key: BEAM-6086
> URL: https://issues.apache.org/jira/browse/BEAM-6086
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Evgeny
>Assignee: Kenneth Knowles
>Priority: Critical
>
> Here is the minimal project that demonstrates the issue 
> https://github.com/medvedev1088/dataflow-flaky-tests. About 1 time out of 10 
> the tests fail with 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.IllegalArgumentException: Empty PCollection accessed as a singleton 
> view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
>   at 
> org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
>   at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
>   at 
> com.beam.MyPipelineTest.testTokenTransfersBasic(MyPipelineTest.java:29)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>   at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: java.lang.IllegalArgumentException: Empty PCollection accessed as 
> a singleton view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.sdk.transforms.View$SingletonCombineFn.identity(View.java:378)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:436)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:384)
>   at 
> org.apache.beam.sdk.transforms.Combine$CombineFn.apply(Combine.java:353)
>   at 
> org.apache.beam.sdk.transforms.Combine$GroupedValues$1.processElement(Combine.java:2056)
> Related to https://jira.apache.org/jira/browse/BEAM-6036



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6086) Flaky tests when using PCollectionView with View.asSingleton()

2018-11-18 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-6086:
-

Assignee: (was: Kenneth Knowles)

> Flaky tests when using PCollectionView with View.asSingleton()
> --
>
> Key: BEAM-6086
> URL: https://issues.apache.org/jira/browse/BEAM-6086
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Evgeny
>Priority: Critical
>
> Here is the minimal project that demonstrates the issue 
> https://github.com/medvedev1088/dataflow-flaky-tests. About 1 time out of 10 
> the tests fail with 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.IllegalArgumentException: Empty PCollection accessed as a singleton 
> view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
>   at 
> org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
>   at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
>   at 
> com.beam.MyPipelineTest.testTokenTransfersBasic(MyPipelineTest.java:29)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>   at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: java.lang.IllegalArgumentException: Empty PCollection accessed as 
> a singleton view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.sdk.transforms.View$SingletonCombineFn.identity(View.java:378)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:436)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:384)
>   at 
> org.apache.beam.sdk.transforms.Combine$CombineFn.apply(Combine.java:353)
>   at 
> org.apache.beam.sdk.transforms.Combine$GroupedValues$1.processElement(Combine.java:2056)
> Related to https://jira.apache.org/jira/browse/BEAM-6036



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6086) Flaky tests when using PCollectionView with View.asSingleton()

2018-11-18 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690999#comment-16690999
 ] 

Kenneth Knowles commented on BEAM-6086:
---

I don't have bandwidth to look deeper into this right now - would you be 
interested?

> Flaky tests when using PCollectionView with View.asSingleton()
> --
>
> Key: BEAM-6086
> URL: https://issues.apache.org/jira/browse/BEAM-6086
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Evgeny
>Priority: Critical
>
> Here is the minimal project that demonstrates the issue 
> https://github.com/medvedev1088/dataflow-flaky-tests. About 1 time out of 10 
> the tests fail with 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.IllegalArgumentException: Empty PCollection accessed as a singleton 
> view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
>   at 
> org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
>   at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
>   at 
> com.beam.MyPipelineTest.testTokenTransfersBasic(MyPipelineTest.java:29)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>   at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: java.lang.IllegalArgumentException: Empty PCollection accessed as 
> a singleton view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.sdk.transforms.View$SingletonCombineFn.identity(View.java:378)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:436)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:384)
>   at 
> org.apache.beam.sdk.transforms.Combine$CombineFn.apply(Combine.java:353)
>   at 
> org.apache.beam.sdk.transforms.Combine$GroupedValues$1.processElement(Combine.java:2056)
> Related to https://jira.apache.org/jira/browse/BEAM-6036



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6086) Flaky tests when using PCollectionView with View.asSingleton()

2018-11-18 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6086:
--
Component/s: (was: beam-model)
 runner-direct

> Flaky tests when using PCollectionView with View.asSingleton()
> --
>
> Key: BEAM-6086
> URL: https://issues.apache.org/jira/browse/BEAM-6086
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Evgeny
>Assignee: Kenneth Knowles
>Priority: Critical
>
> Here is the minimal project that demonstrates the issue 
> https://github.com/medvedev1088/dataflow-flaky-tests. About 1 time out of 10 
> the tests fail with 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.IllegalArgumentException: Empty PCollection accessed as a singleton 
> view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
>   at 
> org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
>   at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
>   at 
> com.beam.MyPipelineTest.testTokenTransfersBasic(MyPipelineTest.java:29)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>   at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: java.lang.IllegalArgumentException: Empty PCollection accessed as 
> a singleton view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.sdk.transforms.View$SingletonCombineFn.identity(View.java:378)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:436)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:384)
>   at 
> org.apache.beam.sdk.transforms.Combine$CombineFn.apply(Combine.java:353)
>   at 
> org.apache.beam.sdk.transforms.Combine$GroupedValues$1.processElement(Combine.java:2056)
> Related to https://jira.apache.org/jira/browse/BEAM-6036



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6086) Flaky tests when using PCollectionView with View.asSingleton()

2018-11-18 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690998#comment-16690998
 ] 

Kenneth Knowles commented on BEAM-6086:
---

Hmm, upon consideration, it may be that the main input should wait for the side 
input to be ready.

> Flaky tests when using PCollectionView with View.asSingleton()
> --
>
> Key: BEAM-6086
> URL: https://issues.apache.org/jira/browse/BEAM-6086
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Evgeny
>Assignee: Kenneth Knowles
>Priority: Critical
>
> Here is the minimal project that demonstrates the issue 
> https://github.com/medvedev1088/dataflow-flaky-tests. About 1 time out of 10 
> the tests fail with 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.IllegalArgumentException: Empty PCollection accessed as a singleton 
> view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
>   at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
>   at 
> org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
>   at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
>   at 
> com.beam.MyPipelineTest.testTokenTransfersBasic(MyPipelineTest.java:29)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>   at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> Caused by: java.lang.IllegalArgumentException: Empty PCollection accessed as 
> a singleton view. Consider setting withDefault to provide a default value
>   at 
> org.apache.beam.sdk.transforms.View$SingletonCombineFn.identity(View.java:378)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:436)
>   at 
> org.apache.beam.sdk.transforms.Combine$BinaryCombineFn.extractOutput(Combine.java:384)
>   at 
> org.apache.beam.sdk.transforms.Combine$CombineFn.apply(Combine.java:353)
>   at 
> org.apache.beam.sdk.transforms.Combine$GroupedValues$1.processElement(Combine.java:2056)
> Related to https://jira.apache.org/jira/browse/BEAM-6036



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6082) [SQL] Nexmark 5, 7 time out

2018-11-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690060#comment-16690060
 ] 

Kenneth Knowles commented on BEAM-6082:
---

Going ahead with rollback, after confirming it actually fixes.

> [SQL] Nexmark 5, 7 time out
> ---
>
> Key: BEAM-6082
> URL: https://issues.apache.org/jira/browse/BEAM-6082
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Anton Kedin
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.9.0
>
>
> SQL Nexmark Queries 5 and 7 are timing out since around October 29: 
> [https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424] 
>  
> Current suspect is https://github.com/apache/beam/pull/6856



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-6082) [SQL] Nexmark 5, 7 time out

2018-11-16 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-6082:
-

Assignee: Kenneth Knowles  (was: Reuven Lax)

> [SQL] Nexmark 5, 7 time out
> ---
>
> Key: BEAM-6082
> URL: https://issues.apache.org/jira/browse/BEAM-6082
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Anton Kedin
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.9.0
>
>
> SQL Nexmark Queries 5 and 7 are timing out since around October 29: 
> [https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424] 
>  
> Current suspect is https://github.com/apache/beam/pull/6856



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6082) [SQL] Nexmark 5, 7 time out

2018-11-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690067#comment-16690067
 ] 

Kenneth Knowles commented on BEAM-6082:
---

It would look like [https://github.com/apache/beam/pull/6883] which was a 
rolllforward of [https://github.com/apache/beam/pull/6856] but when I roll it 
back I don't see obvious performance recovery.

> [SQL] Nexmark 5, 7 time out
> ---
>
> Key: BEAM-6082
> URL: https://issues.apache.org/jira/browse/BEAM-6082
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Reporter: Anton Kedin
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.9.0
>
>
> SQL Nexmark Queries 5 and 7 are timing out since around October 29: 
> [https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424] 
>  
> Current suspect is https://github.com/apache/beam/pull/6856



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4439) Do not convert a join that we cannot support

2018-11-16 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689947#comment-16689947
 ] 

Kenneth Knowles commented on BEAM-4439:
---

I like the general approach of changing the rule matching. Unfortunately, 
Calcite gives this error message, which is a little unclear. This error means 
that there was no way to apply the existing rules to create a non-FULL join.

> Do not convert a join that we cannot support
> 
>
> Key: BEAM-4439
> URL: https://issues.apache.org/jira/browse/BEAM-4439
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: starter
>
> The BeamJoinRule matches all LogicalJoin operators. If we make it not match a 
> full join then Calcite should look for a different plan when it converts to 
> BeamLogicalConvention. That is a good step to make sure we can support things 
> and also force the logical optimizer to choose non-full joins.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5729) Create ability to read/write database implementing database/sql contract

2018-11-16 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-5729:
-

Assignee: Adrian Witas

> Create ability to read/write database implementing database/sql  contract
> -
>
> Key: BEAM-5729
> URL: https://issues.apache.org/jira/browse/BEAM-5729
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.7.0
>Reporter: Adrian Witas
>Assignee: Adrian Witas
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-4750) Beam performance degraded significantly since 2.4

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-4750:
-

Assignee: Kenneth Knowles  (was: Jean-Baptiste Onofré)

> Beam performance degraded significantly since 2.4
> -
>
> Key: BEAM-4750
> URL: https://issues.apache.org/jira/browse/BEAM-4750
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.9.0
>Reporter: Vojtech Janota
>Assignee: Kenneth Knowles
>Priority: Critical
>
> Starting from Beam 2.4 onwards the *InMemoryStateInternals* class in the 
> *beam-runners-core-java* module does an expensive Coder encode/decode 
> operation when copying object state. This has significant impact on 
> performance and pipelines that previously took low minutes do not finish 
> within hours in our case. Based on the discussion on the dev mailing list, 
> the main motivation for this change was to enforce Coder sanity, something 
> that should arguably remain within the realm of the DirectRunner and should 
> not leak into the core layer.
> Links to commits that introduced the new behaviour:
>  * [https://github.com/apache/beam/commit/32a427c]
>  * [https://github.com/apache/beam/commit/8151d82]
>  
> Additional details and surrounding discussion can be found here:
>  * 
> https://lists.apache.org/thread.html/1e329318a4dafe27b9ff304d9460d05d8966e5ceebaf4ebfb948e2b8@%3Cdev.beam.apache.org%3E
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6185) Upgrade to Spark 2.4.0

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732324#comment-16732324
 ] 

Kenneth Knowles commented on BEAM-6185:
---

This is marked for 2.10.0. But I would not want to make a major upgrade like 
this without some bake time. Do you have high confidence in the PR? [~iemejia] 
[~jbonofre]

> Upgrade to Spark 2.4.0
> --
>
> Key: BEAM-6185
> URL: https://issues.apache.org/jira/browse/BEAM-6185
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.10.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5642) test_pardo_state_only flaky (times out)

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5642:
--
Labels: flake  (was: )

> test_pardo_state_only flaky (times out)
> ---
>
> Key: BEAM-5642
> URL: https://issues.apache.org/jira/browse/BEAM-5642
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
>Priority: Major
>  Labels: flake
> Fix For: 2.10.0
>
>
> [https://builds.apache.org/job/beam_PreCommit_Python_Commit/1577/consoleFull]
>  
> *16:43:20* 
> ==*16:43:20*
>  ERROR: test_pardo_state_only 
> (apache_beam.runners.portability.portable_runner_test.PortableRunnerTest)*16:43:20*
>  
> --*16:43:20*
>  Traceback (most recent call last):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py",
>  line 265, in test_pardo_state_only*16:43:20* 
> equal_to(expected))*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/pipeline.py",
>  line 423, in __exit__*16:43:20* self.run().wait_until_finish()*16:43:20* 
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 242, in wait_until_finish*16:43:20* 
> beam_job_api_pb2.GetJobStateRequest(job_id=self._job_id)):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 363, in __next__*16:43:20* return self._next()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 348, in _next*16:43:20* self._state.condition.wait()*16:43:20*   
> File "/usr/lib/python3.5/threading.py", line 293, in wait*16:43:20* 
> waiter.acquire()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner_test.py",
>  line 68, in handler*16:43:20* raise BaseException(msg)*16:43:20* 
> BaseException: Timed out after 30 seconds.*16:43:20*  >> 
> begin captured stdout << -



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5442) PortableRunner swallows custom options for Runner

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5442:
--
Fix Version/s: (was: 2.10.0)

> PortableRunner swallows custom options for Runner
> -
>
> Key: BEAM-5442
> URL: https://issues.apache.org/jira/browse/BEAM-5442
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability, portability-flink
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> The PortableRunner doesn't pass custom PipelineOptions to the executing 
> Runner.
> Example: {{--parallelism=4}} won't be forwarded to the FlinkRunner.
> (The option is just removed during proto translation without any warning)
> We should allow some form of customization through the options, even for the 
> PortableRunner. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5642) test_pardo_state_only flaky (times out)

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5642:
--
Issue Type: Bug  (was: Improvement)

> test_pardo_state_only flaky (times out)
> ---
>
> Key: BEAM-5642
> URL: https://issues.apache.org/jira/browse/BEAM-5642
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
>Priority: Critical
>  Labels: flake
> Fix For: 2.10.0
>
>
> [https://builds.apache.org/job/beam_PreCommit_Python_Commit/1577/consoleFull]
>  
> *16:43:20* 
> ==*16:43:20*
>  ERROR: test_pardo_state_only 
> (apache_beam.runners.portability.portable_runner_test.PortableRunnerTest)*16:43:20*
>  
> --*16:43:20*
>  Traceback (most recent call last):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py",
>  line 265, in test_pardo_state_only*16:43:20* 
> equal_to(expected))*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/pipeline.py",
>  line 423, in __exit__*16:43:20* self.run().wait_until_finish()*16:43:20* 
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 242, in wait_until_finish*16:43:20* 
> beam_job_api_pb2.GetJobStateRequest(job_id=self._job_id)):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 363, in __next__*16:43:20* return self._next()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 348, in _next*16:43:20* self._state.condition.wait()*16:43:20*   
> File "/usr/lib/python3.5/threading.py", line 293, in wait*16:43:20* 
> waiter.acquire()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner_test.py",
>  line 68, in handler*16:43:20* raise BaseException(msg)*16:43:20* 
> BaseException: Timed out after 30 seconds.*16:43:20*  >> 
> begin captured stdout << -



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5642) test_pardo_state_only flaky (times out)

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5642:
--
Priority: Critical  (was: Major)

> test_pardo_state_only flaky (times out)
> ---
>
> Key: BEAM-5642
> URL: https://issues.apache.org/jira/browse/BEAM-5642
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
>Priority: Critical
>  Labels: flake
> Fix For: 2.10.0
>
>
> [https://builds.apache.org/job/beam_PreCommit_Python_Commit/1577/consoleFull]
>  
> *16:43:20* 
> ==*16:43:20*
>  ERROR: test_pardo_state_only 
> (apache_beam.runners.portability.portable_runner_test.PortableRunnerTest)*16:43:20*
>  
> --*16:43:20*
>  Traceback (most recent call last):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py",
>  line 265, in test_pardo_state_only*16:43:20* 
> equal_to(expected))*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/pipeline.py",
>  line 423, in __exit__*16:43:20* self.run().wait_until_finish()*16:43:20* 
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 242, in wait_until_finish*16:43:20* 
> beam_job_api_pb2.GetJobStateRequest(job_id=self._job_id)):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 363, in __next__*16:43:20* return self._next()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 348, in _next*16:43:20* self._state.condition.wait()*16:43:20*   
> File "/usr/lib/python3.5/threading.py", line 293, in wait*16:43:20* 
> waiter.acquire()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner_test.py",
>  line 68, in handler*16:43:20* raise BaseException(msg)*16:43:20* 
> BaseException: Timed out after 30 seconds.*16:43:20*  >> 
> begin captured stdout << -



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5642) test_pardo_state_only flaky (times out)

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732392#comment-16732392
 ] 

Kenneth Knowles commented on BEAM-5642:
---

[~robertwb] are you actively working on this? Consider releasing it if not.

> test_pardo_state_only flaky (times out)
> ---
>
> Key: BEAM-5642
> URL: https://issues.apache.org/jira/browse/BEAM-5642
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
>Priority: Critical
>  Labels: flake
>
> [https://builds.apache.org/job/beam_PreCommit_Python_Commit/1577/consoleFull]
>  
> *16:43:20* 
> ==*16:43:20*
>  ERROR: test_pardo_state_only 
> (apache_beam.runners.portability.portable_runner_test.PortableRunnerTest)*16:43:20*
>  
> --*16:43:20*
>  Traceback (most recent call last):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py",
>  line 265, in test_pardo_state_only*16:43:20* 
> equal_to(expected))*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/pipeline.py",
>  line 423, in __exit__*16:43:20* self.run().wait_until_finish()*16:43:20* 
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 242, in wait_until_finish*16:43:20* 
> beam_job_api_pb2.GetJobStateRequest(job_id=self._job_id)):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 363, in __next__*16:43:20* return self._next()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 348, in _next*16:43:20* self._state.condition.wait()*16:43:20*   
> File "/usr/lib/python3.5/threading.py", line 293, in wait*16:43:20* 
> waiter.acquire()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner_test.py",
>  line 68, in handler*16:43:20* raise BaseException(msg)*16:43:20* 
> BaseException: Timed out after 30 seconds.*16:43:20*  >> 
> begin captured stdout << -



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5642) test_pardo_state_only flaky (times out)

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732391#comment-16732391
 ] 

Kenneth Knowles commented on BEAM-5642:
---

Raising priority per flake policy, and removing from release blocking list.

> test_pardo_state_only flaky (times out)
> ---
>
> Key: BEAM-5642
> URL: https://issues.apache.org/jira/browse/BEAM-5642
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Robert Bradshaw
>Priority: Critical
>  Labels: flake
>
> [https://builds.apache.org/job/beam_PreCommit_Python_Commit/1577/consoleFull]
>  
> *16:43:20* 
> ==*16:43:20*
>  ERROR: test_pardo_state_only 
> (apache_beam.runners.portability.portable_runner_test.PortableRunnerTest)*16:43:20*
>  
> --*16:43:20*
>  Traceback (most recent call last):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py",
>  line 265, in test_pardo_state_only*16:43:20* 
> equal_to(expected))*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/pipeline.py",
>  line 423, in __exit__*16:43:20* self.run().wait_until_finish()*16:43:20* 
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 242, in wait_until_finish*16:43:20* 
> beam_job_api_pb2.GetJobStateRequest(job_id=self._job_id)):*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 363, in __next__*16:43:20* return self._next()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/target/.tox/py3/lib/python3.5/site-packages/grpc/_channel.py",
>  line 348, in _next*16:43:20* self._state.condition.wait()*16:43:20*   
> File "/usr/lib/python3.5/threading.py", line 293, in wait*16:43:20* 
> waiter.acquire()*16:43:20*   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/runners/portability/portable_runner_test.py",
>  line 68, in handler*16:43:20* raise BaseException(msg)*16:43:20* 
> BaseException: Timed out after 30 seconds.*16:43:20*  >> 
> begin captured stdout << -



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5723) CassandraIO is broken because of use of bad relocation of guava

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5723:
--
Fix Version/s: 2.7.1

> CassandraIO is broken because of use of bad relocation of guava
> ---
>
> Key: BEAM-5723
> URL: https://issues.apache.org/jira/browse/BEAM-5723
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-cassandra
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Arun sethia
>Assignee: João Cabrita
>Priority: Major
> Fix For: 2.7.1, 2.10.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> While using apache beam to run dataflow job to read data from BigQuery and 
> Store/Write to Cassandra with following libaries:
>  # beam-sdks-java-io-cassandra - 2.6.0
>  # beam-sdks-java-io-jdbc - 2.6.0
>  # beam-sdks-java-io-google-cloud-platform - 2.6.0
>  # beam-sdks-java-core - 2.6.0
>  # google-cloud-dataflow-java-sdk-all - 2.5.0
>  # google-api-client -1.25.0
>  
> I am getting following error at the time insert/save data to Cassandra.
> {code:java}
> [error] (run-main-0) org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.NoSuchMethodError: 
> com.datastax.driver.mapping.Mapper.saveAsync(Ljava/lang/Object;)Lorg/apache/beam/repackaged/beam_sdks_java_io_cassandra/com/google/common/util/concurrent/ListenableFuture;
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.NoSuchMethodError: 
> com.datastax.driver.mapping.Mapper.saveAsync(Ljava/lang/Object;)Lorg/apache/beam/repackaged/beam_sdks_java_io_cassandra/com/google/common/util/concurrent/ListenableFuture;
>  at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
>  at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
>  at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
>  at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6315) CountingSource returns CheckpointMark with null date

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6315:
--
Component/s: (was: sdk-java-core)

> CountingSource returns CheckpointMark with null date
> 
>
> Key: BEAM-6315
> URL: https://issues.apache.org/jira/browse/BEAM-6315
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 2.10.0
>
>
> When start has not been called on the Readers of CountingSource, the 
> checkpoint contains a null value which lets the AvroCoder fail with a 
> NullpointerException.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4750) Beam performance degraded significantly since 2.4

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732385#comment-16732385
 ] 

Kenneth Knowles commented on BEAM-4750:
---

This has been around for a long time, though, so I don't think it is a 2.10.0 
blocker.

> Beam performance degraded significantly since 2.4
> -
>
> Key: BEAM-4750
> URL: https://issues.apache.org/jira/browse/BEAM-4750
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Vojtech Janota
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.10.0
>
>
> Starting from Beam 2.4 onwards the *InMemoryStateInternals* class in the 
> *beam-runners-core-java* module does an expensive Coder encode/decode 
> operation when copying object state. This has significant impact on 
> performance and pipelines that previously took low minutes do not finish 
> within hours in our case. Based on the discussion on the dev mailing list, 
> the main motivation for this change was to enforce Coder sanity, something 
> that should arguably remain within the realm of the DirectRunner and should 
> not leak into the core layer.
> Links to commits that introduced the new behaviour:
>  * [https://github.com/apache/beam/commit/32a427c]
>  * [https://github.com/apache/beam/commit/8151d82]
>  
> Additional details and surrounding discussion can be found here:
>  * 
> https://lists.apache.org/thread.html/1e329318a4dafe27b9ff304d9460d05d8966e5ceebaf4ebfb948e2b8@%3Cdev.beam.apache.org%3E
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-4750) Beam performance degraded significantly since 2.4

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4750:
--
Affects Version/s: 2.6.0
   2.7.0
   2.8.0
   2.9.0

> Beam performance degraded significantly since 2.4
> -
>
> Key: BEAM-4750
> URL: https://issues.apache.org/jira/browse/BEAM-4750
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.9.0
>Reporter: Vojtech Janota
>Assignee: Jean-Baptiste Onofré
>Priority: Critical
> Fix For: 2.10.0
>
>
> Starting from Beam 2.4 onwards the *InMemoryStateInternals* class in the 
> *beam-runners-core-java* module does an expensive Coder encode/decode 
> operation when copying object state. This has significant impact on 
> performance and pipelines that previously took low minutes do not finish 
> within hours in our case. Based on the discussion on the dev mailing list, 
> the main motivation for this change was to enforce Coder sanity, something 
> that should arguably remain within the realm of the DirectRunner and should 
> not leak into the core layer.
> Links to commits that introduced the new behaviour:
>  * [https://github.com/apache/beam/commit/32a427c]
>  * [https://github.com/apache/beam/commit/8151d82]
>  
> Additional details and surrounding discussion can be found here:
>  * 
> https://lists.apache.org/thread.html/1e329318a4dafe27b9ff304d9460d05d8966e5ceebaf4ebfb948e2b8@%3Cdev.beam.apache.org%3E
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-4750) Beam performance degraded significantly since 2.4

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4750:
--
Priority: Critical  (was: Major)

> Beam performance degraded significantly since 2.4
> -
>
> Key: BEAM-4750
> URL: https://issues.apache.org/jira/browse/BEAM-4750
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Vojtech Janota
>Assignee: Jean-Baptiste Onofré
>Priority: Critical
> Fix For: 2.10.0
>
>
> Starting from Beam 2.4 onwards the *InMemoryStateInternals* class in the 
> *beam-runners-core-java* module does an expensive Coder encode/decode 
> operation when copying object state. This has significant impact on 
> performance and pipelines that previously took low minutes do not finish 
> within hours in our case. Based on the discussion on the dev mailing list, 
> the main motivation for this change was to enforce Coder sanity, something 
> that should arguably remain within the realm of the DirectRunner and should 
> not leak into the core layer.
> Links to commits that introduced the new behaviour:
>  * [https://github.com/apache/beam/commit/32a427c]
>  * [https://github.com/apache/beam/commit/8151d82]
>  
> Additional details and surrounding discussion can be found here:
>  * 
> https://lists.apache.org/thread.html/1e329318a4dafe27b9ff304d9460d05d8966e5ceebaf4ebfb948e2b8@%3Cdev.beam.apache.org%3E
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4750) Beam performance degraded significantly since 2.4

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732383#comment-16732383
 ] 

Kenneth Knowles commented on BEAM-4750:
---

This has been carried from version to version with no action. It seems clear 
that InMemoryStateInternals is used by a lot more than the direct runner, so it 
is not OK to introduce perf regressions just for the direct runner.

> Beam performance degraded significantly since 2.4
> -
>
> Key: BEAM-4750
> URL: https://issues.apache.org/jira/browse/BEAM-4750
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.4.0, 2.5.0
>Reporter: Vojtech Janota
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.10.0
>
>
> Starting from Beam 2.4 onwards the *InMemoryStateInternals* class in the 
> *beam-runners-core-java* module does an expensive Coder encode/decode 
> operation when copying object state. This has significant impact on 
> performance and pipelines that previously took low minutes do not finish 
> within hours in our case. Based on the discussion on the dev mailing list, 
> the main motivation for this change was to enforce Coder sanity, something 
> that should arguably remain within the realm of the DirectRunner and should 
> not leak into the core layer.
> Links to commits that introduced the new behaviour:
>  * [https://github.com/apache/beam/commit/32a427c]
>  * [https://github.com/apache/beam/commit/8151d82]
>  
> Additional details and surrounding discussion can be found here:
>  * 
> https://lists.apache.org/thread.html/1e329318a4dafe27b9ff304d9460d05d8966e5ceebaf4ebfb948e2b8@%3Cdev.beam.apache.org%3E
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-4750) Beam performance degraded significantly since 2.4

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4750:
--
Fix Version/s: (was: 2.10.0)

> Beam performance degraded significantly since 2.4
> -
>
> Key: BEAM-4750
> URL: https://issues.apache.org/jira/browse/BEAM-4750
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.4.0, 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.9.0
>Reporter: Vojtech Janota
>Assignee: Jean-Baptiste Onofré
>Priority: Critical
>
> Starting from Beam 2.4 onwards the *InMemoryStateInternals* class in the 
> *beam-runners-core-java* module does an expensive Coder encode/decode 
> operation when copying object state. This has significant impact on 
> performance and pipelines that previously took low minutes do not finish 
> within hours in our case. Based on the discussion on the dev mailing list, 
> the main motivation for this change was to enforce Coder sanity, something 
> that should arguably remain within the realm of the DirectRunner and should 
> not leak into the core layer.
> Links to commits that introduced the new behaviour:
>  * [https://github.com/apache/beam/commit/32a427c]
>  * [https://github.com/apache/beam/commit/8151d82]
>  
> Additional details and surrounding discussion can be found here:
>  * 
> https://lists.apache.org/thread.html/1e329318a4dafe27b9ff304d9460d05d8966e5ceebaf4ebfb948e2b8@%3Cdev.beam.apache.org%3E
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4518) Errors when running Python Game stats with a low fixed_window_duration in the DirectRunner

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732381#comment-16732381
 ] 

Kenneth Knowles commented on BEAM-4518:
---

Do we know more about this? If it is a flake, I am going to alter its priority 
and tagging.

> Errors when running Python Game stats with a low fixed_window_duration in the 
> DirectRunner
> --
>
> Key: BEAM-4518
> URL: https://issues.apache.org/jira/browse/BEAM-4518
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.5.0, 2.6.0
>Reporter: Ahmet Altay
>Assignee: Charles Chen
>Priority: Major
> Fix For: 2.10.0
>
>
> Using the injector and the following command to start the DirectRunner 
> pipeline:
> python -m apache_beam.examples.complete.game.game_stats \
> --project=google.com:clouddfe \
> --topic projects/google.com:clouddfe/topics/leader_board-$USER-topic-1 \
> --dataset ${USER}_test --fixed_window_duration 1
> Fails with:
> ValueError: PCollection of size 2 with more than one element accessed as a 
> singleton view. First two elements encountered are "13.93", "11.78". 
> [while running 'CalculateSpammyUsers/ProcessAndFilter']
> Offending code is here:
> global_mean_score = (
>  sum_scores
>  | beam.Values()
>  | beam.CombineGlobally(beam.combiners.MeanCombineFn())\
>  .as_singleton_view())
>  # Filter the user sums using the global mean.
>  filtered = (
>  sum_scores
>  # Use the derived mean total score (global_mean_score) as a side input.
>  | 'ProcessAndFilter' >> beam.Filter(
>  lambda key_score, global_mean:\
>  key_score[1] > global_mean * self.SCORE_WEIGHT,
>  global_mean_score))
> Since global_mean_score is the result of CombineGlobally, this is either an 
> issue with CombineGlobally or side inputs implementation in DirectRunner. The 
> latter is more likely since it works on DataflowRunner.
> cc: [~mariagh]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6030) Remove Graphite Metrics Sink specific methods from PipelineOptions

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732377#comment-16732377
 ] 

Kenneth Knowles commented on BEAM-6030:
---

If I understand correctly, this is in Java SDK core so I changed the component, 
and I removed the Fix Version so it will not block a release.

> Remove Graphite Metrics Sink specific methods from PipelineOptions
> --
>
> Key: BEAM-6030
> URL: https://issues.apache.org/jira/browse/BEAM-6030
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Federico Mendez
>Assignee: Etienne Chauchot
>Priority: Critical
>
> https://issues.apache.org/jira/browse/BEAM-4553 introduced methods in the 
> base PipelineOptions interfaces that are specific to Graphite 
> get/setMetricsGraphiteHost  and get/setMetricsGraphitePort.
> I'd suggest to those elements should be moved to a 
> PipelineOptionsGraphiteMetricsSink interfaces to avoid having technology 
> specific methods in base classes/interfaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-4518) Errors when running Python Game stats with a low fixed_window_duration in the DirectRunner

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4518:
--
Priority: Critical  (was: Major)

> Errors when running Python Game stats with a low fixed_window_duration in the 
> DirectRunner
> --
>
> Key: BEAM-4518
> URL: https://issues.apache.org/jira/browse/BEAM-4518
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.5.0, 2.6.0
>Reporter: Ahmet Altay
>Assignee: Charles Chen
>Priority: Critical
>  Labels: flake
> Fix For: 2.10.0
>
>
> Using the injector and the following command to start the DirectRunner 
> pipeline:
> python -m apache_beam.examples.complete.game.game_stats \
> --project=google.com:clouddfe \
> --topic projects/google.com:clouddfe/topics/leader_board-$USER-topic-1 \
> --dataset ${USER}_test --fixed_window_duration 1
> Fails with:
> ValueError: PCollection of size 2 with more than one element accessed as a 
> singleton view. First two elements encountered are "13.93", "11.78". 
> [while running 'CalculateSpammyUsers/ProcessAndFilter']
> Offending code is here:
> global_mean_score = (
>  sum_scores
>  | beam.Values()
>  | beam.CombineGlobally(beam.combiners.MeanCombineFn())\
>  .as_singleton_view())
>  # Filter the user sums using the global mean.
>  filtered = (
>  sum_scores
>  # Use the derived mean total score (global_mean_score) as a side input.
>  | 'ProcessAndFilter' >> beam.Filter(
>  lambda key_score, global_mean:\
>  key_score[1] > global_mean * self.SCORE_WEIGHT,
>  global_mean_score))
> Since global_mean_score is the result of CombineGlobally, this is either an 
> issue with CombineGlobally or side inputs implementation in DirectRunner. The 
> latter is more likely since it works on DataflowRunner.
> cc: [~mariagh]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6030) Remove Graphite Metrics Sink specific methods from PipelineOptions

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6030:
--
Component/s: (was: beam-model)
 sdk-java-core

> Remove Graphite Metrics Sink specific methods from PipelineOptions
> --
>
> Key: BEAM-6030
> URL: https://issues.apache.org/jira/browse/BEAM-6030
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.8.0
>Reporter: Federico Mendez
>Assignee: Etienne Chauchot
>Priority: Critical
>
> https://issues.apache.org/jira/browse/BEAM-4553 introduced methods in the 
> base PipelineOptions interfaces that are specific to Graphite 
> get/setMetricsGraphiteHost  and get/setMetricsGraphitePort.
> I'd suggest to those elements should be moved to a 
> PipelineOptionsGraphiteMetricsSink interfaces to avoid having technology 
> specific methods in base classes/interfaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-6182) Use of conscrypt SSL results in stuck workflows in Dataflow

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732375#comment-16732375
 ] 

Kenneth Knowles commented on BEAM-6182:
---

Is there something we need to do and can do with very high confidence for 
including in 2.10.0?

> Use of conscrypt SSL results in stuck workflows in Dataflow
> ---
>
> Key: BEAM-6182
> URL: https://issues.apache.org/jira/browse/BEAM-6182
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Ahmet Altay
>Assignee: Tyler Akidau
>Priority: Blocker
> Fix For: 2.10.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> An experimental flag is being added to disable it for now with an option to 
> enable it per-workflow.
> Also related:
> https://issues.apache.org/jira/browse/BEAM-5747 - Upgrade conscrypt to its 
> latest version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6030) Remove Graphite Metrics Sink specific methods from PipelineOptions

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6030:
--
Fix Version/s: (was: 2.10.0)

> Remove Graphite Metrics Sink specific methods from PipelineOptions
> --
>
> Key: BEAM-6030
> URL: https://issues.apache.org/jira/browse/BEAM-6030
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Affects Versions: 2.8.0
>Reporter: Federico Mendez
>Assignee: Etienne Chauchot
>Priority: Critical
>
> https://issues.apache.org/jira/browse/BEAM-4553 introduced methods in the 
> base PipelineOptions interfaces that are specific to Graphite 
> get/setMetricsGraphiteHost  and get/setMetricsGraphitePort.
> I'd suggest to those elements should be moved to a 
> PipelineOptionsGraphiteMetricsSink interfaces to avoid having technology 
> specific methods in base classes/interfaces.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5723) CassandraIO is broken because of use of bad relocation of guava

2019-01-02 Thread Kenneth Knowles (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732394#comment-16732394
 ] 

Kenneth Knowles commented on BEAM-5723:
---

Noting here that I agree this is good candidate for a release blocker for 
2.10.0. Is it true that this affects all versions after 2.5.0?

> CassandraIO is broken because of use of bad relocation of guava
> ---
>
> Key: BEAM-5723
> URL: https://issues.apache.org/jira/browse/BEAM-5723
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-cassandra
>Affects Versions: 2.5.0, 2.6.0, 2.7.0
>Reporter: Arun sethia
>Assignee: João Cabrita
>Priority: Major
> Fix For: 2.10.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> While using apache beam to run dataflow job to read data from BigQuery and 
> Store/Write to Cassandra with following libaries:
>  # beam-sdks-java-io-cassandra - 2.6.0
>  # beam-sdks-java-io-jdbc - 2.6.0
>  # beam-sdks-java-io-google-cloud-platform - 2.6.0
>  # beam-sdks-java-core - 2.6.0
>  # google-cloud-dataflow-java-sdk-all - 2.5.0
>  # google-api-client -1.25.0
>  
> I am getting following error at the time insert/save data to Cassandra.
> {code:java}
> [error] (run-main-0) org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.NoSuchMethodError: 
> com.datastax.driver.mapping.Mapper.saveAsync(Ljava/lang/Object;)Lorg/apache/beam/repackaged/beam_sdks_java_io_cassandra/com/google/common/util/concurrent/ListenableFuture;
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.NoSuchMethodError: 
> com.datastax.driver.mapping.Mapper.saveAsync(Ljava/lang/Object;)Lorg/apache/beam/repackaged/beam_sdks_java_io_cassandra/com/google/common/util/concurrent/ListenableFuture;
>  at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
>  at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
>  at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
>  at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5723) CassandraIO is broken because of use of bad relocation of guava

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-5723:
--
Affects Version/s: 2.8.0
   2.9.0

> CassandraIO is broken because of use of bad relocation of guava
> ---
>
> Key: BEAM-5723
> URL: https://issues.apache.org/jira/browse/BEAM-5723
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-cassandra
>Affects Versions: 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.9.0
>Reporter: Arun sethia
>Assignee: João Cabrita
>Priority: Major
> Fix For: 2.7.1, 2.10.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> While using apache beam to run dataflow job to read data from BigQuery and 
> Store/Write to Cassandra with following libaries:
>  # beam-sdks-java-io-cassandra - 2.6.0
>  # beam-sdks-java-io-jdbc - 2.6.0
>  # beam-sdks-java-io-google-cloud-platform - 2.6.0
>  # beam-sdks-java-core - 2.6.0
>  # google-cloud-dataflow-java-sdk-all - 2.5.0
>  # google-api-client -1.25.0
>  
> I am getting following error at the time insert/save data to Cassandra.
> {code:java}
> [error] (run-main-0) org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.NoSuchMethodError: 
> com.datastax.driver.mapping.Mapper.saveAsync(Ljava/lang/Object;)Lorg/apache/beam/repackaged/beam_sdks_java_io_cassandra/com/google/common/util/concurrent/ListenableFuture;
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: 
> java.lang.NoSuchMethodError: 
> com.datastax.driver.mapping.Mapper.saveAsync(Ljava/lang/Object;)Lorg/apache/beam/repackaged/beam_sdks_java_io_cassandra/com/google/common/util/concurrent/ListenableFuture;
>  at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332)
>  at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302)
>  at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197)
>  at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6021) Registered more internal classes for kryo serialization

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6021:
--
Affects Version/s: 2.9.0

> Registered more internal classes for kryo serialization
> ---
>
> Key: BEAM-6021
> URL: https://issues.apache.org/jira/browse/BEAM-6021
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Affects Versions: 2.8.0, 2.9.0
>Reporter: Marek Simunek
>Assignee: Marek Simunek
>Priority: Major
> Fix For: 2.10.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> When using {{KryoSerializer}} we could improve serialization performance with 
> registering internal classes used in SparkRunner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6021) Registered more internal classes for kryo serialization

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6021:
--
Affects Version/s: (was: 2.9.0)
   (was: 2.8.0)

> Registered more internal classes for kryo serialization
> ---
>
> Key: BEAM-6021
> URL: https://issues.apache.org/jira/browse/BEAM-6021
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Marek Simunek
>Assignee: Marek Simunek
>Priority: Major
> Fix For: 2.10.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> When using {{KryoSerializer}} we could improve serialization performance with 
> registering internal classes used in SparkRunner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-4518) Errors when running Python Game stats with a low fixed_window_duration in the DirectRunner

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-4518:
--
Labels: flake  (was: )

> Errors when running Python Game stats with a low fixed_window_duration in the 
> DirectRunner
> --
>
> Key: BEAM-4518
> URL: https://issues.apache.org/jira/browse/BEAM-4518
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.5.0, 2.6.0
>Reporter: Ahmet Altay
>Assignee: Charles Chen
>Priority: Major
>  Labels: flake
> Fix For: 2.10.0
>
>
> Using the injector and the following command to start the DirectRunner 
> pipeline:
> python -m apache_beam.examples.complete.game.game_stats \
> --project=google.com:clouddfe \
> --topic projects/google.com:clouddfe/topics/leader_board-$USER-topic-1 \
> --dataset ${USER}_test --fixed_window_duration 1
> Fails with:
> ValueError: PCollection of size 2 with more than one element accessed as a 
> singleton view. First two elements encountered are "13.93", "11.78". 
> [while running 'CalculateSpammyUsers/ProcessAndFilter']
> Offending code is here:
> global_mean_score = (
>  sum_scores
>  | beam.Values()
>  | beam.CombineGlobally(beam.combiners.MeanCombineFn())\
>  .as_singleton_view())
>  # Filter the user sums using the global mean.
>  filtered = (
>  sum_scores
>  # Use the derived mean total score (global_mean_score) as a side input.
>  | 'ProcessAndFilter' >> beam.Filter(
>  lambda key_score, global_mean:\
>  key_score[1] > global_mean * self.SCORE_WEIGHT,
>  global_mean_score))
> Since global_mean_score is the result of CombineGlobally, this is either an 
> issue with CombineGlobally or side inputs implementation in DirectRunner. The 
> latter is more likely since it works on DataflowRunner.
> cc: [~mariagh]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-6241) MongoDbIO - Add Limit and Aggregates Support

2019-01-02 Thread Kenneth Knowles (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-6241:
--
Fix Version/s: 2.10.0

> MongoDbIO - Add Limit and Aggregates Support
> 
>
> Key: BEAM-6241
> URL: https://issues.apache.org/jira/browse/BEAM-6241
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-mongodb
>Affects Versions: 2.9.0
>Reporter: Ahmed El.Hussaini
>Assignee: Ahmed El.Hussaini
>Priority: Major
>  Labels: easyfix
> Fix For: 2.10.0
>
>
> h2. Adds Support to Limit Results
>  
> {code:java}
> MongoDbIO.read()
> .withUri("mongodb://localhost:" + port)
> .withDatabase(DATABASE)
> .withCollection(COLLECTION)
> .withFilter("{\"scientist\":\"Einstein\"}")
> .withLimit(5));{code}
> h2. Adds Support to Use Aggregates
>  
> {code:java}
> List aggregates = new ArrayList();
>   aggregates.add(
> new BsonDocument(
>   "$match",
>   new BsonDocument("country", new BsonDocument("$eq", new 
> BsonString("England");
> PCollection output =
>   pipeline.apply(
> MongoDbIO.read()
>   .withUri("mongodb://localhost:" + port)
>   .withDatabase(DATABASE)
>   .withCollection(COLLECTION)
>   .withAggregate(aggregates));
> {code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


<    1   2   3   4   5   6   7   8   9   10   >