[jira] [Commented] (BEAM-4409) NoSuchMethodException reading from JmsIO

2020-01-07 Thread Neil Kolban (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010378#comment-17010378
 ] 

Neil Kolban commented on BEAM-4409:
---

What is the status of this issue?  I see it's been open for 20 months.  I too am 
suffering from the exact same issue.

> NoSuchMethodException reading from JmsIO
> 
>
> Key: BEAM-4409
> URL: https://issues.apache.org/jira/browse/BEAM-4409
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.4.0
> Environment: Linux, Java 1.8, Beam 2.4, Direct Runner, ActiveMQ
>Reporter: Edward Pricer
>Priority: Major
>
> Running with the DirectRunner, and reading from a queue with JmsIO as an 
> unbounded source will produce a NoSuchMethodException. This occurs as the 
> UnboundedReadEvaluatorFactory.UnboundedReadEvaluator attempts to clone the 
> JmsCheckpointMark with the default (Avro) coder.
> The following trivial code on the reader side reproduces the error 
> (DirectRunner must be in path). The messages on the queue for this test case 
> were simple TextMessages. I found this exception is triggered more readily 
> when messages are published rapidly (~200/second)
> {code:java}
> Pipeline p =
>     Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
> // Read from the queue.
> ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
> PCollection<String> inputStrings =
>     p.apply(
>         "Read from queue",
>         JmsIO.<String>readMessage()
>             .withConnectionFactory(factory)
>             .withQueue("somequeue")
>             .withCoder(StringUtf8Coder.of())
>             .withMessageMapper(
>                 (JmsIO.MessageMapper<String>) message -> ((TextMessage) message).getText()));
> // Decode: print each message and pass it through unchanged.
> PCollection<String> asStrings =
>     inputStrings.apply(
>         "Decode Message",
>         ParDo.of(
>             new DoFn<String, String>() {
>               @ProcessElement
>               public void processElement(ProcessContext context) {
>                 System.out.println(context.element());
>                 context.output(context.element());
>               }
>             }));
> p.run();
> {code}
> Stack trace:
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: java.lang.NoSuchMethodException: javax.jms.Message.<init>()
>   at org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:353)
>   at org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:369)
>   at org.apache.avro.reflect.ReflectData.newRecord(ReflectData.java:901)
>   at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:212)
>   at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>   at org.apache.avro.reflect.ReflectDatumReader.readCollection(ReflectDatumReader.java:219)
>   at org.apache.avro.reflect.ReflectDatumReader.readArray(ReflectDatumReader.java:137)
>   at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:177)
>   at org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:302)
>   at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
>   at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>   at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
>   at org.apache.beam.sdk.coders.AvroCoder.decode(AvroCoder.java:318)
>   at org.apache.beam.sdk.coders.Coder.decode(Coder.java:170)
>   at org.apache.beam.sdk.util.CoderUtils.decodeFromSafeStream(CoderUtils.java:122)
>   at org.apache.beam.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:105)
>   at org.apache.beam.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:99)
>   at org.apache.beam.sdk.util.CoderUtils.clone(CoderUtils.java:148)
>   at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.getReader(UnboundedReadEvaluatorFactory.java:194)
>   at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:124)
>   at org.apache.beam.runners.direct.DirectTransformExecutor.processElements(DirectTransformExecutor.java:161)
>   at org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:125)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NoSuchMethodException: javax.jms.Message.<init>()
>   at java.lang.Class.getConstructor0(Class.java:3082)
>   at java.lang.Class.getDeclaredConstructor(Class.java:2178)
>   at
> 
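The trace above ends in Class.getDeclaredConstructor, called on javax.jms.Message. Since javax.jms.Message is an interface, it has no constructor for reflection to find, which would explain the NoSuchMethodException. The following is a minimal, JDK-only sketch of that behaviour, not the Avro or Beam code path itself. The Message interface and the InterfaceInstantiationDemo class below are hypothetical stand-ins introduced for illustration only.

```java
import java.lang.reflect.Constructor;

/** Hypothetical stand-in for javax.jms.Message: an interface declares no constructors. */
interface Message {
  String getText();
}

final class InterfaceInstantiationDemo {
  /** Returns true if the type declares a no-argument constructor that reflection can find. */
  static boolean hasNoArgConstructor(Class<?> type) {
    try {
      Constructor<?> ctor = type.getDeclaredConstructor();
      return ctor != null;
    } catch (NoSuchMethodException e) {
      // Same exception type as in the reported stack trace.
      return false;
    }
  }

  public static void main(String[] args) {
    // An interface has no constructors at all.
    System.out.println("Message: " + hasNoArgConstructor(Message.class));
    // A concrete class with a public no-arg constructor.
    System.out.println("StringBuilder: " + hasNoArgConstructor(StringBuilder.class));
  }
}
```

If this reading is right, any coder that rebuilds objects through a no-arg constructor would hit the same problem on a field typed as an interface, regardless of which concrete message class was originally stored.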

[jira] [Work logged] (BEAM-6857) Support dynamic timers

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6857?focusedWorklogId=367966=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367966
 ]

ASF GitHub Bot logged work on BEAM-6857:


Author: ASF GitHub Bot
Created on: 08/Jan/20 03:52
Start Date: 08/Jan/20 03:52
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #10506: [BEAM-6857] 
OnTimer/SetTimer Signature Updates
URL: https://github.com/apache/beam/pull/10506
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 367966)
Time Spent: 9h 50m  (was: 9h 40m)

> Support dynamic timers
> --
>
> Key: BEAM-6857
> URL: https://issues.apache.org/jira/browse/BEAM-6857
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> The Beam timers API currently requires each timer to be statically specified 
> in the DoFn. The user must provide a separate callback method per timer. For 
> example:
>  
> {code:java}
> DoFn() {
>   @TimerId("timer1")
>   private final TimerSpec timer1 = TimerSpecs.timer(...);
>   @TimerId("timer2")
>   private final TimerSpec timer2 = TimerSpecs.timer(...);
>   // ... set timers in processElement ...
>   @OnTimer("timer1")
>   public void onTimer1() { ... }
>   @OnTimer("timer2")
>   public void onTimer2() {}
> }
> {code}
>  
> However there are many cases where the user does not know the set of timers 
> statically when writing their code. This happens when the timer tag should be 
> based on the data. It also happens when writing a DSL on top of Beam, where 
> the DSL author has to create DoFns but does not know statically which timers 
> their users will want to set (e.g. Scio).
>  
> The goal is to support dynamic timers, something like the following:
>  
> {code:java}
> DoFn() {
>   @TimerId("timer")
>   private final TimerSpec timer1 = TimerSpecs.dynamicTimer(...);
>   @ProcessElement
>   public void process(@TimerId("timer") DynamicTimer timer) {
>     timer.set("tag1", ts);
>     timer.set("tag2", ts);
>   }
>   @OnTimer("timer")
>   public void onTimer1(@TimerTag String tag) { ... }
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8932) Expose complete Cloud Pub/Sub messages through PubsubIO API

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8932?focusedWorklogId=367961=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367961
 ]

ASF GitHub Bot logged work on BEAM-8932:


Author: ASF GitHub Bot
Created on: 08/Jan/20 03:41
Start Date: 08/Jan/20 03:41
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #10474: 
[BEAM-8932] [BEAM-9036] Revert reverted commit to use PubsubMessage as the 
canonical type in beam client
URL: https://github.com/apache/beam/pull/10474#discussion_r364052652
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubJsonClient.java
 ##
 @@ -171,7 +172,12 @@ public int publish(TopicPath topic, List<OutgoingMessage> outgoingMessages) thro
  List<IncomingMessage> incomingMessages = new ArrayList<>(response.getReceivedMessages().size());
  for (ReceivedMessage message : response.getReceivedMessages()) {
    PubsubMessage pubsubMessage = message.getMessage();
 -  @Nullable Map<String, String> attributes = pubsubMessage.getAttributes();
 +  Map<String, String> attributes;
 +  if (pubsubMessage.getAttributes() != null) {
 +    attributes = pubsubMessage.getAttributes();
 +  } else {
 +    attributes = new HashMap<>();
 
 Review comment:
   This would be more efficient and clear as `Collections.emptyMap()` which is 
immutable and does not involve allocation.
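
The suggestion in the review comment can be sketched as below. This is an illustration under assumptions, not the code that was merged: AttributeDefaults and orEmpty are hypothetical names, and a plain Map<String, String> stands in for the Pub/Sub attribute map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class AttributeDefaults {
  /** Returns the given attributes, or an immutable empty map when they are null. */
  static Map<String, String> orEmpty(Map<String, String> attributes) {
    // Collections.emptyMap() avoids creating a fresh HashMap for every message without attributes.
    return attributes != null ? attributes : Collections.<String, String>emptyMap();
  }

  public static void main(String[] args) {
    Map<String, String> present = new HashMap<>();
    present.put("key", "value");
    System.out.println(orEmpty(present)); // the original map is returned as-is
    System.out.println(orEmpty(null).isEmpty()); // null is replaced by an empty map
  }
}
```

One trade-off to keep in mind: the map returned for the null case is immutable, so a caller that later tries to put entries into it gets an UnsupportedOperationException. That only matters if downstream code mutates the attributes.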
 



Issue Time Tracking
---

Worklog Id: (was: 367961)
Time Spent: 9h 10m  (was: 9h)

> Expose complete Cloud Pub/Sub messages through PubsubIO API
> ---
>
> Key: BEAM-8932
> URL: https://issues.apache.org/jira/browse/BEAM-8932
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> The PubsubIO API only exposes a subset of the fields in the underlying 
> PubsubMessage protocol buffer. To accommodate future feature changes as well 
> as for greater compatibility with code using the Cloud Pub/Sub APIs, a method 
> to read and write these protocol messages should be exposed.





[jira] [Work logged] (BEAM-9030) Bump grpc to 1.26.0

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9030?focusedWorklogId=367959=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367959
 ]

ASF GitHub Bot logged work on BEAM-9030:


Author: ASF GitHub Bot
Created on: 08/Jan/20 03:39
Start Date: 08/Jan/20 03:39
Worklog Time Spent: 10m 
  Work Description: sunjincheng121 commented on issue #10463: [BEAM-9030] 
Bump grpc to 1.26.0
URL: https://github.com/apache/beam/pull/10463#issuecomment-571877625
 
 
   I have updated the PR accordingly, would be great if you have another look :)
 



Issue Time Tracking
---

Worklog Id: (was: 367959)
Time Spent: 5h 10m  (was: 5h)

> Bump grpc to 1.26.0
> ---
>
> Key: BEAM-9030
> URL: https://issues.apache.org/jira/browse/BEAM-9030
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-flink
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> When submitting a Python word count job to a Flink session/standalone cluster 
> repeatedly, the metaspace usage of the Flink task manager increases 
> continuously (about 40MB each time). The reason is that the 
> Beam classes are loaded with the user class loader in Flink, and there are 
> problems with the implementation of `ProcessManager` (from Beam) and 
> `ThreadPoolCache` (from Netty) which may prevent the user class loader from 
> being garbage collected even after the job has finished, eventually causing a 
> metaspace memory leak. You can refer to FLINK-15338[1] for more information.
> Regarding `ProcessManager`, I have created a JIRA, BEAM-9006[2], to track 
> it. Regarding `ThreadPoolCache`, it is a Netty problem and has been fixed 
> in NETTY#8955[3]. Netty 4.1.35.Final already includes this fix, and gRPC 
> 1.22.0 already depends on Netty 4.1.35.Final. So we need to bump the 
> version of gRPC to 1.22.0+ (currently 1.21.0).
>  
> What do you think?
> [1] https://issues.apache.org/jira/browse/FLINK-15338
> [2] https://issues.apache.org/jira/browse/BEAM-9006
> [3] https://github.com/netty/netty/pull/8955
>  





[jira] [Work logged] (BEAM-8932) Expose complete Cloud Pub/Sub messages through PubsubIO API

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8932?focusedWorklogId=367949=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367949
 ]

ASF GitHub Bot logged work on BEAM-8932:


Author: ASF GitHub Bot
Created on: 08/Jan/20 03:16
Start Date: 08/Jan/20 03:16
Worklog Time Spent: 10m 
  Work Description: dpcollins-google commented on issue #10476: 
[BEAM-8932][Cleanup] Move external PubsubIO hooks outside of PubsubIO.
URL: https://github.com/apache/beam/pull/10476#issuecomment-571873039
 
 
   Just rebased this. It should be clearer now that this is just cleanup.
 



Issue Time Tracking
---

Worklog Id: (was: 367949)
Time Spent: 9h  (was: 8h 50m)

> Expose complete Cloud Pub/Sub messages through PubsubIO API
> ---
>
> Key: BEAM-8932
> URL: https://issues.apache.org/jira/browse/BEAM-8932
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-8953) Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8953?focusedWorklogId=367948=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367948
 ]

ASF GitHub Bot logged work on BEAM-8953:


Author: ASF GitHub Bot
Created on: 08/Jan/20 03:15
Start Date: 08/Jan/20 03:15
Worklog Time Spent: 10m 
  Work Description: RyanBerti commented on issue #10360: [BEAM-8953] Extend 
ParquetIO read builders for AvroParquetReader
URL: https://github.com/apache/beam/pull/10360#issuecomment-571872888
 
 
   @aromanenko-dev added javadocs - let me know if it's too wordy / not wordy 
enough?
 



Issue Time Tracking
---

Worklog Id: (was: 367948)
Time Spent: 4h 20m  (was: 4h 10m)

> Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model
> -
>
> Key: BEAM-8953
> URL: https://issues.apache.org/jira/browse/BEAM-8953
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Affects Versions: 2.16.0
>Reporter: Ryan Berti
>Assignee: Ryan Berti
>Priority: Minor
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> When utilizing ParquetIO to deserialize objects into case classes in Scala, 
> we'd like to utilize a downstream converter which takes GenericRecords and 
> converts them to instances of our case classes, rather than relying on 
> ParquetIO to deserialize into the case class via reflection + implementing 
> the IndexedRecord interface.
> The ParquetIO.Read / ParquetIO.ReadFiles Builders currently support a 
> filepattern + schema / schema arguments respectively. When using the Read / 
> ReadFiles Builders with these arguments, the underlying AvroParquetReader 
> object that gets created in the ParquetIO.ReadFiles.ReadFn method defaults to 
> utilizing an AvroReadSupport instance whose GenericData model gets set to 
> SpecificData. We'd like to have the underlying AvroReadSupport utilize 
> the GenericData model, but there's currently no way to force this to happen 
> via the existing ParquetIO Read / ReadFiles builders. 
> I'd like to extend the ParquetIO Read / ReadFiles builders to support a new 
> method allowing users to define a GenericData model, which will then be 
> passed into the AvroParquetReader builder. I've tested and validated that 
> this method allows ParquetIO to generate GenericRecord instances without 
> requiring that the users' classes can be reflectively instantiated and 
> initialized via the IndexedRecord interface.





[jira] [Commented] (BEAM-9064) Add pytype to lint checks

2020-01-07 Thread Chad Dombrova (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010307#comment-17010307
 ] 

Chad Dombrova commented on BEAM-9064:
-

Thoughts (which I will repeat in the mailing list discussion):

I've gotten mypy completely passing on my own branch. It was a lot of work and 
the longer it lingers the more work it becomes.  So let's focus our efforts on 
getting that merged and integrated into the lint jobs _first._  From there we 
will learn a few things:  1) How difficult it is for end users to keep the 
tests passing (i.e. the time and frustration levels with mypy alone), and 2) 
how much more work it will be to also get pytype passing (i.e. how divergent 
the two tools are).  Based on that feedback we can decide whether we want a 
pytype failure to _prevent_ a PR from merging, or merely be informative 
(failures could be something that the Google team tracks on the side through 
metrics/analytics). 

tl;dr: If both tools are actually quite similar, then it shouldn't be much 
more of a burden for users to maintain both.  But if they're not, or users are 
already struggling with mypy on its own, then we should hold off introducing 
pytype as a requirement.

 

> Add pytype to lint checks
> -
>
> Key: BEAM-9064
> URL: https://issues.apache.org/jira/browse/BEAM-9064
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [~chadrik]





[jira] [Work logged] (BEAM-8953) Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8953?focusedWorklogId=367936=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367936
 ]

ASF GitHub Bot logged work on BEAM-8953:


Author: ASF GitHub Bot
Created on: 08/Jan/20 02:53
Start Date: 08/Jan/20 02:53
Worklog Time Spent: 10m 
  Work Description: RyanBerti commented on pull request #10360: [BEAM-8953] 
Extend ParquetIO read builders for AvroParquetReader
URL: https://github.com/apache/beam/pull/10360#discussion_r364045131
 
 

 ##
 File path: 
sdks/java/io/parquet/src/test/java/org/apache/beam/sdk/io/parquet/ParquetIOTest.java
 ##
 @@ -126,4 +128,57 @@ public void testReadDisplayData() {
 
 Assert.assertThat(displayData, hasDisplayItem("filePattern", 
"foo.parquet"));
   }
+
+  public static class TestRecord {
+String name;
+
+public TestRecord(String name) {
+  this.name = name;
+}
+  }
+
+  @Test(expected = 
org.apache.beam.sdk.Pipeline.PipelineExecutionException.class)
 
 Review comment:
   For schemas that include a namespace and name that map back to a valid class 
in the classpath, the AvroRecordConverter.start() method will attempt to create 
an instance of the given class when the SpecificData model is in use (this 
doesn't happen when the name/namespace aren't set). So yes - when the schema is 
obtained with ReflectData, the name and namespace would be 'TestRecord' and 
'beam.sdks.java.io.parquet.test', and since this class doesn't implement a 
no-arg constructor and the other methods expected by SpecificData, the runner 
ends up throwing a NoSuchMethodException. 
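
The failure mode described in this review comment can be reproduced with plain JDK reflection. The sketch below is an illustration only: it does not call Avro or Parquet, and instantiate, NoArgConstructorDemo, and DefaultableRecord are hypothetical names. TestRecord mirrors the shape of the class in the diff above (a single one-argument constructor).

```java
import java.lang.reflect.Constructor;

final class NoArgConstructorDemo {
  /** Same shape as the TestRecord in the test: only a one-argument constructor. */
  static class TestRecord {
    String name;

    TestRecord(String name) {
      this.name = name;
    }
  }

  /** Hypothetical variant that adds the no-arg constructor reflection looks for. */
  static class DefaultableRecord {
    String name;

    DefaultableRecord() {}

    DefaultableRecord(String name) {
      this.name = name;
    }
  }

  /** Creates an instance through the no-arg constructor, as reflective record creation would. */
  static Object instantiate(Class<?> type) throws ReflectiveOperationException {
    Constructor<?> ctor = type.getDeclaredConstructor(); // NoSuchMethodException if absent
    ctor.setAccessible(true);
    return ctor.newInstance();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(instantiate(DefaultableRecord.class).getClass().getSimpleName());
    try {
      instantiate(TestRecord.class);
    } catch (NoSuchMethodException e) {
      System.out.println("TestRecord cannot be created reflectively: " + e);
    }
  }
}
```

This is why the test in the diff expects a PipelineExecutionException when the schema name and namespace resolve to TestRecord on the classpath.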
 



Issue Time Tracking
---

Worklog Id: (was: 367936)
Time Spent: 3h 50m  (was: 3h 40m)

> Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model
> -
>
> Key: BEAM-8953
> URL: https://issues.apache.org/jira/browse/BEAM-8953
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Affects Versions: 2.16.0
>Reporter: Ryan Berti
>Assignee: Ryan Berti
>Priority: Minor
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-8953) Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8953?focusedWorklogId=367937=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367937
 ]

ASF GitHub Bot logged work on BEAM-8953:


Author: ASF GitHub Bot
Created on: 08/Jan/20 02:53
Start Date: 08/Jan/20 02:53
Worklog Time Spent: 10m 
  Work Description: RyanBerti commented on pull request #10360: [BEAM-8953] 
Extend ParquetIO read builders for AvroParquetReader
URL: https://github.com/apache/beam/pull/10360#discussion_r364045168
 
 

 ##
 File path: 
sdks/java/io/parquet/src/main/java/org/apache/beam/sdk/io/parquet/ParquetIO.java
 ##
 @@ -165,6 +171,10 @@ public Read from(String filepattern) {
   return from(ValueProvider.StaticValueProvider.of(filepattern));
 }
 
+public Read withAvroDataModel(GenericData model) {
 
 Review comment:
   Will do.
 



Issue Time Tracking
---

Worklog Id: (was: 367937)
Time Spent: 4h  (was: 3h 50m)

> Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model
> -
>
> Key: BEAM-8953
> URL: https://issues.apache.org/jira/browse/BEAM-8953
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Affects Versions: 2.16.0
>Reporter: Ryan Berti
>Assignee: Ryan Berti
>Priority: Minor
>  Time Spent: 4h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-8953) Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8953?focusedWorklogId=367938=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367938
 ]

ASF GitHub Bot logged work on BEAM-8953:


Author: ASF GitHub Bot
Created on: 08/Jan/20 02:53
Start Date: 08/Jan/20 02:53
Worklog Time Spent: 10m 
  Work Description: RyanBerti commented on pull request #10360: [BEAM-8953] 
Extend ParquetIO read builders for AvroParquetReader
URL: https://github.com/apache/beam/pull/10360#discussion_r364045197
 
 

 ##
 File path: 
sdks/java/io/parquet/src/main/java/org/apache/beam/sdk/io/parquet/ParquetIO.java
 ##
 @@ -192,21 +202,40 @@ public void populateDisplayData(DisplayData.Builder 
builder) {
 @Nullable
 abstract Schema getSchema();
 
+@Nullable
+abstract GenericData getAvroDataModel();
+
+abstract Builder toBuilder();
+
 @AutoValue.Builder
 abstract static class Builder {
   abstract Builder setSchema(Schema schema);
 
+  abstract Builder setAvroDataModel(GenericData model);
+
   abstract ReadFiles build();
 }
 
+public ReadFiles withAvroDataModel(GenericData model) {
 
 Review comment:
   Will do.
 



Issue Time Tracking
---

Worklog Id: (was: 367938)
Time Spent: 4h 10m  (was: 4h)

> Extend ParquetIO.Read/ReadFiles.Builder to support Avro GenericData model
> -
>
> Key: BEAM-8953
> URL: https://issues.apache.org/jira/browse/BEAM-8953
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Affects Versions: 2.16.0
>Reporter: Ryan Berti
>Assignee: Ryan Berti
>Priority: Minor
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>


[jira] [Commented] (BEAM-9064) Add pytype to lint checks

2020-01-07 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010295#comment-17010295
 ] 

Udi Meiri commented on BEAM-9064:
-

I understand that, but what do we do about the internal users that I (and 
others) support that already use a different tool?
Let's discuss this with everyone on the dev mailing list.


> Add pytype to lint checks
> -
>
> Key: BEAM-9064
> URL: https://issues.apache.org/jira/browse/BEAM-9064
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [~chadrik]





[jira] [Work logged] (BEAM-8932) Expose complete Cloud Pub/Sub messages through PubsubIO API

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8932?focusedWorklogId=367935=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367935
 ]

ASF GitHub Bot logged work on BEAM-8932:


Author: ASF GitHub Bot
Created on: 08/Jan/20 02:49
Start Date: 08/Jan/20 02:49
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #10474: [BEAM-8932] 
[BEAM-9036] Revert reverted commit to use PubsubMessage as the canonical type 
in beam client
URL: https://github.com/apache/beam/pull/10474
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 367935)
Time Spent: 8h 50m  (was: 8h 40m)

> Expose complete Cloud Pub/Sub messages through PubsubIO API
> ---
>
> Key: BEAM-8932
> URL: https://issues.apache.org/jira/browse/BEAM-8932
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-8624) Implement FnService for status api in Dataflow runner

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8624?focusedWorklogId=367925=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367925
 ]

ASF GitHub Bot logged work on BEAM-8624:


Author: ASF GitHub Bot
Created on: 08/Jan/20 02:30
Start Date: 08/Jan/20 02:30
Worklog Time Spent: 10m 
  Work Description: y1chi commented on issue #10115: [BEAM-8624] Implement 
Worker Status FnService in Dataflow runner
URL: https://github.com/apache/beam/pull/10115#issuecomment-571863883
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 367925)
Time Spent: 14h 10m  (was: 14h)

> Implement FnService for status api in Dataflow runner
> -
>
> Key: BEAM-8624
> URL: https://issues.apache.org/jira/browse/BEAM-8624
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Major
>  Time Spent: 14h 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9064) Add pytype to lint checks

2020-01-07 Thread Chad Dombrova (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010258#comment-17010258
 ] 

Chad Dombrova commented on BEAM-9064:
-

I am _very_ reluctant to implement this.  I haven't used pytype, but my 
experience working with mypy over the past few years, and following various 
issues and PEPs related to it and typing in general, has taught me there's 
still a lot of room for interpretation and thus variation between type 
checkers.  As a user, it can be quite challenging to solve certain typing 
issues, and I would not be the least bit surprised to find that certain 
problems cannot be solved in a way that satisfies both linters, at least not 
without some seriously ugly workarounds.  We're already asking for our 
contributors to gain a whole new area of expertise in order to get their PRs 
merged – one with a fairly steep learning curve –  I wouldn't want to burden 
them with an additional linter with its own idiosyncrasies.  

 

> Add pytype to lint checks
> -
>
> Key: BEAM-9064
> URL: https://issues.apache.org/jira/browse/BEAM-9064
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [~chadrik]





[jira] [Work logged] (BEAM-8932) Expose complete Cloud Pub/Sub messages through PubsubIO API

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8932?focusedWorklogId=367917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367917
 ]

ASF GitHub Bot logged work on BEAM-8932:


Author: ASF GitHub Bot
Created on: 08/Jan/20 02:02
Start Date: 08/Jan/20 02:02
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10474: [BEAM-8932] 
[BEAM-9036] Revert reverted commit to use PubsubMessage as the canonical type 
in beam client
URL: https://github.com/apache/beam/pull/10474#issuecomment-571857843
 
 
   Run SQL PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367917)
Time Spent: 8h 40m  (was: 8.5h)

> Expose complete Cloud Pub/Sub messages through PubsubIO API
> ---
>
> Key: BEAM-8932
> URL: https://issues.apache.org/jira/browse/BEAM-8932
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> The PubsubIO API only exposes a subset of the fields in the underlying 
> PubsubMessage protocol buffer. To accommodate future feature changes as well 
> as for greater compatibility with code using the Cloud Pub/Sub APIs, a method 
> to read and write these protocol messages should be exposed.





[jira] [Updated] (BEAM-8968) portableWordCount test for Spark/Flink failing: jar not found

2020-01-07 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-8968:
--
Description: 
This affects portableWordCountSparkRunnerBatch, 
portableWordCountFlinkRunnerBatch, and portableWordCountFlinkRunnerStreaming.

22:43:23 RuntimeError: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib/runners/flink/1.9/job-server/build/libs/beam-runners-flink-1.9-job-server-2.19.0-SNAPSHOT.jar
 not found. Please build the server with 
22:43:23   cd 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib;
 ./gradlew runners:flink:1.9:job-server:shadowJar

Note: the error was observed on pull/10378, and not on master. 

  was:
This affects portableWordCountSparkRunnerBatch, 
portableWordCountFlinkRunnerBatch, and portableWordCountFlinkRunnerStreaming.

22:43:23 RuntimeError: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib/runners/flink/1.9/job-server/build/libs/beam-runners-flink-1.9-job-server-2.19.0-SNAPSHOT.jar
 not found. Please build the server with 
22:43:23   cd 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib;
 ./gradlew runners:flink:1.9:job-server:shadowJar



> portableWordCount test for Spark/Flink failing: jar not found
> -
>
> Key: BEAM-8968
> URL: https://issues.apache.org/jira/browse/BEAM-8968
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark
>Reporter: Kyle Weaver
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Labels: currently-failing, portability-flink, portability-spark, 
> test-failure
>
> This affects portableWordCountSparkRunnerBatch, 
> portableWordCountFlinkRunnerBatch, and portableWordCountFlinkRunnerStreaming.
> 22:43:23 RuntimeError: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib/runners/flink/1.9/job-server/build/libs/beam-runners-flink-1.9-job-server-2.19.0-SNAPSHOT.jar
>  not found. Please build the server with 
> 22:43:23   cd 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib;
>  ./gradlew runners:flink:1.9:job-server:shadowJar
> Note: the error was observed on pull/10378, and not on master. 





[jira] [Work logged] (BEAM-9064) Add pytype to lint checks

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9064?focusedWorklogId=367915=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367915
 ]

ASF GitHub Bot logged work on BEAM-9064:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:58
Start Date: 08/Jan/20 01:58
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #10528: [BEAM-9064] Add 
pytype checks to tox
URL: https://github.com/apache/beam/pull/10528
 
 
   Usage: `tox -e py37-pytype`
   
   Still a WIP:
   - TODOs need to be resolved.
   - Lots of warnings disabled for now in setup.cfg.
   - Needs to be added to lint job.
   
   
   

[jira] [Assigned] (BEAM-8968) portableWordCount test for Spark/Flink failing: jar not found

2020-01-07 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reassigned BEAM-8968:
-

Assignee: Valentyn Tymofieiev  (was: Kyle Weaver)

> portableWordCount test for Spark/Flink failing: jar not found
> -
>
> Key: BEAM-8968
> URL: https://issues.apache.org/jira/browse/BEAM-8968
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark
>Reporter: Kyle Weaver
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Labels: currently-failing, portability-flink, portability-spark, 
> test-failure
>
> This affects portableWordCountSparkRunnerBatch, 
> portableWordCountFlinkRunnerBatch, and portableWordCountFlinkRunnerStreaming.
> 22:43:23 RuntimeError: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib/runners/flink/1.9/job-server/build/libs/beam-runners-flink-1.9-job-server-2.19.0-SNAPSHOT.jar
>  not found. Please build the server with 
> 22:43:23   cd 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib;
>  ./gradlew runners:flink:1.9:job-server:shadowJar





[jira] [Commented] (BEAM-8968) portableWordCount test for Spark/Flink failing: jar not found

2020-01-07 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010251#comment-17010251
 ] 

Valentyn Tymofieiev commented on BEAM-8968:
---

Verified that the error is caused by my PR. Removing the line 
"project.installGcpTest.mustRunAfter project.configurations.distTarBall" [1] 
somehow affects the dependency chain for the portable Flink runner tasks, so 
that runners:flink:1.9:job-server:shadowJar no longer runs before 
portableWordCountFlinkRunnerBatch, or something like that. Looking further.

[1] 
https://github.com/apache/beam/blob/8aba30f8c77c0a563d375452185732f73e870f0d/buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy#L1818

> portableWordCount test for Spark/Flink failing: jar not found
> -
>
> Key: BEAM-8968
> URL: https://issues.apache.org/jira/browse/BEAM-8968
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink, runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: currently-failing, portability-flink, portability-spark, 
> test-failure
>
> This affects portableWordCountSparkRunnerBatch, 
> portableWordCountFlinkRunnerBatch, and portableWordCountFlinkRunnerStreaming.
> 22:43:23 RuntimeError: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib/runners/flink/1.9/job-server/build/libs/beam-runners-flink-1.9-job-server-2.19.0-SNAPSHOT.jar
>  not found. Please build the server with 
> 22:43:23   cd 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/build/gradleenv/2022703441/lib;
>  ./gradlew runners:flink:1.9:job-server:shadowJar





[jira] [Updated] (BEAM-9064) Add pytype to lint checks

2020-01-07 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-9064:

Status: Open  (was: Triage Needed)

> Add pytype to lint checks
> -
>
> Key: BEAM-9064
> URL: https://issues.apache.org/jira/browse/BEAM-9064
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, testing
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>
> [~chadrik]





[jira] [Created] (BEAM-9064) Add pytype to lint checks

2020-01-07 Thread Udi Meiri (Jira)
Udi Meiri created BEAM-9064:
---

 Summary: Add pytype to lint checks
 Key: BEAM-9064
 URL: https://issues.apache.org/jira/browse/BEAM-9064
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core, testing
Reporter: Udi Meiri
Assignee: Udi Meiri


[~chadrik]





[jira] [Closed] (BEAM-9062) Improve Beam's assert_that+equal_to error message to describe what elements cause assertion to fail

2020-01-07 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev closed BEAM-9062.
-
Fix Version/s: Not applicable
   Resolution: Fixed

> Improve Beam's assert_that+equal_to error message to describe what elements 
> cause assertion to fail 
> 
>
> Key: BEAM-9062
> URL: https://issues.apache.org/jira/browse/BEAM-9062
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Closed] (BEAM-9061) New release of pyhamcrest==1.10.0 breaks portable Python precommits.

2020-01-07 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev closed BEAM-9061.
-
Fix Version/s: Not applicable
   Resolution: Fixed

> New release of pyhamcrest==1.10.0 breaks portable Python precommits.
> 
>
> Key: BEAM-9061
> URL: https://issues.apache.org/jira/browse/BEAM-9061
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {noformat}
> 08:35:26   File "apache_beam/runners/portability/fn_api_runner_test.py", line 
> 38, in 
> 08:35:26 import hamcrest  # pylint: disable=ungrouped-imports
> 08:35:26   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/hamcrest/__init__.py",
>  line 2, in 
> 08:35:26 from hamcrest.library import *
> 08:35:26   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/hamcrest/library/__init__.py",
>  line 7, in 
> 08:35:26 from hamcrest.library.object import *
> 08:35:26   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/hamcrest/library/object/__init__.py",
>  line 4, in 
> 08:35:26 from .hasproperty import has_properties, has_property
> 08:35:26   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python2_PVR_Flink_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/hamcrest/library/object/hasproperty.py",
>  line 174
> 08:35:26 ),
> 08:35:26 ^
> {noformat}
> rootcause: https://github.com/hamcrest/PyHamcrest/issues/131





[jira] [Work logged] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9041?focusedWorklogId=367904=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367904
 ]

ASF GitHub Bot logged work on BEAM-9041:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:15
Start Date: 08/Jan/20 01:15
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #10492: [BEAM-9041, 
BEAM-9042] SchemaCoder equals should not rely on from/toRowFunction equality
URL: https://github.com/apache/beam/pull/10492#issuecomment-571847154
 
 
   Not at all, sorry I didn't approve sooner.
 



Issue Time Tracking
---

Worklog Id: (was: 367904)
Time Spent: 3h 20m  (was: 3h 10m)

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The SchemaCoder equals implementation relies on the SerializableFunction 
> equals method. This is error-prone because users rarely implement the equals 
> method for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.
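
To illustrate the point in the description above, here is a minimal sketch in plain Java, not Beam's actual SchemaCoder code. SerFn, ToUpper, serialize and sameBytes are hypothetical names introduced only for this example; SerFn stands in for SerializableFunction. It shows that two functions with identical behaviour are not equal under the default equals method, while their Java-serialized bytes do match.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.function.Function;

public class BytesEqualitySketch {
  /** Hypothetical stand-in for a serializable function type. */
  interface SerFn<A, B> extends Function<A, B>, Serializable {}

  /** Like most user-written functions, this class does not override equals(). */
  static final class ToUpper implements SerFn<String, String> {
    private static final long serialVersionUID = 1L;

    @Override
    public String apply(String s) {
      return s.toUpperCase();
    }
  }

  /** Serializes a value with standard Java serialization. */
  static byte[] serialize(Serializable value) {
    try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Compares two values by their serialized form instead of equals(). */
  static boolean sameBytes(Serializable a, Serializable b) {
    return Arrays.equals(serialize(a), serialize(b));
  }

  public static void main(String[] args) {
    SerFn<String, String> first = new ToUpper();
    SerFn<String, String> second = new ToUpper();
    // Object.equals falls back to reference identity.
    System.out.println(first.equals(second)); // prints false
    // Both instances serialize to the same bytes.
    System.out.println(sameBytes(first, second)); // prints true
  }
}
```

Byte comparison is only as stable as the serialized form, so this is an assumption-laden illustration of the alternative mentioned in the issue, not a statement of how the fix was implemented.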





[jira] [Work logged] (BEAM-8844) [SQL] Create performance tests for BigQueryTable

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8844?focusedWorklogId=367903=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367903
 ]

ASF GitHub Bot logged work on BEAM-8844:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:06
Start Date: 08/Jan/20 01:06
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #10226: [BEAM-8844] Add a 
new Jenkins job for SQL perf tests
URL: https://github.com/apache/beam/pull/10226#issuecomment-571844874
 
 
   Run JavaBeamZetaSQL PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367903)
Time Spent: 9.5h  (was: 9h 20m)

> [SQL] Create performance tests for BigQueryTable
> 
>
> Key: BEAM-8844
> URL: https://issues.apache.org/jira/browse/BEAM-8844
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> They should measure read-time for:
>  * DIRECT_READ w/o push-down
>  * DIRECT_READ w/ push-down
>  * DEFAULT





[jira] [Work logged] (BEAM-8844) [SQL] Create performance tests for BigQueryTable

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8844?focusedWorklogId=367902=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367902
 ]

ASF GitHub Bot logged work on BEAM-8844:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:06
Start Date: 08/Jan/20 01:06
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #10226: [BEAM-8844] Add a 
new Jenkins job for SQL perf tests
URL: https://github.com/apache/beam/pull/10226#issuecomment-571844810
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367902)
Time Spent: 9h 20m  (was: 9h 10m)

> [SQL] Create performance tests for BigQueryTable
> 
>
> Key: BEAM-8844
> URL: https://issues.apache.org/jira/browse/BEAM-8844
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> They should measure read-time for:
>  * DIRECT_READ w/o push-down
>  * DIRECT_READ w/ push-down
>  * DEFAULT





[jira] [Work logged] (BEAM-8844) [SQL] Create performance tests for BigQueryTable

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8844?focusedWorklogId=367901=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367901
 ]

ASF GitHub Bot logged work on BEAM-8844:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:05
Start Date: 08/Jan/20 01:05
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #10226: [BEAM-8844] Add a 
new Jenkins job for SQL perf tests
URL: https://github.com/apache/beam/pull/10226#issuecomment-571844556
 
 
   run precommits
 



Issue Time Tracking
---

Worklog Id: (was: 367901)
Time Spent: 9h 10m  (was: 9h)

> [SQL] Create performance tests for BigQueryTable
> 
>
> Key: BEAM-8844
> URL: https://issues.apache.org/jira/browse/BEAM-8844
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> They should measure read-time for:
>  * DIRECT_READ w/o push-down
>  * DIRECT_READ w/ push-down
>  * DEFAULT





[jira] [Work logged] (BEAM-8844) [SQL] Create performance tests for BigQueryTable

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8844?focusedWorklogId=367899=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367899
 ]

ASF GitHub Bot logged work on BEAM-8844:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:05
Start Date: 08/Jan/20 01:05
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #10226: [BEAM-8844] Add a 
new Jenkins job for SQL perf tests
URL: https://github.com/apache/beam/pull/10226#issuecomment-571844653
 
 
   please test this
 



Issue Time Tracking
---

Worklog Id: (was: 367899)
Time Spent: 8h 50m  (was: 8h 40m)

> [SQL] Create performance tests for BigQueryTable
> 
>
> Key: BEAM-8844
> URL: https://issues.apache.org/jira/browse/BEAM-8844
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> They should measure read-time for:
>  * DIRECT_READ w/o push-down
>  * DIRECT_READ w/ push-down
>  * DEFAULT





[jira] [Work logged] (BEAM-8844) [SQL] Create performance tests for BigQueryTable

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8844?focusedWorklogId=367900=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367900
 ]

ASF GitHub Bot logged work on BEAM-8844:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:05
Start Date: 08/Jan/20 01:05
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #10226: [BEAM-8844] Add a 
new Jenkins job for SQL perf tests
URL: https://github.com/apache/beam/pull/10226#issuecomment-571844653
 
 
   please test this
 



Issue Time Tracking
---

Worklog Id: (was: 367900)
Time Spent: 9h  (was: 8h 50m)

> [SQL] Create performance tests for BigQueryTable
> 
>
> Key: BEAM-8844
> URL: https://issues.apache.org/jira/browse/BEAM-8844
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> They should measure read-time for:
>  * DIRECT_READ w/o push-down
>  * DIRECT_READ w/ push-down
>  * DEFAULT





[jira] [Work logged] (BEAM-8844) [SQL] Create performance tests for BigQueryTable

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8844?focusedWorklogId=367898=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367898
 ]

ASF GitHub Bot logged work on BEAM-8844:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:05
Start Date: 08/Jan/20 01:05
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #10226: [BEAM-8844] Add a 
new Jenkins job for SQL perf tests
URL: https://github.com/apache/beam/pull/10226#issuecomment-571844556
 
 
   run precommits
 



Issue Time Tracking
---

Worklog Id: (was: 367898)
Time Spent: 8h 40m  (was: 8.5h)

> [SQL] Create performance tests for BigQueryTable
> 
>
> Key: BEAM-8844
> URL: https://issues.apache.org/jira/browse/BEAM-8844
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> They should measure read-time for:
>  * DIRECT_READ w/o push-down
>  * DIRECT_READ w/ push-down
>  * DEFAULT





[jira] [Work logged] (BEAM-8844) [SQL] Create performance tests for BigQueryTable

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8844?focusedWorklogId=367897=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367897
 ]

ASF GitHub Bot logged work on BEAM-8844:


Author: ASF GitHub Bot
Created on: 08/Jan/20 01:04
Start Date: 08/Jan/20 01:04
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #10226: [BEAM-8844] Add a 
new Jenkins job for SQL perf tests
URL: https://github.com/apache/beam/pull/10226#issuecomment-571844420
 
 
   All the tests 404
 



Issue Time Tracking
---

Worklog Id: (was: 367897)
Time Spent: 8.5h  (was: 8h 20m)

> [SQL] Create performance tests for BigQueryTable
> 
>
> Key: BEAM-8844
> URL: https://issues.apache.org/jira/browse/BEAM-8844
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> They should measure read-time for:
>  * DIRECT_READ w/o push-down
>  * DIRECT_READ w/ push-down
>  * DEFAULT





[jira] [Commented] (BEAM-8413) test_streaming_pipeline_returns_expected_user_metrics_fnapi_it failed on latest PostCommit Py36

2020-01-07 Thread Ankur Goenka (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010224#comment-17010224
 ] 

Ankur Goenka commented on BEAM-8413:


The latest runs are passing 
[https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/5455/testReport/apache_beam.runners.dataflow.dataflow_exercise_streaming_metrics_pipeline_test/ExerciseStreamingMetricsPipelineTest/]
 
 

and older runs were cleaned.

> test_streaming_pipeline_returns_expected_user_metrics_fnapi_it  failed on 
> latest PostCommit Py36 
> -
>
> Key: BEAM-8413
> URL: https://issues.apache.org/jira/browse/BEAM-8413
> Project: Beam
>  Issue Type: New Feature
>  Components: test-failures
>Reporter: Boyuan Zhang
>Assignee: Ankur Goenka
>Priority: Major
>
> https://builds.apache.org/job/beam_PostCommit_Python36/731/





[jira] [Work logged] (BEAM-8458) BigQueryIO.Read needs permissions to create datasets to be able to run queries

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8458?focusedWorklogId=367895=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367895
 ]

ASF GitHub Bot logged work on BEAM-8458:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:57
Start Date: 08/Jan/20 00:57
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #9852: [BEAM-8458] Add 
option to set temp dataset in BigQueryIO.Read
URL: https://github.com/apache/beam/pull/9852#issuecomment-571842474
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 367895)
Time Spent: 0.5h  (was: 20m)

> BigQueryIO.Read needs permissions to create datasets to be able to run queries
> --
>
> Key: BEAM-8458
> URL: https://issues.apache.org/jira/browse/BEAM-8458
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Israel Herraiz
>Assignee: Israel Herraiz
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When using {{fromQuery}}, BigQueryIO creates a temp dataset to store the 
> results of the query.
> Therefore, Beam requires permissions to create datasets just to be able to 
> run a query. In practice, this means that Beam requires the role 
> bigQuery.User just to run queries, whereas if you use {{from}} (to read from 
> a table), the role bigQuery.jobUser suffices.
> BigQueryIO.Read should have an option to set an existing dataset in which to 
> write the temp results of a query, so that the role bigQuery.jobUser would 
> be enough.





[jira] [Updated] (BEAM-8945) DirectStreamObserver race condition

2020-01-07 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka updated BEAM-8945:
---
Description: 
The DirectStreamObserver can get into a deadlock if the channel becomes 
unhealthy or is not ready. An extended period of unhealthiness should result 
in a failure.

This is supported by the following thread dumps, where we see that 1 thread is 
holding the lock on the actual stream observer while the remaining worker 
threads are waiting on the lock on the stream observer.
 The thread which is holding the lock on the stream observer is probably in the 
while loop because the outboundObserver is not ready.
 There is also 1 thread which is waiting to execute onError, which means that 
the stream observer has become unhealthy and is probably never going to become 
ready.

Hundreds of threads are blocked on:

 
 
 org.apache.beam.sdk.fn.stream.SynchronizedStreamObserver.onNext(SynchronizedStreamObserver.java:46)
 org.apache.beam.runners.fnexecution.control.FnApiControlClient.handle(FnApiControlClient.java:84)
 org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.getProcessBundleProgress(RegisterAndProcessBundleOperation.java:393)
 org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor$SingularProcessBundleProgressTracker.updateProgress(BeamFnMapTaskExecutor.java:347)
 org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor$SingularProcessBundleProgressTracker.periodicProgressUpdate(BeamFnMapTaskExecutor.java:334)
 org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor$SingularProcessBundleProgressTracker$$Lambda$107/1297335196.run(Unknown Source)
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 java.lang.Thread.run(Thread.java:745)

 

 

One thread holding the lock:

State: TIMED_WAITING stack: —
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
 java.util.concurrent.Phaser$QNode.block(Phaser.java:1142)
 java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
 java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
 java.util.concurrent.Phaser.awaitAdvanceInterruptibly(Phaser.java:796)
 org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(DirectStreamObserver.java:70)
 org.apache.beam.sdk.fn.stream.SynchronizedStreamObserver.onNext(SynchronizedStreamObserver.java:46)
 org.apache.beam.runners.fnexecution.control.FnApiControlClient.handle(FnApiControlClient.java:84)
 org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.getProcessBundleProgress(RegisterAndProcessBundleOperation.java:393)
 org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor$SingularProcessBundleProgressTracker.updateProgress(BeamFnMapTaskExecutor.java:347)
 org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor$SingularProcessBundleProgressTracker.periodicProgressUpdate(BeamFnMapTaskExecutor.java:334)
 org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor$SingularProcessBundleProgressTracker$$Lambda$107/1297335196.run(Unknown Source)
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 java.lang.Thread.run(Thread.java:745)

 

 

One thread waiting to execute onError

State: BLOCKED stack: —
 
 org.apache.beam.sdk.fn.stream.SynchronizedStreamObserver.onError(SynchronizedStreamObserver.java:53)
 org.apache.beam.runners.fnexecution.control.FnApiControlClient.closeAndTerminateOutstandingRequests(FnApiControlClient.java:117)
 org.apache.beam.runners.fnexecution.control.FnApiControlClient.access$300(FnApiControlClient.java:49)
 org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onError(FnApiControlClient.java:174)
 org.apache.beam.vendor.grpc.v1p21p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onCancel(ServerCalls.java:270)
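
To illustrate the suggestion in the description that an extended period of unhealthiness should result in a failure, here is a minimal sketch of a bounded wait loop. It is not the Beam implementation: BoundedReadyWait, awaitReady, pollMillis and maxWaitMillis are hypothetical names chosen for this example, and the only detail taken from the thread dump is that the waiting thread parks on a java.util.concurrent.Phaser.

```java
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not Beam code: waits for readiness but gives up after a deadline. */
final class BoundedReadyWait {
  private BoundedReadyWait() {}

  /**
   * Polls isReady, parking on the phaser between checks. Throws IllegalStateException
   * once maxWaitMillis has elapsed, so a caller holding a lock cannot wait forever.
   */
  static void awaitReady(
      BooleanSupplier isReady, Phaser phaser, long pollMillis, long maxWaitMillis) {
    final long start = System.nanoTime();
    final long maxWaitNanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
    int phase = phaser.getPhase();
    while (!isReady.getAsBoolean()) {
      if (System.nanoTime() - start >= maxWaitNanos) {
        throw new IllegalStateException(
            "Output channel not ready after " + maxWaitMillis + " ms");
      }
      try {
        // Wakes up early if another thread advances the phaser (a readiness signal).
        phase = phaser.awaitAdvanceInterruptibly(phase, pollMillis, TimeUnit.MILLISECONDS);
      } catch (TimeoutException e) {
        // No signal within this poll interval; loop and re-check readiness and deadline.
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for the output channel", e);
      }
    }
  }
}
```

With a wait of this shape, the thread holding the lock would eventually throw and release it, which would let the blocked onError call proceed. The right timeout value is a judgment call and is not specified in this issue.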
 

[jira] [Updated] (BEAM-6499) Support HDFS for artifact staging with Flink Runner

2020-01-07 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka updated BEAM-6499:
---
Issue Type: New Feature  (was: Bug)

> Support HDFS for artifact staging with Flink Runner
> ---
>
> Key: BEAM-6499
> URL: https://issues.apache.org/jira/browse/BEAM-6499
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>
> Based on the PR [https://github.com/apache/beam/pull/5806/]
> We should enable HDFS for artifact staging.





[jira] [Commented] (BEAM-6499) Support HDFS for artifact staging with Flink Runner

2020-01-07 Thread Ankur Goenka (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010221#comment-17010221
 ] 

Ankur Goenka commented on BEAM-6499:


We have introduced uber jar mode of pipeline submission which eliminates the 
need for having a separate dfs to distribute files.

 

Changing the bug to feature request.

> Support HDFS for artifact staging with Flink Runner
> ---
>
> Key: BEAM-6499
> URL: https://issues.apache.org/jira/browse/BEAM-6499
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>
> Based on the PR [https://github.com/apache/beam/pull/5806/]
> We should enable HDFS for artifact staging.





[jira] [Resolved] (BEAM-5156) Apache Beam on dataflow runner can't find Tensorflow for workers

2020-01-07 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-5156.

Fix Version/s: (was: 2.6.0)
   (was: 2.5.0)
   2.17.0
   Resolution: Fixed

> Apache Beam on dataflow runner can't find Tensorflow for workers
> 
>
> Key: BEAM-5156
> URL: https://issues.apache.org/jira/browse/BEAM-5156
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
> Environment: google cloud compute instance running linux
>Reporter: Thomas Johns
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: triaged
> Fix For: 2.17.0
>
>
> Adding a serialized tensorflow model to an Apache Beam pipeline with the 
> Python SDK, but it cannot find any version of tensorflow when run on the 
> Dataflow runner, although it is not a problem locally. Tried various versions 
> of tensorflow from 1.6 to 1.10. I thought it might be a conflicting package 
> somewhere, so I removed all other packages and tried to install just 
> tensorflow, with the same problem.
> Could not find a version that satisfies the requirement tensorflow==1.6.0 
> (from -r reqtest.txt (line 59)) (from versions: )
> No matching distribution found for tensorflow==1.6.0 (from -r reqtest.txt 
> (line 59))





[jira] [Commented] (BEAM-5156) Apache Beam on dataflow runner can't find Tensorflow for workers

2020-01-07 Thread Ankur Goenka (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010220#comment-17010220
 ] 

Ankur Goenka commented on BEAM-5156:


We have moved to an unbounded thread factory and this issue does not require 
any additional arguments to be passed.

> Apache Beam on dataflow runner can't find Tensorflow for workers
> 
>
> Key: BEAM-5156
> URL: https://issues.apache.org/jira/browse/BEAM-5156
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
> Environment: google cloud compute instance running linux
>Reporter: Thomas Johns
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: triaged
> Fix For: 2.5.0, 2.6.0
>
>
> Adding a serialized tensorflow model to an Apache Beam pipeline with the 
> Python SDK, but it cannot find any version of tensorflow when run on the 
> Dataflow runner, although it is not a problem locally. Tried various versions 
> of tensorflow from 1.6 to 1.10. I thought it might be a conflicting package 
> somewhere, so I removed all other packages and tried to install just 
> tensorflow, with the same problem.
> Could not find a version that satisfies the requirement tensorflow==1.6.0 
> (from -r reqtest.txt (line 59)) (from versions: )
> No matching distribution found for tensorflow==1.6.0 (from -r reqtest.txt 
> (line 59))





[jira] [Resolved] (BEAM-7243) Release python{36,37}-fnapi containers images required by Dataflow runner.

2020-01-07 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-7243.

Fix Version/s: 2.16.0
   Resolution: Fixed

> Release python{36,37}-fnapi containers images required by Dataflow runner. 
> ---
>
> Key: BEAM-7243
> URL: https://issues.apache.org/jira/browse/BEAM-7243
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Valentyn Tymofieiev
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> We have not yet released all container images for currently used dev fnapi 
> containers [1]. 
> For example:
> :~$ docker pull 
> gcr.io/cloud-dataflow/v1beta3/python36-fnapi:beam-master-20190213
> Error response from daemon: manifest for 
> gcr.io/cloud-dataflow/v1beta3/python36-fnapi:beam-master-20190213 not found
> This causes failures in Python streaming postcommit tests on Dataflow runner 
> for Python 3.6 and higher versions. 
> We need to release the containers and update names.py.
> [~angoenka], I think you were planning an update of dev containers that will 
> also take care of this. If you don't plan to do that or need help, please 
> reassign the issue back to me. Thanks!
> [1] 
> https://github.com/apache/beam/blob/79a463784fce36c12292b4e642238ef124c184e0/sdks/python/apache_beam/runners/dataflow/internal/names.py#L44





[jira] [Commented] (BEAM-7243) Release python{36,37}-fnapi containers images required by Dataflow runner.

2020-01-07 Thread Ankur Goenka (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010217#comment-17010217
 ] 

Ankur Goenka commented on BEAM-7243:


The PR was committed and the latest container is available for pull 

docker pull gcr.io/cloud-dataflow/v1beta3/python36-fnapi:beam-master-20191220

> Release python{36,37}-fnapi containers images required by Dataflow runner. 
> ---
>
> Key: BEAM-7243
> URL: https://issues.apache.org/jira/browse/BEAM-7243
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Valentyn Tymofieiev
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> We have not yet released all container images for currently used dev fnapi 
> containers [1]. 
> For example:
> :~$ docker pull 
> gcr.io/cloud-dataflow/v1beta3/python36-fnapi:beam-master-20190213
> Error response from daemon: manifest for 
> gcr.io/cloud-dataflow/v1beta3/python36-fnapi:beam-master-20190213 not found
> This causes failures in Python streaming postcommit tests on Dataflow runner 
> for Python 3.6 and higher versions. 
> We need to release the containers and update names.py.
> [~angoenka], I think you were planning an update of dev containers that will 
> also take care of this. If you don't plan to do that or need help, please 
> reassign the issue back to me. Thanks!
> [1] 
> https://github.com/apache/beam/blob/79a463784fce36c12292b4e642238ef124c184e0/sdks/python/apache_beam/runners/dataflow/internal/names.py#L44





[jira] [Work logged] (BEAM-8932) Expose complete Cloud Pub/Sub messages through PubsubIO API

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8932?focusedWorklogId=367889=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367889
 ]

ASF GitHub Bot logged work on BEAM-8932:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:38
Start Date: 08/Jan/20 00:38
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10474: [BEAM-8932] 
[BEAM-9036] Revert reverted commit to use PubsubMessage as the canonical type 
in beam client
URL: https://github.com/apache/beam/pull/10474#issuecomment-571838267
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367889)
Time Spent: 8.5h  (was: 8h 20m)

> Expose complete Cloud Pub/Sub messages through PubsubIO API
> ---
>
> Key: BEAM-8932
> URL: https://issues.apache.org/jira/browse/BEAM-8932
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Daniel Collins
>Assignee: Daniel Collins
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> The PubsubIO API only exposes a subset of the fields in the underlying 
> PubsubMessage protocol buffer. To accommodate future feature changes as well 
> as for greater compatibility with code using the Cloud Pub/Sub APIs, a method 
> to read and write these protocol messages should be exposed.





[jira] [Work logged] (BEAM-6857) Support dynamic timers

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6857?focusedWorklogId=367887=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367887
 ]

ASF GitHub Bot logged work on BEAM-6857:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:36
Start Date: 08/Jan/20 00:36
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #10506: [BEAM-6857] 
OnTimer/SetTimer Signature Updates
URL: https://github.com/apache/beam/pull/10506#issuecomment-571837773
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367887)
Time Spent: 9h 40m  (was: 9.5h)

> Support dynamic timers
> --
>
> Key: BEAM-6857
> URL: https://issues.apache.org/jira/browse/BEAM-6857
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> The Beam timers API currently requires each timer to be statically specified 
> in the DoFn. The user must provide a separate callback method per timer. For 
> example:
>  
> {code:java}
> DoFn() {
>   @TimerId("timer1")
>   private final TimerSpec timer1 = TimerSpecs.timer(...);
>   @TimerId("timer2")
>   private final TimerSpec timer2 = TimerSpecs.timer(...);
>   // ... set timers in processElement ...
>   @OnTimer("timer1")
>   public void onTimer1() { ... }
>   @OnTimer("timer2")
>   public void onTimer2() { ... }
> }
> {code}
>  
> However there are many cases where the user does not know the set of timers 
> statically when writing their code. This happens when the timer tag should be 
> based on the data. It also happens when writing a DSL on top of Beam, where 
> the DSL author has to create DoFns but does not know statically which timers 
> their users will want to set (e.g. Scio).
>  
> The goal is to support dynamic timers. Something like the following:
>  
> {code:java}
> DoFn() {
>   @TimerId("timer")
>   private final TimerSpec timer1 = TimerSpecs.dynamicTimer(...);
>   @ProcessElement
>   public void process(@TimerId("timer") DynamicTimer timer) {
>     timer.set("tag1", ts);
>     timer.set("tag2", ts);
>   }
>   @OnTimer("timer")
>   public void onTimer1(@TimerTag String tag) { ... }
> }
> {code}
>  
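
The idea behind the proposal (one timer declaration, many runtime tags, one callback that receives the tag) can be sketched without Beam. The class below is purely illustrative: TaggedTimers, set and fireDue are hypothetical names and are not part of the Beam API or of the proposed DynamicTimer type.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

/** Hypothetical stand-in for a "dynamic timer": timers are keyed by a tag chosen at runtime. */
final class TaggedTimers {
  // Tag -> firing timestamp in milliseconds. Setting a tag again overwrites its timestamp.
  private final Map<String, Long> fireAtByTag = new TreeMap<>();

  void set(String tag, long fireAtMillis) {
    fireAtByTag.put(tag, fireAtMillis);
  }

  /**
   * Fires every timer due at or before nowMillis through a single shared callback,
   * passing the tag so the callback can tell the timers apart. Returns the fired tags.
   */
  List<String> fireDue(long nowMillis, BiConsumer<String, Long> onTimer) {
    List<String> fired = new ArrayList<>();
    Iterator<Map.Entry<String, Long>> it = fireAtByTag.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Long> entry = it.next();
      // Copy key and value before removing; the entry must not be read after removal.
      String tag = entry.getKey();
      long fireAt = entry.getValue();
      if (fireAt <= nowMillis) {
        it.remove();
        fired.add(tag);
        onTimer.accept(tag, fireAt);
      }
    }
    return fired;
  }
}
```

The point of the sketch is that the set of tags is data, not code: a DSL layer can call set with whatever tags its users choose and still needs only one callback.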





[jira] [Updated] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7427:
---
Summary: JmsCheckpointMark Avro Serialization issue with UnboundedSource  
(was: JmsCheckpointMark Avro Serialisation issue with UnboundedSource)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Mourad
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  
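
For readers unfamiliar with the name in the exception: this$0 is the field name javac typically gives to the hidden reference a non-static inner class keeps to its enclosing instance, and the Avro specification only allows names matching [A-Za-z_][A-Za-z0-9_]*. The sketch below uses plain reflection (no Avro dependency) to show how such a field appears. It is an assumption that an inner class reachable from JmsCheckpointMark is the source of this$0 here; the issue does not say so. SyntheticFieldDemo and its members are hypothetical names.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class SyntheticFieldDemo {
  // Name rule from the Avro specification: a letter or underscore, then letters, digits, underscores.
  private static final Pattern AVRO_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  private final String owner;

  SyntheticFieldDemo(String owner) {
    this.owner = owner;
  }

  /** Non-static inner class: it reads the outer instance, so it keeps a hidden reference to it. */
  final class Inner {
    String describe() {
      return owner;
    }
  }

  /** Static nested class: no hidden reference to an enclosing instance. */
  static final class Nested {
    String value = "nested";
  }

  /** Lists declared field names (including compiler-generated ones) that break the Avro name rule. */
  static List<String> invalidAvroFieldNames(Class<?> type) {
    List<String> invalid = new ArrayList<>();
    for (Field field : type.getDeclaredFields()) {
      if (!AVRO_NAME.matcher(field.getName()).matches()) {
        invalid.add(field.getName());
      }
    }
    return invalid;
  }
}
```

Calling invalidAvroFieldNames on Inner.class reports the compiler-generated field, while Nested.class reports nothing, which is why making a class static (or giving it an explicit coder instead of relying on reflection) is a common way around this kind of error.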





[jira] [Work logged] (BEAM-8717) Beam Dependency Update Request: org.apache.commons:commons-lang3

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8717?focusedWorklogId=367885=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367885
 ]

ASF GitHub Bot logged work on BEAM-8717:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:30
Start Date: 08/Jan/20 00:30
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10524: [BEAM-8717] Update 
commons-lang3 to version 3.9
URL: https://github.com/apache/beam/pull/10524#issuecomment-571836403
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367885)
Time Spent: 0.5h  (was: 20m)

> Beam Dependency Update Request: org.apache.commons:commons-lang3
> 
>
> Key: BEAM-8717
> URL: https://issues.apache.org/jira/browse/BEAM-8717
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:43:43.060362 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:11:02.203215 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:32.152530 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:47.060229 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:09.857528 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:21.614448 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:59.144846 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-8717) Beam Dependency Update Request: org.apache.commons:commons-lang3

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8717?focusedWorklogId=367882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367882
 ]

ASF GitHub Bot logged work on BEAM-8717:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:29
Start Date: 08/Jan/20 00:29
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10524: [BEAM-8717] Update 
commons-lang3 to version 3.9
URL: https://github.com/apache/beam/pull/10524#issuecomment-571836403
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367882)
Time Spent: 20m  (was: 10m)

> Beam Dependency Update Request: org.apache.commons:commons-lang3
> 
>
> Key: BEAM-8717
> URL: https://issues.apache.org/jira/browse/BEAM-8717
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:43:43.060362 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:11:02.203215 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:32.152530 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:47.060229 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:09.857528 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:21.614448 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:59.144846 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Updated] (BEAM-9013) Multi-output TestStream breaks the DataflowRunner

2020-01-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9013:
---
Fix Version/s: (was: 2.18.0)
   2.19.0

> Multi-output TestStream breaks the DataflowRunner
> -
>
> Key: BEAM-9013
> URL: https://issues.apache.org/jira/browse/BEAM-9013
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.17.0
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-9042) AvroUtils.schemaCoder(schema) produces a not serializable SchemaCoder

2020-01-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9042:
---
Description: 
After some recent change in the implementation of AvroUtils.schemaCoder(schema) 
the produced SchemaCoder is not serializable.
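
As background on what "not serializable" means in this context: judging by the stack trace quoted in this issue, the check goes through SerializableUtils.serializeToByteArray, and the sketch below assumes this amounts to a Java serialization attempt. The example shows one common way such a check fails, namely a Serializable class holding a field whose type is not Serializable. Whether that is the cause in SchemaCoder is not established here; SerializableCheck, Delegate, BrokenHolder and FixedHolder are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerializableCheck {
  private SerializableCheck() {}

  /** Plain object that does not implement Serializable. */
  static final class Delegate {}

  /** Serializable wrapper with a non-serializable field, so writing it fails. */
  static final class BrokenHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    final Delegate delegate = new Delegate();
  }

  /** Same wrapper, but the field is marked transient and is skipped during serialization. */
  static final class FixedHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    final transient Delegate delegate = new Delegate();
  }

  /** Returns true if the object graph can be written with Java serialization. */
  static boolean isJavaSerializable(Object value) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(value);
      return true;
    } catch (IOException e) {
      // NotSerializableException is a subclass of IOException.
      return false;
    }
  }
}
```

A transient field comes back as null after deserialization, so a real fix usually also needs a way to rebuild the field lazily.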

You can reproduce this by doing this:
{code:java}
final SchemaCoder avroSchemaCoder = 
AvroUtils.schemaCoder(schema);
CoderProperties.coderSerializable(avroSchemaCoder);{code}
it produces this exception
{code:java}
unable to serialize SchemaCoder
{code}

> AvroUtils.schemaCoder(schema) produces a not serializable SchemaCoder
> -
>
> Key: BEAM-9042
> URL: https://issues.apache.org/jira/browse/BEAM-9042
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.18.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.18.0
>
>
> After some recent change in the implementation of 
> AvroUtils.schemaCoder(schema) the produced SchemaCoder is not serializable.
> You can reproduce this by doing this:
> {code:java}
> final SchemaCoder avroSchemaCoder = 
> AvroUtils.schemaCoder(schema);
> CoderProperties.coderSerializable(avroSchemaCoder);{code}
> it produces this exception
> {code:java}
> unable to serialize SchemaCoder Field{name=bool, description=, type=FieldType{typeName=BOOLEAN, 
> nullable=false, logicalType=null, collectionElementType=null, 
> mapKeyType=null, mapValueType=null, rowSchema=null, metadata={}}}
> Field{name=int, description=, type=FieldType{typeName=INT32, nullable=false, 
> logicalType=null, collectionElementType=null, mapKeyType=null, 
> mapValueType=null, rowSchema=null, metadata={}}}
>  UUID: 6a1ff5b7-e3be-42c3-9b36-f8b53d487fcd delegateCoder: 
> org.apache.beam.sdk.coders.Coder$ByteBuddy$LzAYzILR@5f8ca63c
> java.lang.IllegalArgumentException: unable to serialize SchemaCoder Fields:
> Field{name=bool, description=, type=FieldType{typeName=BOOLEAN, 
> nullable=false, logicalType=null, collectionElementType=null, 
> mapKeyType=null, mapValueType=null, rowSchema=null, metadata={}}}
> Field{name=int, description=, type=FieldType{typeName=INT32, nullable=false, 
> logicalType=null, collectionElementType=null, mapKeyType=null, 
> mapValueType=null, rowSchema=null, metadata={}}}
>  UUID: 6a1ff5b7-e3be-42c3-9b36-f8b53d487fcd delegateCoder: 
> org.apache.beam.sdk.coders.Coder$ByteBuddy$LzAYzILR@5f8ca63c
>  at 
> org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:55)
>  at 
> org.apache.beam.sdk.util.SerializableUtils.clone(SerializableUtils.java:113)
>  at 
> org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:92)
>  at 
> org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:131)
>  at 
> org.apache.beam.sdk.testing.CoderProperties.coderSerializable(CoderProperties.java:181)
>  at 
> org.apache.beam.sdk.schemas.utils.AvroUtilsTest.testAvroSchemaCoders(AvroUtilsTest.java:543)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>  at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
>  at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
>  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>  at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
>  at 
> 
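
The trace above fails inside SerializableUtils.serializeToByteArray. As a minimal sketch of that kind of check, assuming it comes down to plain Java serialization of the coder's object graph, the snippet below uses only the JDK. SerializableCheck, Holder and isSerializable are hypothetical names for illustration, not Beam APIs; Holder stands in for a coder that keeps a runtime-generated delegate.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializableCheck {
  /** Hypothetical stand-in for a coder that holds a delegate object. */
  static class Holder implements Serializable {
    private static final long serialVersionUID = 1L;
    final Object delegate;

    Holder(Object delegate) {
      this.delegate = delegate;
    }
  }

  /** Returns true if Java serialization can write the whole object graph. */
  static boolean isSerializable(Object value) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(value);
      return true;
    } catch (NotSerializableException e) {
      // Some object reachable from value does not implement Serializable.
      return false;
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // A String delegate is serializable, a bare Object is not.
    System.out.println(isSerializable(new Holder("a string delegate")));
    System.out.println(isSerializable(new Holder(new Object())));
  }
}
```

The outer class being Serializable is not enough: one non-serializable field anywhere in the graph makes the whole write fail, which matches the shape of the failure reported for the delegateCoder.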

[jira] [Work logged] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9041?focusedWorklogId=367881=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367881
 ]

ASF GitHub Bot logged work on BEAM-9041:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:26
Start Date: 08/Jan/20 00:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10492: [BEAM-9041, 
BEAM-9042] SchemaCoder equals should not rely on from/toRowFunction equality
URL: https://github.com/apache/beam/pull/10492#issuecomment-571835615
 
 
   Sorry if I rushed the merge a bit, @TheNeuralBit. I just wanted to cherry-pick 
it to unblock the 2.18.0 release.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 367881)
Time Spent: 3h 10m  (was: 3h)

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> SchemaCoder's equals implementation relies on the SerializableFunction equals 
> method. This is error-prone because users rarely implement the equals method 
> for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.
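
To illustrate the description above, here is a minimal sketch using only the JDK. SerFn, AddOffset, toBytes and sameBytes are hypothetical names, not Beam's SerializableFunction or SchemaCoder code. It shows a function class without an equals override, where two identically configured instances are not equal, and a comparison of their serialized bytes as the alternative the issue suggests.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

class BytesEquality {
  /** Hypothetical stand-in for a serializable function type. */
  interface SerFn<A, B> extends Serializable {
    B apply(A input);
  }

  /** A user function that, as is common, does not override equals or hashCode. */
  static class AddOffset implements SerFn<Integer, Integer> {
    private static final long serialVersionUID = 1L;
    private final int offset;

    AddOffset(int offset) {
      this.offset = offset;
    }

    @Override
    public Integer apply(Integer input) {
      return input + offset;
    }
  }

  /** Serializes the value with standard Java serialization. */
  static byte[] toBytes(Serializable value) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  /** Compares two functions by their serialized form instead of equals. */
  static boolean sameBytes(Serializable a, Serializable b) {
    return Arrays.equals(toBytes(a), toBytes(b));
  }

  public static void main(String[] args) {
    AddOffset first = new AddOffset(1);
    AddOffset second = new AddOffset(1);
    System.out.println(first.equals(second));                 // identity comparison
    System.out.println(sameBytes(first, second));             // same class, same state
    System.out.println(sameBytes(first, new AddOffset(2)));   // different state
  }
}
```

One trade-off of this approach: bytes equality treats two functions as different whenever their classes or captured state differ, even if they behave identically, so it is stricter than a hand-written equals would be.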





[jira] [Commented] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010205#comment-17010205
 ] 

Ismaël Mejía commented on BEAM-9041:


Yes, this is a regression. I just opened a PR to cherry-pick the fix now that it 
is merged into master. PTAL [~udim] 
https://github.com/apache/beam/pull/10526

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> SchemaCoder's equals implementation relies on the SerializableFunction equals 
> method. This is error-prone because users rarely implement the equals method 
> for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.





[jira] [Work logged] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9041?focusedWorklogId=367880=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367880
 ]

ASF GitHub Bot logged work on BEAM-9041:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:23
Start Date: 08/Jan/20 00:23
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10526: 
[release-2.18.0][BEAM-9041, BEAM-9042] SchemaCoder equals should not rely on 
from/toRowFunction equality
URL: https://github.com/apache/beam/pull/10526
 
 
   Cherry picks the fixes for both pending issues.
   
   R: @udim 
   CC: @TheNeuralBit 
 



Issue Time Tracking
---

Worklog Id: (was: 367880)
Time Spent: 3h  (was: 2h 50m)

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> SchemaCoder's equals implementation relies on the SerializableFunction equals 
> method. This is error-prone because users rarely implement the equals method 
> for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.





[jira] [Work logged] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9041?focusedWorklogId=367879=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367879
 ]

ASF GitHub Bot logged work on BEAM-9041:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:17
Start Date: 08/Jan/20 00:17
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10492: [BEAM-9041, 
BEAM-9042] SchemaCoder equals should not rely on from/toRowFunction equality
URL: https://github.com/apache/beam/pull/10492#issuecomment-571833418
 
 
   Merging now that the tests are green. 
 



Issue Time Tracking
---

Worklog Id: (was: 367879)
Time Spent: 2h 50m  (was: 2h 40m)

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> SchemaCoder's equals implementation relies on the SerializableFunction equals 
> method. This is error-prone because users rarely implement the equals method 
> for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.





[jira] [Work logged] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9041?focusedWorklogId=367878=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367878
 ]

ASF GitHub Bot logged work on BEAM-9041:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:17
Start Date: 08/Jan/20 00:17
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10492: [BEAM-9041, 
BEAM-9042] SchemaCoder equals should not rely on from/toRowFunction equality
URL: https://github.com/apache/beam/pull/10492
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 367878)
Time Spent: 2h 40m  (was: 2.5h)

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> SchemaCoder's equals implementation relies on the SerializableFunction equals 
> method. This is error-prone because users rarely implement the equals method 
> for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.





[jira] [Commented] (BEAM-9046) Kafka connector for Python throws ClassCastException when reading KafkaRecord

2020-01-07 Thread Ankur Goenka (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010192#comment-17010192
 ] 

Ankur Goenka commented on BEAM-9046:


cc: [~mxm] [~chamikara]

> Kafka connector for Python throws ClassCastException when reading KafkaRecord
> -
>
> Key: BEAM-9046
> URL: https://issues.apache.org/jira/browse/BEAM-9046
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-kafka
>Affects Versions: 2.16.0
>Reporter: Berkay Öztürk
>Priority: Major
>  Labels: KafkaIO, Python
>
>  I'm trying to read the data streaming from Apache Kafka using the Python SDK 
> for Apache Beam with the Flink runner. After running Kafka 2.4.0 and Flink 
> 1.8.3, I follow these steps:
>  * Compile and run Beam 2.16 with Flink 1.8 runner.
> {code:java}
> git clone --single-branch --branch release-2.16.0 https://github.com/apache/beam.git beam-2.16.0
> cd beam-2.16.0
> nohup ./gradlew :runners:flink:1.8:job-server:runShadow -PflinkMasterUrl=localhost:8081 &
> {code}
>  * Run the Python pipeline.
> {code:python}
> from apache_beam import Pipeline
> from apache_beam.io.external.kafka import ReadFromKafka
> from apache_beam.options.pipeline_options import PipelineOptions
> if __name__ == '__main__':
>     with Pipeline(options=PipelineOptions([
>         '--runner=FlinkRunner',
>         '--flink_version=1.8',
>         '--flink_master_url=localhost:8081',
>         '--environment_type=LOOPBACK',
>         '--streaming'
>     ])) as pipeline:
>         (
>             pipeline
>             | 'read' >> ReadFromKafka({'bootstrap.servers': 'localhost:9092'}, ['test'])  # [BEAM-3788] ???
>         )
>         result = pipeline.run()
>         result.wait_until_finish()
> {code}
>  * Publish some data to Kafka.
> {code:java}
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
> >{"hello":"world!"}
> {code}
> The Python script throws this error:
> {code:java}
> [flink-runner-job-invoker] ERROR 
> org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation - Error 
> during job invocation BeamApp-USER-somejob. 
> org.apache.flink.client.program.ProgramInvocationException: Job failed. 
> (JobID: xxx)
> at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
> at 
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483)
> at 
> org.apache.beam.runners.flink.FlinkExecutionEnvironments$BeamFlinkRemoteStreamEnvironment.executeRemotely(FlinkExecutionEnvironments.java:360)
> at 
> org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:310)
> at 
> org.apache.beam.runners.flink.FlinkStreamingPortablePipelineTranslator$StreamingTranslationContext.execute(FlinkStreamingPortablePipelineTranslator.java:173)
> at 
> org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:104)
> at 
> org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:80)
> at 
> org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:78)
> at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
> at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
> at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
> at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
> at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
> ... 13 more
> Caused by: java.lang.ClassCastException: 
> org.apache.beam.sdk.io.kafka.KafkaRecord cannot be cast to [B
> at 
> org.apache.beam.sdk.coders.ByteArrayCoder.encode(ByteArrayCoder.java:41)
> at 
> org.apache.beam.sdk.coders.LengthPrefixCoder.encode(LengthPrefixCoder.java:56)
> at 
> org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:105)
> at 
> org.apache.beam.sdk.values.ValueWithRecordId$ValueWithRecordIdCoder.encode(ValueWithRecordId.java:81)

[jira] [Work logged] (BEAM-8933) BigQuery IO should support read/write in Arrow format

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8933?focusedWorklogId=367877=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367877
 ]

ASF GitHub Bot logged work on BEAM-8933:


Author: ASF GitHub Bot
Created on: 08/Jan/20 00:12
Start Date: 08/Jan/20 00:12
Worklog Time Spent: 10m 
  Work Description: 11moon11 commented on pull request #10369: [BEAM-8933] 
BigQueryIO Arrow for read
URL: https://github.com/apache/beam/pull/10369#discussion_r364012529
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageQuerySource.java
 ##
 @@ -57,6 +59,32 @@
 priority,
 location,
 kmsKey,
+format,
+parseFn,
+outputCoder,
+bqServices);
+  }
+
+  public static  BigQueryStorageQuerySource create(
 
 Review comment:
   Do users depend on these constructors in any way? We don't want to break 
backwards compatibility.
 



Issue Time Tracking
---

Worklog Id: (was: 367877)
Time Spent: 4h 40m  (was: 4.5h)

> BigQuery IO should support read/write in Arrow format
> -
>
> Key: BEAM-8933
> URL: https://issues.apache.org/jira/browse/BEAM-8933
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> As of right now BigQuery uses Avro format for reading and writing.
> We should add a config to BigQueryIO to specify which format to use: Arrow or 
> Avro (with Avro as default).
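
On the backwards-compatibility question raised in the review comment, one common approach is to keep the existing factory signature and have it delegate to the new overload with the default format. The following is a minimal sketch under that assumption. QuerySourceSketch, QuerySource and DataFormat are hypothetical names and do not reflect the real BigQueryStorageQuerySource signature.

```java
class QuerySourceSketch {
  /** Hypothetical format switch, with Avro as the default described in the issue. */
  enum DataFormat {
    AVRO,
    ARROW
  }

  /** Hypothetical source with a private constructor and static factories. */
  static final class QuerySource {
    final String query;
    final DataFormat format;

    private QuerySource(String query, DataFormat format) {
      this.query = query;
      this.format = format;
    }

    /** Existing signature, kept so current callers still compile; defaults to Avro. */
    static QuerySource create(String query) {
      return create(query, DataFormat.AVRO);
    }

    /** New overload that lets callers opt in to another format. */
    static QuerySource create(String query, DataFormat format) {
      return new QuerySource(query, format);
    }
  }

  public static void main(String[] args) {
    System.out.println(QuerySource.create("SELECT 1").format);
    System.out.println(QuerySource.create("SELECT 1", DataFormat.ARROW).format);
  }
}
```

With this shape, callers of the old factory see no change in behavior, and the new parameter is only visible to code that asks for it.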





[jira] [Work logged] (BEAM-8716) Beam Dependency Update Request: org.apache.commons:commons-csv

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8716?focusedWorklogId=367875=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367875
 ]

ASF GitHub Bot logged work on BEAM-8716:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:58
Start Date: 07/Jan/20 23:58
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10523: [BEAM-8716] Update 
commons-csv to version 1.7
URL: https://github.com/apache/beam/pull/10523#issuecomment-571828871
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367875)
Time Spent: 1h 10m  (was: 1h)

> Beam Dependency Update Request: org.apache.commons:commons-csv
> --
>
> Key: BEAM-8716
> URL: https://issues.apache.org/jira/browse/BEAM-8716
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:43:39.656647 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:10:54.344035 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:24.533576 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:39.621510 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:02.313131 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:13.948482 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:51.471993 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-8716) Beam Dependency Update Request: org.apache.commons:commons-csv

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8716?focusedWorklogId=367873=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367873
 ]

ASF GitHub Bot logged work on BEAM-8716:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:58
Start Date: 07/Jan/20 23:58
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10523: [BEAM-8716] Update 
commons-csv to version 1.7
URL: https://github.com/apache/beam/pull/10523#issuecomment-571828902
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367873)
Time Spent: 50m  (was: 40m)

> Beam Dependency Update Request: org.apache.commons:commons-csv
> --
>
> Key: BEAM-8716
> URL: https://issues.apache.org/jira/browse/BEAM-8716
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:43:39.656647 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:10:54.344035 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:24.533576 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:39.621510 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:02.313131 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:13.948482 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:51.471993 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-8716) Beam Dependency Update Request: org.apache.commons:commons-csv

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8716?focusedWorklogId=367874=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367874
 ]

ASF GitHub Bot logged work on BEAM-8716:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:58
Start Date: 07/Jan/20 23:58
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10523: [BEAM-8716] Update 
commons-csv to version 1.7
URL: https://github.com/apache/beam/pull/10523#issuecomment-571828902
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367874)
Time Spent: 1h  (was: 50m)

> Beam Dependency Update Request: org.apache.commons:commons-csv
> --
>
> Key: BEAM-8716
> URL: https://issues.apache.org/jira/browse/BEAM-8716
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:43:39.656647 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:10:54.344035 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:24.533576 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:39.621510 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:02.313131 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:13.948482 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:51.471993 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-8716) Beam Dependency Update Request: org.apache.commons:commons-csv

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8716?focusedWorklogId=367871=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367871
 ]

ASF GitHub Bot logged work on BEAM-8716:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:58
Start Date: 07/Jan/20 23:58
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10523: [BEAM-8716] Update 
commons-csv to version 1.7
URL: https://github.com/apache/beam/pull/10523#issuecomment-571828871
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367871)
Time Spent: 40m  (was: 0.5h)

> Beam Dependency Update Request: org.apache.commons:commons-csv
> --
>
> Key: BEAM-8716
> URL: https://issues.apache.org/jira/browse/BEAM-8716
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:43:39.656647 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:10:54.344035 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:24.533576 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:39.621510 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:02.313131 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:13.948482 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:51.471993 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-csv. 
> The current version is 1.4. The latest version is 1.7 
> cc: [~kenn], [~kedin], [~apilloud], [~amaliujia], [~mingmxu], 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-8575) Add more Python validates runner tests

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8575?focusedWorklogId=367868=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367868
 ]

ASF GitHub Bot logged work on BEAM-8575:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:49
Start Date: 07/Jan/20 23:49
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10277: [BEAM-8575] 
Reenable passing VR tests.
URL: https://github.com/apache/beam/pull/10277#issuecomment-571826889
 
 
   Run Python 3.5 Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 367868)
Time Spent: 41h 40m  (was: 41.5h)

> Add more Python validates runner tests
> --
>
> Key: BEAM-8575
> URL: https://issues.apache.org/jira/browse/BEAM-8575
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: wendy liu
>Assignee: wendy liu
>Priority: Major
>  Time Spent: 41h 40m
>  Remaining Estimate: 0h
>
> This is the umbrella issue to track the work of adding more Python tests to 
> improve test coverage.





[jira] [Work logged] (BEAM-8575) Add more Python validates runner tests

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8575?focusedWorklogId=367867=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367867
 ]

ASF GitHub Bot logged work on BEAM-8575:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:49
Start Date: 07/Jan/20 23:49
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10277: [BEAM-8575] 
Reenable passing VR tests.
URL: https://github.com/apache/beam/pull/10277#issuecomment-571826819
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367867)
Time Spent: 41.5h  (was: 41h 20m)

> Add more Python validates runner tests
> --
>
> Key: BEAM-8575
> URL: https://issues.apache.org/jira/browse/BEAM-8575
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: wendy liu
>Assignee: wendy liu
>Priority: Major
>  Time Spent: 41.5h
>  Remaining Estimate: 0h
>
> This is the umbrella issue to track the work of adding more Python tests to 
> improve test coverage.





[jira] [Work started] (BEAM-7861) Make it easy to change between multi-process and multi-thread mode for Python Direct runners

2020-01-07 Thread Hannah Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-7861 started by Hannah Jiang.
--
> Make it easy to change between multi-process and multi-thread mode for Python 
> Direct runners
> 
>
> Key: BEAM-7861
> URL: https://issues.apache.org/jira/browse/BEAM-7861
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>
> BEAM-3645 makes it possible to run a map task in parallel.
> However, users need to change the runner when switching between multithreading and 
> multiprocessing mode.
> We want to add a flag (e.g. --use-multiprocess) to make the switch easy 
> without changing the runner each time.
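
The idea of the flag can be illustrated with a minimal, standalone sketch. It does not use Beam at all: the flag name `--use-multiprocess` is taken from the issue text above, while `--workers`, `executor_class`, `square` and `run` are hypothetical names made up for this illustration. The real runner option may look different.

```python
import argparse
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the (hypothetical) mode-switching flag."""
    parser = argparse.ArgumentParser()
    # Flag name borrowed from the issue text; not a confirmed Beam option.
    parser.add_argument("--use-multiprocess", action="store_true")
    # Illustrative only: number of worker threads or processes.
    parser.add_argument("--workers", type=int, default=2)
    return parser


def executor_class(use_multiprocess: bool):
    """Pick the pool type from the flag, so the calling code stays the same."""
    return ProcessPoolExecutor if use_multiprocess else ThreadPoolExecutor


def square(x: int) -> int:
    # Module-level function so it can also be sent to worker processes.
    return x * x


def run(argv, items):
    """Map `square` over `items` using the pool selected on the command line."""
    args = build_parser().parse_args(argv)
    pool_type = executor_class(args.use_multiprocess)
    with pool_type(max_workers=args.workers) as pool:
        return list(pool.map(square, items))
```

With this shape, `run([], data)` uses threads and `run(["--use-multiprocess"], data)` uses processes, and nothing else in the calling code changes. That is the kind of switch the issue asks for.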





[jira] [Work logged] (BEAM-8575) Add more Python validates runner tests

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8575?focusedWorklogId=367866=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367866
 ]

ASF GitHub Bot logged work on BEAM-8575:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:48
Start Date: 07/Jan/20 23:48
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10277: [BEAM-8575] 
Reenable passing VR tests.
URL: https://github.com/apache/beam/pull/10277#issuecomment-571826748
 
 
   Run Python Spark ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 367866)
Time Spent: 41h 20m  (was: 41h 10m)

> Add more Python validates runner tests
> --
>
> Key: BEAM-8575
> URL: https://issues.apache.org/jira/browse/BEAM-8575
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: wendy liu
>Assignee: wendy liu
>Priority: Major
>  Time Spent: 41h 20m
>  Remaining Estimate: 0h
>
> This is the umbrella issue to track the work of adding more Python tests to 
> improve test coverage.





[jira] [Work logged] (BEAM-8575) Add more Python validates runner tests

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8575?focusedWorklogId=367865=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367865
 ]

ASF GitHub Bot logged work on BEAM-8575:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:48
Start Date: 07/Jan/20 23:48
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10277: [BEAM-8575] 
Reenable passing VR tests.
URL: https://github.com/apache/beam/pull/10277#issuecomment-571826697
 
 
   Run Python Dataflow ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 367865)
Time Spent: 41h 10m  (was: 41h)

> Add more Python validates runner tests
> --
>
> Key: BEAM-8575
> URL: https://issues.apache.org/jira/browse/BEAM-8575
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: wendy liu
>Assignee: wendy liu
>Priority: Major
>  Time Spent: 41h 10m
>  Remaining Estimate: 0h
>
> This is the umbrella issue to track the work of adding more Python tests to 
> improve test coverage.





[jira] [Work logged] (BEAM-8993) [SQL] MongoDb should use predicate push-down

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8993?focusedWorklogId=367863=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367863
 ]

ASF GitHub Bot logged work on BEAM-8993:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:40
Start Date: 07/Jan/20 23:40
Worklog Time Spent: 10m 
  Work Description: 11moon11 commented on pull request #10417: [BEAM-8993] 
[SQL] MongoDB predicate push down.
URL: https://github.com/apache/beam/pull/10417#discussion_r364004589
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/mongodb/MongoDbFilterTest.java
 ##
 @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.meta.provider.mongodb;
+
+import static org.apache.beam.sdk.extensions.sql.meta.provider.test.TestTableProvider.PUSH_DOWN_OPTION;
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.Matchers.instanceOf;
+
+import com.alibaba.fastjson.JSON;
+import org.apache.beam.repackaged.core.org.apache.commons.lang3.tuple.Pair;
+import org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamRelNode;
+import org.apache.beam.sdk.extensions.sql.meta.Table;
+import org.apache.beam.sdk.extensions.sql.meta.provider.test.TestTableProvider;
+import org.apache.beam.sdk.extensions.sql.meta.provider.test.TestTableProvider.PushDownOptions;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+@RunWith(JUnit4.class)
+public class MongoDbFilterTest {
+  private static final Schema BASIC_SCHEMA =
+  Schema.builder()
+  .addInt32Field("unused1")
+  .addInt32Field("id")
+  .addStringField("name")
+  .addInt16Field("unused2")
+  .addBooleanField("b")
+  .build();
+
+  private BeamSqlEnv sqlEnv;
+
+  @Rule public TestPipeline pipeline = TestPipeline.create();
+
+  @Before
+  public void buildUp() {
+TestTableProvider tableProvider = new TestTableProvider();
+Table table = getTable("TEST", PushDownOptions.NONE);
+tableProvider.createTable(table);
+tableProvider.addRows(
+table.getName(),
+row(BASIC_SCHEMA, 100, 1, "one", (short) 100, true),
+row(BASIC_SCHEMA, 200, 2, "two", (short) 200, false));
+
+sqlEnv =
+BeamSqlEnv.builder(tableProvider)
+.setPipelineOptions(PipelineOptionsFactory.create())
+.build();
+  }
+
+  @Test
+  public void testIsSupported() {
+ImmutableList<Pair<String, Boolean>> sqlQueries =
+ImmutableList.of(
+Pair.of("select * from TEST where unused1=100", true),
+Pair.of("select * from TEST where unused1 in (100, 200)", true),
+Pair.of("select * from TEST where b", true),
+Pair.of("select * from TEST where not b", true),
+// Nested conjunction and disjunction is supported as long as child operations are
+// supported.
+Pair.of(
+"select * from TEST where unused1>100 and unused1<=200 and id<>1 and (name='two' or id=2)",
+true),
+// RegEx matching push-down is not implemented at the moment.
+Pair.of("select * from TEST where name like 'o%e'", false),
+// Complex operations, which modify a field before a comparison are not supported.
+Pair.of("select * from TEST where unused1+10=110", false),
+// Since unused2 is of type `short`, it will be cast to int32, making this a complex
+// operation.
+Pair.of("select * from TEST where unused2=200", false),
+// Operations involving more than one 

[jira] [Work logged] (BEAM-8993) [SQL] MongoDb should use predicate push-down

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8993?focusedWorklogId=367861=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367861
 ]

ASF GitHub Bot logged work on BEAM-8993:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:37
Start Date: 07/Jan/20 23:37
Worklog Time Spent: 10m 
  Work Description: 11moon11 commented on issue #10417: [BEAM-8993] [SQL] 
MongoDB predicate push down.
URL: https://github.com/apache/beam/pull/10417#issuecomment-571823986
 
 
   @TheNeuralBit 
   Could you re-run `java presubmit` and `sql postcommit` tests please?
 



Issue Time Tracking
---

Worklog Id: (was: 367861)
Time Spent: 1h 10m  (was: 1h)

> [SQL] MongoDb should use predicate push-down
> 
>
> Key: BEAM-8993
> URL: https://issues.apache.org/jira/browse/BEAM-8993
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> * Add a MongoDbFilter class, implementing BeamSqlTableFilter.
>  ** Support simple comparison operations.
>  ** Support boolean field.
>  ** Support nested conjunction/disjunction.
>  * Update MongoDbTable#buildIOReader
>  ** Construct a push-down filter from RexNodes.
>  ** Set filter to FindQuery.





[jira] [Work logged] (BEAM-8993) [SQL] MongoDb should use predicate push-down

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8993?focusedWorklogId=367860=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367860
 ]

ASF GitHub Bot logged work on BEAM-8993:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:35
Start Date: 07/Jan/20 23:35
Worklog Time Spent: 10m 
  Work Description: 11moon11 commented on pull request #10417: [BEAM-8993] 
[SQL] MongoDB predicate push down.
URL: https://github.com/apache/beam/pull/10417#discussion_r364003184
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/mongodb/MongoDbFilter.java
 ##
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.meta.provider.mongodb;
+
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.AND;
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.COMPARISON;
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.OR;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTableFilter;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLiteral;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.type.SqlTypeName;
+
+public class MongoDbFilter implements BeamSqlTableFilter {
 
 Review comment:
   Moved `MongoDbFilter`.
   `MongoDbTable` feels a little convoluted, but not too bad, I guess.
   What do you think?
 



Issue Time Tracking
---

Worklog Id: (was: 367860)
Time Spent: 1h  (was: 50m)

> [SQL] MongoDb should use predicate push-down
> 
>
> Key: BEAM-8993
> URL: https://issues.apache.org/jira/browse/BEAM-8993
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> * Add a MongoDbFilter class, implementing BeamSqlTableFilter.
>  ** Support simple comparison operations.
>  ** Support boolean field.
>  ** Support nested conjunction/disjunction.
>  * Update MongoDbTable#buildIOReader
>  ** Construct a push-down filter from RexNodes.
>  ** Set filter to FindQuery.





[jira] [Work logged] (BEAM-8993) [SQL] MongoDb should use predicate push-down

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8993?focusedWorklogId=367859=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367859
 ]

ASF GitHub Bot logged work on BEAM-8993:


Author: ASF GitHub Bot
Created on: 07/Jan/20 23:30
Start Date: 07/Jan/20 23:30
Worklog Time Spent: 10m 
  Work Description: 11moon11 commented on pull request #10417: [BEAM-8993] 
[SQL] MongoDB predicate push down.
URL: https://github.com/apache/beam/pull/10417#discussion_r364001928
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/mongodb/MongoDbFilter.java
 ##
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.meta.provider.mongodb;
+
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.AND;
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.COMPARISON;
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.OR;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTableFilter;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLiteral;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.type.SqlTypeName;
+
+public class MongoDbFilter implements BeamSqlTableFilter {
+  private List<RexNode> supported;
+  private List<RexNode> unsupported;
+
+  public MongoDbFilter(List<RexNode> predicateCNF) {
+supported = new ArrayList<>();
+unsupported = new ArrayList<>();
+
+for (RexNode node : predicateCNF) {
+  if (!node.getType().getSqlTypeName().equals(SqlTypeName.BOOLEAN)) {
+throw new RuntimeException(
+"Predicate node '"
++ node.getClass().getSimpleName()
++ "' should be a boolean expression, but was: "
++ node.getType().getSqlTypeName());
+  }
+
+  if (isSupported(node)) {
+supported.add(node);
+  } else {
+unsupported.add(node);
+  }
 
 Review comment:
   Makes sense, created a static initializer.
   Thanks for pointing this out!
 



Issue Time Tracking
---

Worklog Id: (was: 367859)
Time Spent: 50m  (was: 40m)

> [SQL] MongoDb should use predicate push-down
> 
>
> Key: BEAM-8993
> URL: https://issues.apache.org/jira/browse/BEAM-8993
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> * Add a MongoDbFilter class, implementing BeamSqlTableFilter.
>  ** Support simple comparison operations.
>  ** Support boolean field.
>  ** Support nested conjunction/disjunction.
>  * Update MongoDbTable#buildIOReader
>  ** Construct a push-down filter from RexNodes.
>  ** Set filter to FindQuery.





[jira] [Commented] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010152#comment-17010152
 ] 

Udi Meiri commented on BEAM-9041:
-

Is this a 2.18 blocker? 

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The SchemaCoder equals implementation relies on the SerializableFunction equals 
> method. This is error-prone because users rarely implement the equals method 
> for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.





[jira] [Commented] (BEAM-9042) AvroUtils.schemaCoder(schema) produces a not serializable SchemaCoder

2020-01-07 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010150#comment-17010150
 ] 

Udi Meiri commented on BEAM-9042:
-

Any updates on this? Is this a regression from 2.17?


> AvroUtils.schemaCoder(schema) produces a not serializable SchemaCoder
> -
>
> Key: BEAM-9042
> URL: https://issues.apache.org/jira/browse/BEAM-9042
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.18.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.18.0
>
>
> After some recent change in the implementation of 
> AvroUtils.schemaCoder(schema) the produced SchemaCoder is not serializable.
> You can reproduce this by doing this:
> {code:java}
> final SchemaCoder avroSchemaCoder = 
> AvroUtils.schemaCoder(schema);
>  CoderProperties.coderSerializable(avroSchemaCoder);{code}
> it produces this exception
> {code:java}
> unable to serialize SchemaCoder Field{name=bool, description=, type=FieldType{typeName=BOOLEAN, 
> nullable=false, logicalType=null, collectionElementType=null, 
> mapKeyType=null, mapValueType=null, rowSchema=null, metadata={}}}
> Field{name=int, description=, type=FieldType{typeName=INT32, nullable=false, 
> logicalType=null, collectionElementType=null, mapKeyType=null, 
> mapValueType=null, rowSchema=null, metadata={}}}
>  UUID: 6a1ff5b7-e3be-42c3-9b36-f8b53d487fcd delegateCoder: 
> org.apache.beam.sdk.coders.Coder$ByteBuddy$LzAYzILR@5f8ca63c
> java.lang.IllegalArgumentException: unable to serialize SchemaCoder Fields:
> Field{name=bool, description=, type=FieldType{typeName=BOOLEAN, 
> nullable=false, logicalType=null, collectionElementType=null, 
> mapKeyType=null, mapValueType=null, rowSchema=null, metadata={}}}
> Field{name=int, description=, type=FieldType{typeName=INT32, nullable=false, 
> logicalType=null, collectionElementType=null, mapKeyType=null, 
> mapValueType=null, rowSchema=null, metadata={}}}
>  UUID: 6a1ff5b7-e3be-42c3-9b36-f8b53d487fcd delegateCoder: 
> org.apache.beam.sdk.coders.Coder$ByteBuddy$LzAYzILR@5f8ca63c
>  at 
> org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:55)
>  at 
> org.apache.beam.sdk.util.SerializableUtils.clone(SerializableUtils.java:113)
>  at 
> org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:92)
>  at 
> org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:131)
>  at 
> org.apache.beam.sdk.testing.CoderProperties.coderSerializable(CoderProperties.java:181)
>  at 
> org.apache.beam.sdk.schemas.utils.AvroUtilsTest.testAvroSchemaCoders(AvroUtilsTest.java:543)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>  at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
>  at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
>  at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>  at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
>  at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>  at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>  at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>  at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>  at 
> 

[jira] [Work logged] (BEAM-8452) TriggerLoadJobs.process in bigquery_file_loads schema is type str

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8452?focusedWorklogId=367851=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367851
 ]

ASF GitHub Bot logged work on BEAM-8452:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:57
Start Date: 07/Jan/20 22:57
Worklog Time Spent: 10m 
  Work Description: stale[bot] commented on issue #1: BEAM-8452 - 
TriggerLoadJobs.process in bigquery_file_loads schema is type str
URL: https://github.com/apache/beam/pull/1#issuecomment-571811818
 
 
   This pull request has been marked as stale due to 60 days of inactivity. It 
will be closed in 1 week if no further activity occurs. If you think that’s 
incorrect or this pull request requires a review, please simply write any 
comment. If closed, you can revive the PR at any time and @mention a reviewer 
or discuss it on the d...@beam.apache.org list. Thank you for your 
contributions.
   
 



Issue Time Tracking
---

Worklog Id: (was: 367851)
Time Spent: 2h  (was: 1h 50m)

> TriggerLoadJobs.process in bigquery_file_loads schema is type str
> -
>
> Key: BEAM-8452
> URL: https://issues.apache.org/jira/browse/BEAM-8452
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.16.0
>Reporter: Noah Goodrich
>Assignee: Noah Goodrich
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
>  I've found a first issue with the BigQueryFileLoads Transform and the type 
> of the schema parameter.
> {code:java}
> Triggering job 
> beam_load_2019_10_11_140829_19_157670e4d458f0ff578fbe971a91b30a_1570802915 to 
> load data to BigQuery table <TableReference
>  datasetId: 'pyr_monat_dev'
>  projectId: 'icentris-ml-dev'
>  tableId: 'tree_user_types'>.Schema: {"fields": [{"name": "id", "type": 
> "INTEGER", "mode": "required"}, {"name": "description", "type": "STRING", 
> "mode": "nullable"}]}. Additional parameters: {}
> Retry with exponential backoff: waiting for 4.875033410381894 seconds before 
> retrying _insert_load_job because we caught exception: 
> apitools.base.protorpclite.messages.ValidationError: Expected type <class 
> 'apache_beam.io.gcp.internal.clients.bigquery.bigquery_v2_messages.TableSchema'>
>  for field schema, found {"fields": [{"name": "id", "type": "INTEGER", 
> "mode": "required"}, {"name": "description", "type"
> : "STRING", "mode": "nullable"}]} (type <class 'str'>)
>  Traceback for above exception (most recent call last):
>   File "/opt/conda/lib/python3.7/site-packages/apache_beam/utils/retry.py", 
> line 206, in wrapper
>     return fun(*args, **kwargs)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 344, in _insert_load_job
>     **additional_load_parameters
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 791, in __init__
>     setattr(self, name, value)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 973, in __setattr__
>     object.__setattr__(self, name, value)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1652, in __set__
>     super(MessageField, self).__set__(message_instance, value)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1293, in __set__
>     value = self.validate(value)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1400, in validate
>     return self.__validate(value, self.validate_element)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1358, in __validate
>     return validate_element(value)   
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1340, in validate_element
>     (self.type, name, value, type(value)))
>  
> {code}
>  
> The triggering code looks like this:
>  
> options.view_as(DebugOptions).experiments = ['use_beam_bq_sink']
>         # Save main session state so pickled functions and classes
>         # defined in __main__ can be unpickled
>         options.view_as(SetupOptions).save_main_session = True
>         custom_options = options.view_as(LoadSqlToBqOptions)
>         with beam.Pipeline(options=options) as p:
>             (p
>                 | "Initializing with empty collection" >> beam.Create([1])
>               

[jira] [Work logged] (BEAM-8717) Beam Dependency Update Request: org.apache.commons:commons-lang3

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8717?focusedWorklogId=367850=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367850
 ]

ASF GitHub Bot logged work on BEAM-8717:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:57
Start Date: 07/Jan/20 22:57
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10524: [BEAM-8717] 
Update commons-lang3 to version 3.9
URL: https://github.com/apache/beam/pull/10524
 
 
   R: @lukecwik 
 



Issue Time Tracking
---

Worklog Id: (was: 367850)
Remaining Estimate: 0h
Time Spent: 10m

> Beam Dependency Update Request: org.apache.commons:commons-lang3
> 
>
> Key: BEAM-8717
> URL: https://issues.apache.org/jira/browse/BEAM-8717
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:43:43.060362 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:11:02.203215 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:32.152530 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:47.060229 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:09.857528 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:21.614448 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:59.144846 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9041?focusedWorklogId=367848=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367848
 ]

ASF GitHub Bot logged work on BEAM-9041:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:56
Start Date: 07/Jan/20 22:56
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10492: [BEAM-9041, 
BEAM-9042] SchemaCoder equals should not rely on from/toRowFunction equality
URL: https://github.com/apache/beam/pull/10492#issuecomment-571811579
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367848)
Time Spent: 2h 20m  (was: 2h 10m)

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> SchemaCoder's equals implementation relies on the SerializableFunction equals 
> method. This is error-prone because users rarely implement the equals method 
> for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.
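
To illustrate the problem described above and the proposed alternative, here is a small sketch using plain Java serialization. It is not Beam's implementation: `SerFn` is a hypothetical stand-in for `SerializableFunction`, and `serialize` and `bytesEqual` are helper names invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class BytesEqualitySketch {
  /** Hypothetical stand-in for Beam's SerializableFunction. */
  interface SerFn<A, B> extends Serializable {
    B apply(A input);
  }

  /** A function class that, like most user code, does not override equals. */
  static final class AddOne implements SerFn<Integer, Integer> {
    private static final long serialVersionUID = 1L;

    @Override
    public Integer apply(Integer input) {
      return input + 1;
    }
  }

  /** Serializes a value with standard Java serialization. */
  static byte[] serialize(Serializable value) {
    try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Compares two values by their serialized form instead of equals(). */
  static boolean bytesEqual(Serializable a, Serializable b) {
    return Arrays.equals(serialize(a), serialize(b));
  }
}
```

Two separate `AddOne` instances are not equal under the default identity-based `equals`, but they serialize to the same bytes, so `bytesEqual` treats them as equal. Whether byte equality is the right notion for a given function depends on its serialized state, so this is only a sketch of the idea.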





[jira] [Work logged] (BEAM-9041) SchemaCoder equals should not rely on from/toRowFunction equality

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9041?focusedWorklogId=367849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367849
 ]

ASF GitHub Bot logged work on BEAM-9041:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:56
Start Date: 07/Jan/20 22:56
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10492: [BEAM-9041, 
BEAM-9042] SchemaCoder equals should not rely on from/toRowFunction equality
URL: https://github.com/apache/beam/pull/10492#issuecomment-571811579
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367849)
Time Spent: 2.5h  (was: 2h 20m)

> SchemaCoder equals should not rely on from/toRowFunction equality
> -
>
> Key: BEAM-9041
> URL: https://issues.apache.org/jira/browse/BEAM-9041
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
> Fix For: 2.18.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> SchemaCoder's equals implementation relies on the SerializableFunction equals 
> method. This is error-prone because users rarely implement the equals method 
> for a SerializableFunction. One alternative would be to rely on bytes 
> equality for this.





[jira] [Work logged] (BEAM-8993) [SQL] MongoDb should use predicate push-down

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8993?focusedWorklogId=367844=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367844
 ]

ASF GitHub Bot logged work on BEAM-8993:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:54
Start Date: 07/Jan/20 22:54
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #10417: 
[BEAM-8993] [SQL] MongoDB predicate push down.
URL: https://github.com/apache/beam/pull/10417#discussion_r363946226
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/mongodb/MongoDbFilter.java
 ##
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.meta.provider.mongodb;
+
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.AND;
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.COMPARISON;
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.OR;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTableFilter;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLiteral;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.type.SqlTypeName;
+
+public class MongoDbFilter implements BeamSqlTableFilter {
+  private List<RexNode> supported;
+  private List<RexNode> unsupported;
+
+  public MongoDbFilter(List<RexNode> predicateCNF) {
+    supported = new ArrayList<>();
+    unsupported = new ArrayList<>();
+
+    for (RexNode node : predicateCNF) {
+      if (!node.getType().getSqlTypeName().equals(SqlTypeName.BOOLEAN)) {
+        throw new RuntimeException(
+            "Predicate node '"
+                + node.getClass().getSimpleName()
+                + "' should be a boolean expression, but was: "
+                + node.getType().getSqlTypeName());
+      }
+
+      if (isSupported(node)) {
+        supported.add(node);
+      } else {
+        unsupported.add(node);
+      }
 
 Review comment:
   You shouldn't do work in a constructor. I think this constructor should 
accept a supported and an unsupported `List` and maybe be private. There could 
be a static initializer that will accept a single list and partition it into 
supported and unsupported.
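
A minimal sketch of the suggestion above, not the PR's actual code: a private constructor that only assigns fields, plus a static factory that partitions a single list. To stay self-contained it uses a generic element type and a caller-supplied `Predicate` in place of Calcite's `RexNode` and the class's own `isSupported` method; the names `PartitionedFilter` and `create` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-in for MongoDbFilter; T plays the role of RexNode. */
final class PartitionedFilter<T> {
  private final List<T> supported;
  private final List<T> unsupported;

  // The constructor only assigns fields; it does no partitioning work.
  private PartitionedFilter(List<T> supported, List<T> unsupported) {
    this.supported = Collections.unmodifiableList(supported);
    this.unsupported = Collections.unmodifiableList(unsupported);
  }

  /** Static factory: splits one list into supported and unsupported nodes. */
  static <T> PartitionedFilter<T> create(List<T> nodes, Predicate<T> isSupported) {
    List<T> supported = new ArrayList<>();
    List<T> unsupported = new ArrayList<>();
    for (T node : nodes) {
      if (isSupported.test(node)) {
        supported.add(node);
      } else {
        unsupported.add(node);
      }
    }
    return new PartitionedFilter<>(supported, unsupported);
  }

  List<T> getSupported() {
    return supported;
  }

  List<T> getUnsupported() {
    return unsupported;
  }
}
```

With this shape, validation such as the boolean-type check could also move into the factory, so that constructing the object itself cannot throw.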
 



Issue Time Tracking
---

Worklog Id: (was: 367844)
Time Spent: 20m  (was: 10m)

> [SQL] MongoDb should use predicate push-down
> 
>
> Key: BEAM-8993
> URL: https://issues.apache.org/jira/browse/BEAM-8993
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> * Add a MongoDbFilter class, implementing BeamSqlTableFilter.
>  ** Support simple comparison operations.
>  ** Support boolean field.
>  ** Support nested conjunction/disjunction.
>  * Update MongoDbTable#buildIOReader
>  ** Construct a push-down filter from RexNodes.
>  ** Set filter to FindQuery.





[jira] [Work logged] (BEAM-8993) [SQL] MongoDb should use predicate push-down

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8993?focusedWorklogId=367845=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367845
 ]

ASF GitHub Bot logged work on BEAM-8993:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:54
Start Date: 07/Jan/20 22:54
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #10417: 
[BEAM-8993] [SQL] MongoDB predicate push down.
URL: https://github.com/apache/beam/pull/10417#discussion_r363951251
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/mongodb/MongoDbFilterTest.java
 ##
 @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.meta.provider.mongodb;
+
+import static org.apache.beam.sdk.extensions.sql.meta.provider.test.TestTableProvider.PUSH_DOWN_OPTION;
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.Matchers.instanceOf;
+
+import com.alibaba.fastjson.JSON;
+import org.apache.beam.repackaged.core.org.apache.commons.lang3.tuple.Pair;
+import org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamCalcRel;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamRelNode;
+import org.apache.beam.sdk.extensions.sql.meta.Table;
+import org.apache.beam.sdk.extensions.sql.meta.provider.test.TestTableProvider;
+import org.apache.beam.sdk.extensions.sql.meta.provider.test.TestTableProvider.PushDownOptions;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.values.Row;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+@RunWith(JUnit4.class)
+public class MongoDbFilterTest {
+  private static final Schema BASIC_SCHEMA =
+      Schema.builder()
+          .addInt32Field("unused1")
+          .addInt32Field("id")
+          .addStringField("name")
+          .addInt16Field("unused2")
+          .addBooleanField("b")
+          .build();
+
+  private BeamSqlEnv sqlEnv;
+
+  @Rule public TestPipeline pipeline = TestPipeline.create();
+
+  @Before
+  public void buildUp() {
+    TestTableProvider tableProvider = new TestTableProvider();
+    Table table = getTable("TEST", PushDownOptions.NONE);
+    tableProvider.createTable(table);
+    tableProvider.addRows(
+        table.getName(),
+        row(BASIC_SCHEMA, 100, 1, "one", (short) 100, true),
+        row(BASIC_SCHEMA, 200, 2, "two", (short) 200, false));
+
+    sqlEnv =
+        BeamSqlEnv.builder(tableProvider)
+            .setPipelineOptions(PipelineOptionsFactory.create())
+            .build();
+  }
+
+  @Test
+  public void testIsSupported() {
+    ImmutableList<Pair<String, Boolean>> sqlQueries =
+        ImmutableList.of(
+            Pair.of("select * from TEST where unused1=100", true),
+            Pair.of("select * from TEST where unused1 in (100, 200)", true),
+            Pair.of("select * from TEST where b", true),
+            Pair.of("select * from TEST where not b", true),
+            // Nested conjunction and disjunction is supported as long as child operations are
+            // supported.
+            Pair.of(
+                "select * from TEST where unused1>100 and unused1<=200 and id<>1 and (name='two' or id=2)",
+                true),
+            // RegEx matching push-down is not implemented at the moment.
+            Pair.of("select * from TEST where name like 'o%e'", false),
+            // Complex operations, which modify a field before a comparison are not supported.
+            Pair.of("select * from TEST where unused1+10=110", false),
+            // Since unused2 is of type `short`, it will be cast to int32, making this a complex
+            // operation.
+            Pair.of("select * from TEST where unused2=200", false),
+            // Operations involving more than 

[jira] [Work logged] (BEAM-8993) [SQL] MongoDb should use predicate push-down

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8993?focusedWorklogId=367846=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367846
 ]

ASF GitHub Bot logged work on BEAM-8993:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:54
Start Date: 07/Jan/20 22:54
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #10417: 
[BEAM-8993] [SQL] MongoDB predicate push down.
URL: https://github.com/apache/beam/pull/10417#discussion_r363956576
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/mongodb/MongoDbFilter.java
 ##
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.meta.provider.mongodb;
+
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.AND;
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.COMPARISON;
+import static org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind.OR;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTableFilter;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLiteral;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.SqlKind;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.sql.type.SqlTypeName;
+
+public class MongoDbFilter implements BeamSqlTableFilter {
 
 Review comment:
   You might consider whether you can make this an inner class of `MongoDbTable`, 
since it's only used in that class's `constructFilter` method. If that's too big 
of a change, I think it should at least be package-private.
 



Issue Time Tracking
---

Worklog Id: (was: 367846)
Time Spent: 40m  (was: 0.5h)

> [SQL] MongoDb should use predicate push-down
> 
>
> Key: BEAM-8993
> URL: https://issues.apache.org/jira/browse/BEAM-8993
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> * Add a MongoDbFilter class, implementing BeamSqlTableFilter.
>  ** Support simple comparison operations.
>  ** Support boolean field.
>  ** Support nested conjunction/disjunction.
>  * Update MongoDbTable#buildIOReader
>  ** Construct a push-down filter from RexNodes.
>  ** Set filter to FindQuery.





[jira] [Work logged] (BEAM-8937) Add a Jenkins job running GroupByKey load test on Java with Flink in Portability mode

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8937?focusedWorklogId=367841=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367841
 ]

ASF GitHub Bot logged work on BEAM-8937:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:53
Start Date: 07/Jan/20 22:53
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10495: [BEAM-8937] Add 
Jenkins job definitions for GroupByKey Java load test on Flink
URL: https://github.com/apache/beam/pull/10495#issuecomment-571810659
 
 
   Not sure which suites were supposed to trigger here since it only changes 
groovy files
 



Issue Time Tracking
---

Worklog Id: (was: 367841)
Time Spent: 15h 50m  (was: 15h 40m)

> Add a Jenkins job running GroupByKey load test on Java with Flink in 
> Portability mode
> -
>
> Key: BEAM-8937
> URL: https://issues.apache.org/jira/browse/BEAM-8937
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michal Walenia
>Assignee: Michal Walenia
>Priority: Minor
>  Time Spent: 15h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8937) Add a Jenkins job running GroupByKey load test on Java with Flink in Portability mode

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8937?focusedWorklogId=367842=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367842
 ]

ASF GitHub Bot logged work on BEAM-8937:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:53
Start Date: 07/Jan/20 22:53
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10495: [BEAM-8937] Add 
Jenkins job definitions for GroupByKey Java load test on Flink
URL: https://github.com/apache/beam/pull/10495#issuecomment-571810690
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 367842)
Time Spent: 16h  (was: 15h 50m)

> Add a Jenkins job running GroupByKey load test on Java with Flink in 
> Portability mode
> -
>
> Key: BEAM-8937
> URL: https://issues.apache.org/jira/browse/BEAM-8937
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michal Walenia
>Assignee: Michal Walenia
>Priority: Minor
>  Time Spent: 16h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8960) Add an option for user to be able to opt out of using insert id for BigQuery streaming insert.

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8960?focusedWorklogId=367840=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367840
 ]

ASF GitHub Bot logged work on BEAM-8960:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:52
Start Date: 07/Jan/20 22:52
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #10427: 
[BEAM-8960]: Add an option for user to opt out of using insert id for BigQuery 
streaming insert.
URL: https://github.com/apache/beam/pull/10427
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 367840)
Remaining Estimate: 20h 10m  (was: 20h 20m)
Time Spent: 3h 50m  (was: 3h 40m)

> Add an option for user to be able to opt out of using insert id for BigQuery 
> streaming insert.
> --
>
> Key: BEAM-8960
> URL: https://issues.apache.org/jira/browse/BEAM-8960
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Yiru Tang
>Assignee: Yiru Tang
>Priority: Minor
>   Original Estimate: 24h
>  Time Spent: 3h 50m
>  Remaining Estimate: 20h 10m
>
> BigQuery streaming insert ids offer best-effort insert deduplication. If users 
> choose to opt out of using insert ids, they could potentially opt into 
> using our new streaming backend, which gives higher speed and more 
> quota. Insert id deduplication is best effort and doesn't give an ultimate 
> just-once guarantee.





[jira] [Work logged] (BEAM-8937) Add a Jenkins job running GroupByKey load test on Java with Flink in Portability mode

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8937?focusedWorklogId=367839=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367839
 ]

ASF GitHub Bot logged work on BEAM-8937:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:50
Start Date: 07/Jan/20 22:50
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10495: [BEAM-8937] Add 
Jenkins job definitions for GroupByKey Java load test on Flink
URL: https://github.com/apache/beam/pull/10495#issuecomment-571809735
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 367839)
Time Spent: 15h 40m  (was: 15.5h)

> Add a Jenkins job running GroupByKey load test on Java with Flink in 
> Portability mode
> -
>
> Key: BEAM-8937
> URL: https://issues.apache.org/jira/browse/BEAM-8937
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michal Walenia
>Assignee: Michal Walenia
>Priority: Minor
>  Time Spent: 15h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8937) Add a Jenkins job running GroupByKey load test on Java with Flink in Portability mode

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8937?focusedWorklogId=367838=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367838
 ]

ASF GitHub Bot logged work on BEAM-8937:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:50
Start Date: 07/Jan/20 22:50
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10495: [BEAM-8937] Add 
Jenkins job definitions for GroupByKey Java load test on Flink
URL: https://github.com/apache/beam/pull/10495#issuecomment-571809676
 
 
   cool, this triggered a RAT check, but not other checks :(
 



Issue Time Tracking
---

Worklog Id: (was: 367838)
Time Spent: 15.5h  (was: 15h 20m)

> Add a Jenkins job running GroupByKey load test on Java with Flink in 
> Portability mode
> -
>
> Key: BEAM-8937
> URL: https://issues.apache.org/jira/browse/BEAM-8937
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michal Walenia
>Assignee: Michal Walenia
>Priority: Minor
>  Time Spent: 15.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8937) Add a Jenkins job running GroupByKey load test on Java with Flink in Portability mode

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8937?focusedWorklogId=367836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367836
 ]

ASF GitHub Bot logged work on BEAM-8937:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:49
Start Date: 07/Jan/20 22:49
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10495: [BEAM-8937] Add 
Jenkins job definitions for GroupByKey Java load test on Flink
URL: https://github.com/apache/beam/pull/10495#issuecomment-571809440
 
 
   ignore this comment
 



Issue Time Tracking
---

Worklog Id: (was: 367836)
Time Spent: 15h 20m  (was: 15h 10m)

> Add a Jenkins job running GroupByKey load test on Java with Flink in 
> Portability mode
> -
>
> Key: BEAM-8937
> URL: https://issues.apache.org/jira/browse/BEAM-8937
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michal Walenia
>Assignee: Michal Walenia
>Priority: Minor
>  Time Spent: 15h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9062) Improve Beam's assert_that+equal_to error message to describe what elements cause assertion to fail

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9062?focusedWorklogId=367835=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367835
 ]

ASF GitHub Bot logged work on BEAM-9062:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:46
Start Date: 07/Jan/20 22:46
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #10504: [BEAM-9062] 
Improve assertion error for equal_to
URL: https://github.com/apache/beam/pull/10504
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 367835)
Time Spent: 3h 20m  (was: 3h 10m)

> Improve Beam's assert_that+equal_to error message to describe what elements 
> cause assertion to fail 
> 
>
> Key: BEAM-9062
> URL: https://issues.apache.org/jira/browse/BEAM-9062
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-8717) Beam Dependency Update Request: org.apache.commons:commons-lang3

2020-01-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8717:
---
Status: Open  (was: Triage Needed)

> Beam Dependency Update Request: org.apache.commons:commons-lang3
> 
>
> Key: BEAM-8717
> URL: https://issues.apache.org/jira/browse/BEAM-8717
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Priority: Major
>
>  - 2019-11-15 19:43:43.060362 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:11:02.203215 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:32.152530 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:47.060229 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:09.857528 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:21.614448 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:59.144846 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Assigned] (BEAM-8717) Beam Dependency Update Request: org.apache.commons:commons-lang3

2020-01-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-8717:
--

Assignee: Ismaël Mejía

> Beam Dependency Update Request: org.apache.commons:commons-lang3
> 
>
> Key: BEAM-8717
> URL: https://issues.apache.org/jira/browse/BEAM-8717
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Ismaël Mejía
>Priority: Major
>
>  - 2019-11-15 19:43:43.060362 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:11:02.203215 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:17:32.152530 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:16:47.060229 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:17:09.857528 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:12:21.614448 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:15:59.144846 
> -
> Please consider upgrading the dependency 
> org.apache.commons:commons-lang3. 
> The current version is 3.6. The latest version is 3.9 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 





[jira] [Work logged] (BEAM-9062) Improve Beam's assert_that+equal_to error message to describe what elements cause assertion to fail

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9062?focusedWorklogId=367830=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367830
 ]

ASF GitHub Bot logged work on BEAM-9062:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:26
Start Date: 07/Jan/20 22:26
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10504: [BEAM-9062] 
Improve assertion error for equal_to
URL: https://github.com/apache/beam/pull/10504#issuecomment-571801799
 
 
   Interesting, so it looks like the Python2_PVR_Flink PreCommit and Portable_Python 
PreCommit Jenkins suites are still not triggered, although they did trigger 
before. Anyway, as long as the Python PreCommit and PythonLint suites pass, this PR 
should be safe to merge.
   
 



Issue Time Tracking
---

Worklog Id: (was: 367830)
Time Spent: 3h 10m  (was: 3h)

> Improve Beam's assert_that+equal_to error message to describe what elements 
> cause assertion to fail 
> 
>
> Key: BEAM-9062
> URL: https://issues.apache.org/jira/browse/BEAM-9062
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9062) Improve Beam's assert_that+equal_to error message to describe what elements cause assertion to fail

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9062?focusedWorklogId=367822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367822
 ]

ASF GitHub Bot logged work on BEAM-9062:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:19
Start Date: 07/Jan/20 22:19
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10504: [BEAM-9062] 
Improve assertion error for equal_to
URL: https://github.com/apache/beam/pull/10504#issuecomment-571799700
 
 
   For some reason Jenkins does not trigger precommits on this PR. Let's try 
some manual commands to trigger them.
 



Issue Time Tracking
---

Worklog Id: (was: 367822)
Time Spent: 3h  (was: 2h 50m)

> Improve Beam's assert_that+equal_to error message to describe what elements 
> cause assertion to fail 
> 
>
> Key: BEAM-9062
> URL: https://issues.apache.org/jira/browse/BEAM-9062
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9062) Improve Beam's assert_that+equal_to error message to describe what elements cause assertion to fail

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9062?focusedWorklogId=367821=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367821
 ]

ASF GitHub Bot logged work on BEAM-9062:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:19
Start Date: 07/Jan/20 22:19
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10504: [BEAM-9062] 
Improve assertion error for equal_to
URL: https://github.com/apache/beam/pull/10504#issuecomment-571799666
 
 
   I'll try to copy-paste an earlier comment that triggered execution of all 
precommit jobs and see what happens.
 



Issue Time Tracking
---

Worklog Id: (was: 367821)
Time Spent: 2h 50m  (was: 2h 40m)

> Improve Beam's assert_that+equal_to error message to describe what elements 
> cause assertion to fail 
> 
>
> Key: BEAM-9062
> URL: https://issues.apache.org/jira/browse/BEAM-9062
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9062) Improve Beam's assert_that+equal_to error message to describe what elements cause assertion to fail

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9062?focusedWorklogId=367818=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367818
 ]

ASF GitHub Bot logged work on BEAM-9062:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:18
Start Date: 07/Jan/20 22:18
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10504: [BEAM-9062] 
Improve assertion error for equal_to
URL: https://github.com/apache/beam/pull/10504#issuecomment-571799333
 
 
   precommits
 



Issue Time Tracking
---

Worklog Id: (was: 367818)
Time Spent: 2h 40m  (was: 2.5h)

> Improve Beam's assert_that+equal_to error message to describe what elements 
> cause assertion to fail 
> 
>
> Key: BEAM-9062
> URL: https://issues.apache.org/jira/browse/BEAM-9062
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9062) Improve Beam's assert_that+equal_to error message to describe what elements cause assertion to fail

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9062?focusedWorklogId=367817=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367817
 ]

ASF GitHub Bot logged work on BEAM-9062:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:17
Start Date: 07/Jan/20 22:17
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10504: [BEAM-9062] 
Improve assertion error for equal_to
URL: https://github.com/apache/beam/pull/10504#issuecomment-571799127
 
 
   ignore this comment
 



Issue Time Tracking
---

Worklog Id: (was: 367817)
Time Spent: 2.5h  (was: 2h 20m)

> Improve Beam's assert_that+equal_to error message to describe what elements 
> cause assertion to fail 
> 
>
> Key: BEAM-9062
> URL: https://issues.apache.org/jira/browse/BEAM-9062
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9062) Improve Beam's assert_that+equal_to error message to describe what elements cause assertion to fail

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9062?focusedWorklogId=367810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367810
 ]

ASF GitHub Bot logged work on BEAM-9062:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:06
Start Date: 07/Jan/20 22:06
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #10504: [BEAM-9062] 
Improve assertion error for equal_to
URL: https://github.com/apache/beam/pull/10504#issuecomment-571795276
 
 
   Run PythonLint PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 367810)
Time Spent: 2h 20m  (was: 2h 10m)

> Improve Beam's assert_that+equal_to error message to describe what elements 
> cause assertion to fail 
> 
>
> Key: BEAM-9062
> URL: https://issues.apache.org/jira/browse/BEAM-9062
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8496) remove SDF translators in flink streaming transform translator

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8496?focusedWorklogId=367809=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367809
 ]

ASF GitHub Bot logged work on BEAM-8496:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:05
Start Date: 07/Jan/20 22:05
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9903: [BEAM-8496] remove 
SDF translators from flink translator
URL: https://github.com/apache/beam/pull/9903#issuecomment-571795045
 
 
   Run SQL Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 367809)
Time Spent: 4h 20m  (was: 4h 10m)

> remove SDF translators in flink streaming transform translator
> --
>
> Key: BEAM-8496
> URL: https://issues.apache.org/jira/browse/BEAM-8496
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Since the URN of SDF has been moved to runners-core-construction-java, we need to 
> remove the SDF translators from the Flink streaming transform translator.
> Otherwise, as the failed Nexmark Jenkins 
> [job|https://builds.apache.org/job/beam_PostCommit_Java_Nexmark_Flink/4128/console]
>  shows, it causes a duplicate transformer to be registered in 
> [PTransformTranslation.KnownTransformPayloadTranslator()|https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java#L290]





[jira] [Work logged] (BEAM-8496) remove SDF translators in flink streaming transform translator

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8496?focusedWorklogId=367808=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367808
 ]

ASF GitHub Bot logged work on BEAM-8496:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:05
Start Date: 07/Jan/20 22:05
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9903: [BEAM-8496] remove 
SDF translators from flink translator
URL: https://github.com/apache/beam/pull/9903#issuecomment-571794951
 
 
   Run SQL Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 367808)
Time Spent: 4h 10m  (was: 4h)

> remove SDF translators in flink streaming transform translator
> --
>
> Key: BEAM-8496
> URL: https://issues.apache.org/jira/browse/BEAM-8496
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Since the URN of SDF has been moved to runners-core-construction-java, we need to 
> remove the SDF translators from the Flink streaming transform translator.
> Otherwise, as the failed Nexmark Jenkins 
> [job|https://builds.apache.org/job/beam_PostCommit_Java_Nexmark_Flink/4128/console]
>  shows, it causes a duplicate transformer to be registered in 
> [PTransformTranslation.KnownTransformPayloadTranslator()|https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java#L290]





[jira] [Work logged] (BEAM-8496) remove SDF translators in flink streaming transform translator

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8496?focusedWorklogId=367807=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367807
 ]

ASF GitHub Bot logged work on BEAM-8496:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:05
Start Date: 07/Jan/20 22:05
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9903: [BEAM-8496] remove 
SDF translators from flink translator
URL: https://github.com/apache/beam/pull/9903#issuecomment-571794743
 
 
   Run Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 367807)
Time Spent: 4h  (was: 3h 50m)

> remove SDF translators in flink streaming transform translator
> --
>
> Key: BEAM-8496
> URL: https://issues.apache.org/jira/browse/BEAM-8496
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kai Jiang
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Since the URN of SDF has been moved to runners-core-construction-java, we need to 
> remove the SDF translators from the Flink streaming transform translator.
> Otherwise, as the failed Nexmark Jenkins 
> [job|https://builds.apache.org/job/beam_PostCommit_Java_Nexmark_Flink/4128/console]
>  shows, it causes a duplicate transformer to be registered in 
> [PTransformTranslation.KnownTransformPayloadTranslator()|https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java#L290]





[jira] [Resolved] (BEAM-8794) Projects should be handled by an IOPushDownRule before applying AggregateProjectMergeRule

2020-01-07 Thread Kirill Kozlov (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirill Kozlov resolved BEAM-8794.
-
Fix Version/s: 2.18.0
   Resolution: Fixed

> Projects should be handled by an IOPushDownRule before applying 
> AggregateProjectMergeRule
> -
>
> Key: BEAM-8794
> URL: https://issues.apache.org/jira/browse/BEAM-8794
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
> Fix For: 2.18.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> It is more efficient to push-down projected fields at an IO level (vs merging 
> with an Aggregate), when supported.
> When running queries like:
> {code:java}
> select SUM(score) as total_score from  group by name{code}
> Projects get merged with an aggregate. As a result, Calc (after an 
> IOSourceRel) projects all fields and the BeamIOPushDown rule does not know which 
> fields can be dropped, thus not dropping any.
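
As a rough illustration of the push-down described above: this is a plain-Python
sketch, not Beam or Calcite code, and `TABLE`, `scan`, and `total_score_by_name`
are made-up names. A source that is told which fields the query needs can skip
the rest, while the aggregate result stays the same.

```python
from collections import defaultdict

# Hypothetical table with more columns than the query uses.
TABLE = [
    {'name': 'a', 'score': 1, 'team': 'x', 'ts': 10},
    {'name': 'b', 'score': 2, 'team': 'y', 'ts': 11},
    {'name': 'a', 'score': 3, 'team': 'x', 'ts': 12},
]


def scan(table, fields=None):
  """Reads rows; when `fields` is given, only those columns are materialized."""
  for row in table:
    if fields is None:
      yield dict(row)
    else:
      yield {f: row[f] for f in fields}


def total_score_by_name(rows):
  """SUM(score) ... GROUP BY name over an iterable of row dicts."""
  totals = defaultdict(int)
  for row in rows:
    totals[row['name']] += row['score']
  return dict(totals)


# Without push-down the aggregate sees every column of every row.
without_pushdown = total_score_by_name(scan(TABLE))
# With push-down the source drops `team` and `ts` before the aggregate runs.
with_pushdown = total_score_by_name(scan(TABLE, fields=['name', 'score']))
```

Both calls return the same totals; the difference is only in how many columns
the source has to read.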





[jira] [Work logged] (BEAM-8624) Implement FnService for status api in Dataflow runner

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8624?focusedWorklogId=367802=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367802
 ]

ASF GitHub Bot logged work on BEAM-8624:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:01
Start Date: 07/Jan/20 22:01
Worklog Time Spent: 10m 
  Work Description: y1chi commented on issue #10115: [BEAM-8624] Implement 
Worker Status FnService in Dataflow runner
URL: https://github.com/apache/beam/pull/10115#issuecomment-571793463
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 367802)
Time Spent: 14h  (was: 13h 50m)

> Implement FnService for status api in Dataflow runner
> -
>
> Key: BEAM-8624
> URL: https://issues.apache.org/jira/browse/BEAM-8624
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Major
>  Time Spent: 14h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8951) Stop using nose in load tests

2020-01-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8951?focusedWorklogId=367803=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-367803
 ]

ASF GitHub Bot logged work on BEAM-8951:


Author: ASF GitHub Bot
Created on: 07/Jan/20 22:01
Start Date: 07/Jan/20 22:01
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #10435: [BEAM-8951] 
Stop using nose in load tests
URL: https://github.com/apache/beam/pull/10435#discussion_r363956004
 
 

 ##
 File path: sdks/python/apache_beam/testing/load_tests/load_test.py
 ##
 @@ -17,19 +17,50 @@
 from __future__ import absolute_import
 
 import json
-import logging
-import unittest
 
 from apache_beam.metrics import MetricsFilter
 from apache_beam.testing.load_tests.load_test_metrics_utils import MetricsReader
 from apache_beam.testing.test_pipeline import TestPipeline
 
 
-class LoadTest(unittest.TestCase):
-  def parseTestPipelineOptions(self, options=None):
+class LoadTest(object):
+  def __init__(self):
+    self.pipeline = TestPipeline(is_integration_test=True)
 
 Review comment:
   I spoke with @udim: if we migrate these tests to use pytest later, sticking 
with `--test-pipeline-options` may ease this migration. We haven't fully paved 
the path to switch to pytest, but @udim has an (old) in-progress PR 
https://github.com/apache/beam/pull/7949 in case you are curious to give pytest 
a try for load test purposes.
 



Issue Time Tracking
---

Worklog Id: (was: 367803)
Time Spent: 4h 40m  (was: 4.5h)

> Stop using nose in load tests
> -
>
> Key: BEAM-8951
> URL: https://issues.apache.org/jira/browse/BEAM-8951
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The community is considering moving away from nose to pytest: 
> https://issues.apache.org/jira/browse/BEAM-3713. We should change the way 
> Python load tests are run: instead of being subclasses of 
> `unittest.TestCase`, they could be plain Python scripts, just like the wordcount 
> examples. This brings one additional benefit: the _LOAD_TEST_ENABLED_ guard 
> will no longer be needed and can be safely removed.
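
A minimal sketch of the plain-script shape proposed above, assuming nothing
about the actual Beam implementation: `--input_options`, `CountingLoadTest`,
and the record-counting body are hypothetical stand-ins, and a real load test
would build and run a Beam pipeline inside `test()`.

```python
import argparse
import json


class LoadTest(object):
  """Base class for a load test that runs as a plain script."""

  def __init__(self, argv=None):
    parser = argparse.ArgumentParser()
    # '--input_options' is a made-up flag for this sketch.
    parser.add_argument('--input_options', default='{}')
    # Unrecognized flags are kept aside, e.g. to hand to a pipeline later.
    known_args, self.pipeline_args = parser.parse_known_args(argv)
    self.input_options = json.loads(known_args.input_options)

  def test(self):
    """Subclasses build and run their workload here."""
    raise NotImplementedError

  def run(self):
    # Called explicitly by the script, so no test runner discovers it
    # and no enable guard is needed to keep it from running by accident.
    return self.test()


class CountingLoadTest(LoadTest):
  def test(self):
    # Stand-in for a real pipeline: count synthetic records.
    return sum(1 for _ in range(self.input_options.get('num_records', 0)))


if __name__ == '__main__':
  print(CountingLoadTest().run())
```

Because nothing subclasses `unittest.TestCase`, the script only does work when
it is invoked directly, which is the property the issue relies on to drop the
guard.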





[jira] [Commented] (BEAM-9027) [SQL] ZetaSQL unparsing should produce valid result

2020-01-07 Thread Kirill Kozlov (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010124#comment-17010124
 ] 

Kirill Kozlov commented on BEAM-9027:
-

Brief update on this issue.

The following unparsing errors have been fixed:
 * Calcite cannot unparse RexNode back to bytes literal
 * Calcite cannot unparse some floating point literals correctly
 * Calcite cannot unparse some string literals correctly

Other issues from the initial list (INTERVAL and CAST) still need to be fixed.

> [SQL] ZetaSQL unparsing should produce valid result
> ---
>
> Key: BEAM-9027
> URL: https://issues.apache.org/jira/browse/BEAM-9027
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql-zetasql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> * ZetaSQL does not recognize keyword INTERVAL
>  * Calcite cannot unparse RexNode back to bytes literal
>  * Calcite cannot unparse some floating point literals correctly
>  * Calcite cannot unparse some string literals correctly
>  * Calcite cannot unparse types correctly for CAST function





[jira] [Updated] (BEAM-9045) Implement an Ignite runner using Apache Ignite compute grid

2020-01-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9045:
---
Status: Open  (was: Triage Needed)

> Implement an Ignite runner using Apache Ignite compute grid
> ---
>
> Key: BEAM-9045
> URL: https://issues.apache.org/jira/browse/BEAM-9045
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Saikat Maitra
>Assignee: Saikat Maitra
>Priority: Major
>
> Implement an Ignite runner using Apache Ignite compute grid.
> Runner guide [https://beam.apache.org/contribute/runner-guide/]
> Capability Matrix 
> [https://beam.apache.org/documentation/runners/capability-matrix/]





[jira] [Updated] (BEAM-8593) Define expected behavior of running ZetaSQL query on tables with unsupported field types

2020-01-07 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8593:
---
Status: Open  (was: Triage Needed)

> Define expected behavior of running ZetaSQL query on tables with unsupported 
> field types
> 
>
> Key: BEAM-8593
> URL: https://issues.apache.org/jira/browse/BEAM-8593
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql-zetasql
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> What should be the expected behavior if a user runs a ZetaSQL query on a table 
> with a field type (e.g. MAP) that is not supported by ZetaSQL?
> More context: 
> [https://github.com/apache/beam/pull/10020#issuecomment-551368105]
>  




