[jira] [Work logged] (BEAM-7810) Allow ValueProvider arguments to ReadFromDatastore

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7810?focusedWorklogId=378091&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378091
 ]

ASF GitHub Bot logged work on BEAM-7810:


Author: ASF GitHub Bot
Created on: 28/Jan/20 08:08
Start Date: 28/Jan/20 08:08
Worklog Time Spent: 10m 
  Work Description: EDjur commented on issue #10683: [BEAM-7810] Added 
ValueProvider support for Datastore query namespaces
URL: https://github.com/apache/beam/pull/10683#issuecomment-579127724
 
 
   Cheers for opening and fixing that.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378091)
Time Spent: 1h 10m  (was: 1h)

> Allow ValueProvider arguments to ReadFromDatastore
> --
>
> Key: BEAM-7810
> URL: https://issues.apache.org/jira/browse/BEAM-7810
> Project: Beam
>  Issue Type: New Feature
>  Components: io-py-gcp
>Reporter: Udi Meiri
>Assignee: Elias Djurfeldt
>Priority: Minor
> Fix For: 2.20.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> From: 
> https://stackoverflow.com/questions/56748893/trying-to-achieve-runtime-value-of-namespace-of-datastore-in-dataflow-template



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378095&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378095
 ]

ASF GitHub Bot logged work on BEAM-9162:


Author: ASF GitHub Bot
Created on: 28/Jan/20 08:23
Start Date: 28/Jan/20 08:23
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade 
Jackson to version 2.10.2
URL: https://github.com/apache/beam/pull/10643#issuecomment-579132447
 
 
   Run JavaPortabilityApi PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378095)
Time Spent: 1h 20m  (was: 1h 10m)

> Upgrade Jackson to version 2.10.2
> -
>
> Key: BEAM-9162
> URL: https://issues.apache.org/jira/browse/BEAM-9162
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Jackson has a new way to deal with [deserialization security 
> issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in 
> 2.10.x so worth the upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378096
 ]

ASF GitHub Bot logged work on BEAM-9162:


Author: ASF GitHub Bot
Created on: 28/Jan/20 08:23
Start Date: 28/Jan/20 08:23
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade 
Jackson to version 2.10.2
URL: https://github.com/apache/beam/pull/10643#issuecomment-579132447
 
 
   Run JavaPortabilityApi PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378096)
Time Spent: 1.5h  (was: 1h 20m)

> Upgrade Jackson to version 2.10.2
> -
>
> Key: BEAM-9162
> URL: https://issues.apache.org/jira/browse/BEAM-9162
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Jackson has a new way to deal with [deserialization security 
> issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in 
> 2.10.x so worth the upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378097
 ]

ASF GitHub Bot logged work on BEAM-9162:


Author: ASF GitHub Bot
Created on: 28/Jan/20 08:25
Start Date: 28/Jan/20 08:25
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade 
Jackson to version 2.10.2
URL: https://github.com/apache/beam/pull/10643#issuecomment-579133146
 
 
   @suztomo maybe you know a way I can run the linkagechecker analysis in the 
full set of Beam modules? I think is more scalable to have a task for that that 
we invoke during PRs to validate that no regressions are included as suggested 
by Luke. (I can do that in Maven but my gradle-fu is still not good enough).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378097)
Time Spent: 1h 40m  (was: 1.5h)

> Upgrade Jackson to version 2.10.2
> -
>
> Key: BEAM-9162
> URL: https://issues.apache.org/jira/browse/BEAM-9162
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Jackson has a new way to deal with [deserialization security 
> issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in 
> 2.10.x so worth the upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits

2020-01-28 Thread Jira
Ismaël Mejía created BEAM-9204:
--

 Summary: HBase SDF @SplitRestriction does not take the range input 
into account to restrict splits
 Key: BEAM-9204
 URL: https://issues.apache.org/jira/browse/BEAM-9204
 Project: Beam
  Issue Type: Bug
  Components: io-java-hbase
Reporter: Ismaël Mejía
Assignee: Ismaël Mejía






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9204:
---
Status: Open  (was: Triage Needed)

> HBase SDF @SplitRestriction does not take the range input into account to 
> restrict splits
> -
>
> Key: BEAM-9204
> URL: https://issues.apache.org/jira/browse/BEAM-9204
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-hbase
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>
> This is an issue because it is common if split is called multiple times work 
> this will produce repeated work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9204:
---
Description: This is an issue because it is common if split is called 
multiple times work this will produce repeated work.

> HBase SDF @SplitRestriction does not take the range input into account to 
> restrict splits
> -
>
> Key: BEAM-9204
> URL: https://issues.apache.org/jira/browse/BEAM-9204
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-hbase
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>
> This is an issue because it is common if split is called multiple times work 
> this will produce repeated work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-4735) Make HBaseIO.read() based on SDF

2020-01-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024990#comment-17024990
 ] 

Ismaël Mejía commented on BEAM-4735:


Oh interesting finding! Just filled BEAM-9204 to tackle it.

> Make HBaseIO.read() based on SDF
> 
>
> Key: BEAM-4735
> URL: https://issues.apache.org/jira/browse/BEAM-4735
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-hbase
>Reporter: Ismaël Mejía
>Priority: Minor
>
> BEAM-4020 introduces HBaseIO reads based on SDF. So far the read() method 
> still uses the Source based API for two reasons:
> 1. Most distributed runners don't supports Bounded SDF today.
> 2. SDF does not support Dynamic Work Rebalancing but the Source API of HBase 
> already supports it so changing it means losing some functionality.
> Once there is improvements in both (1) and (2) we should consider moving the 
> main read() function to use the SDF API and remove the Source based 
> implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9204?focusedWorklogId=378121&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378121
 ]

ASF GitHub Bot logged work on BEAM-9204:


Author: ASF GitHub Bot
Created on: 28/Jan/20 09:45
Start Date: 28/Jan/20 09:45
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10700: [BEAM-9204] 
Fix HBase SplitRestriction to be based on provided Range
URL: https://github.com/apache/beam/pull/10700
 
 
   R: @lukecwik 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378121)
Remaining Estimate: 0h
Time Spent: 10m

> HBase SDF @SplitRestriction does not take the range input into account to 
> restrict splits
> -
>
> Key: BEAM-9204
> URL: https://issues.apache.org/jira/browse/BEAM-9204
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-hbase
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is an issue because it is common if split is called multiple times work 
> this will produce repeated work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9204) HBase SDF @SplitRestriction does not take the range input into account to restrict splits

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9204:
---
Description: This is an issue because that the split is called multiple 
times and in this cas it will produce repeated work.  (was: This is an issue 
because it is common if split is called multiple times work this will produce 
repeated work.)

> HBase SDF @SplitRestriction does not take the range input into account to 
> restrict splits
> -
>
> Key: BEAM-9204
> URL: https://issues.apache.org/jira/browse/BEAM-9204
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-hbase
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is an issue because that the split is called multiple times and in this 
> cas it will produce repeated work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9205) Regression in validates runner tests configuration in spark module

2020-01-28 Thread Etienne Chauchot (Jira)
Etienne Chauchot created BEAM-9205:
--

 Summary: Regression in validates runner tests configuration in 
spark module
 Key: BEAM-9205
 URL: https://issues.apache.org/jira/browse/BEAM-9205
 Project: Beam
  Issue Type: Test
  Components: runner-spark
Reporter: Etienne Chauchot
Assignee: Etienne Chauchot


Not all the metrics tests are run: at least MetricsPusher is no more run



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9205) Regression in validates runner tests configuration in spark module

2020-01-28 Thread Etienne Chauchot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot updated BEAM-9205:
---
Parent: BEAM-3310
Issue Type: Sub-task  (was: Test)

> Regression in validates runner tests configuration in spark module
> --
>
> Key: BEAM-9205
> URL: https://issues.apache.org/jira/browse/BEAM-9205
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
>
> Not all the metrics tests are run: at least MetricsPusher is no more run



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (BEAM-9205) Regression in validates runner tests configuration in spark module

2020-01-28 Thread Etienne Chauchot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot closed BEAM-9205.
--
Fix Version/s: 2.20.0
   Resolution: Fixed

> Regression in validates runner tests configuration in spark module
> --
>
> Key: BEAM-9205
> URL: https://issues.apache.org/jira/browse/BEAM-9205
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Major
> Fix For: 2.20.0
>
>
> Not all the metrics tests are run: at least MetricsPusher is no more run



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378151&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378151
 ]

ASF GitHub Bot logged work on BEAM-8972:


Author: ASF GitHub Bot
Created on: 28/Jan/20 10:59
Start Date: 28/Jan/20 10:59
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add 
Jenkins job with Combine test for portable Java
URL: https://github.com/apache/beam/pull/10386#issuecomment-579191281
 
 
   Run seed job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378151)
Time Spent: 4.5h  (was: 4h 20m)

> Add a Jenkins job running Combine load test on Java with Flink in Portability 
> mode
> --
>
> Key: BEAM-8972
> URL: https://issues.apache.org/jira/browse/BEAM-8972
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michał Walenia
>Assignee: Michał Walenia
>Priority: Minor
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378152
 ]

ASF GitHub Bot logged work on BEAM-8972:


Author: ASF GitHub Bot
Created on: 28/Jan/20 11:01
Start Date: 28/Jan/20 11:01
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add 
Jenkins job with Combine test for portable Java
URL: https://github.com/apache/beam/pull/10386#issuecomment-579192080
 
 
   Run Load Tests Java Combine Portable Flink Batch
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378152)
Time Spent: 4h 40m  (was: 4.5h)

> Add a Jenkins job running Combine load test on Java with Flink in Portability 
> mode
> --
>
> Key: BEAM-8972
> URL: https://issues.apache.org/jira/browse/BEAM-8972
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michał Walenia
>Assignee: Michał Walenia
>Priority: Minor
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378154&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378154
 ]

ASF GitHub Bot logged work on BEAM-8972:


Author: ASF GitHub Bot
Created on: 28/Jan/20 11:02
Start Date: 28/Jan/20 11:02
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add 
Jenkins job with Combine test for portable Java
URL: https://github.com/apache/beam/pull/10386#issuecomment-57919
 
 
   Run Load Tests Java Combine Portable Flink Batch
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378154)
Time Spent: 5h  (was: 4h 50m)

> Add a Jenkins job running Combine load test on Java with Flink in Portability 
> mode
> --
>
> Key: BEAM-8972
> URL: https://issues.apache.org/jira/browse/BEAM-8972
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michał Walenia
>Assignee: Michał Walenia
>Priority: Minor
>  Time Spent: 5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378153&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378153
 ]

ASF GitHub Bot logged work on BEAM-8972:


Author: ASF GitHub Bot
Created on: 28/Jan/20 11:02
Start Date: 28/Jan/20 11:02
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add 
Jenkins job with Combine test for portable Java
URL: https://github.com/apache/beam/pull/10386#issuecomment-579192080
 
 
   Run Load Tests Java Combine Portable Flink Batch
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378153)
Time Spent: 4h 50m  (was: 4h 40m)

> Add a Jenkins job running Combine load test on Java with Flink in Portability 
> mode
> --
>
> Key: BEAM-8972
> URL: https://issues.apache.org/jira/browse/BEAM-8972
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michał Walenia
>Assignee: Michał Walenia
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8925) Beam Dependency Update Request: org.apache.tika:tika-core

2020-01-28 Thread Colm O hEigeartaigh (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colm O hEigeartaigh reassigned BEAM-8925:
-

Assignee: Colm O hEigeartaigh

> Beam Dependency Update Request: org.apache.tika:tika-core
> -
>
> Key: BEAM-8925
> URL: https://issues.apache.org/jira/browse/BEAM-8925
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Colm O hEigeartaigh
>Priority: Major
>
>  - 2019-12-09 12:20:22.212496 
> -
> Please consider upgrading the dependency org.apache.tika:tika-core. 
> The current version is 1.20. The latest version is 1.23 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:20:53.356760 
> -
> Please consider upgrading the dependency org.apache.tika:tika-core. 
> The current version is 1.20. The latest version is 1.23 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:15:58.081400 
> -
> Please consider upgrading the dependency org.apache.tika:tika-core. 
> The current version is 1.20. The latest version is 1.23 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:19:33.456649 
> -
> Please consider upgrading the dependency org.apache.tika:tika-core. 
> The current version is 1.20. The latest version is 1.23 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-13 12:18:38.940974 
> -
> Please consider upgrading the dependency org.apache.tika:tika-core. 
> The current version is 1.20. The latest version is 1.23 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-20 12:16:03.428169 
> -
> Please consider upgrading the dependency org.apache.tika:tika-core. 
> The current version is 1.20. The latest version is 1.23 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-27 12:17:01.302466 
> -
> Please consider upgrading the dependency org.apache.tika:tika-core. 
> The current version is 1.20. The latest version is 1.23 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9063) Migrate docker images to apache namespace.

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9063?focusedWorklogId=378163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378163
 ]

ASF GitHub Bot logged work on BEAM-9063:


Author: ASF GitHub Bot
Created on: 28/Jan/20 11:38
Start Date: 28/Jan/20 11:38
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #10612: [DO NOT 
MERGE][BEAM-9063] migrate docker images to apache
URL: https://github.com/apache/beam/pull/10612#issuecomment-579205196
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378163)
Time Spent: 5.5h  (was: 5h 20m)

> Migrate docker images to apache namespace.
> --
>
> Key: BEAM-9063
> URL: https://issues.apache.org/jira/browse/BEAM-9063
> Project: Beam
>  Issue Type: Task
>  Components: beam-community
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> https://hub.docker.com/u/apache



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9063) Migrate docker images to apache namespace.

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9063?focusedWorklogId=378167&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378167
 ]

ASF GitHub Bot logged work on BEAM-9063:


Author: ASF GitHub Bot
Created on: 28/Jan/20 11:44
Start Date: 28/Jan/20 11:44
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #10612: [DO NOT 
MERGE][BEAM-9063] migrate docker images to apache
URL: https://github.com/apache/beam/pull/10612#discussion_r371752564
 
 

 ##
 File path: release/src/main/scripts/build_release_candidate.sh
 ##
 @@ -45,6 +45,9 @@ PYTHON_ARTIFACTS_DIR=python
 BEAM_ROOT_DIR=beam
 WEBSITE_ROOT_DIR=beam-site
 
+DOCKER_IMAGE_DEFAULT_REPO_ROOT=apache
+DOCKER_IMAGE_DEFAULT_REPO_PREFIX=beam_
 
 Review comment:
   Would it be worth sourcing these from a script?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378167)
Time Spent: 5h 40m  (was: 5.5h)

> Migrate docker images to apache namespace.
> --
>
> Key: BEAM-9063
> URL: https://issues.apache.org/jira/browse/BEAM-9063
> Project: Beam
>  Issue Type: Task
>  Components: beam-community
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> https://hub.docker.com/u/apache



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9063) Migrate docker images to apache namespace.

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9063?focusedWorklogId=378168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378168
 ]

ASF GitHub Bot logged work on BEAM-9063:


Author: ASF GitHub Bot
Created on: 28/Jan/20 11:44
Start Date: 28/Jan/20 11:44
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #10612: [DO NOT 
MERGE][BEAM-9063] migrate docker images to apache
URL: https://github.com/apache/beam/pull/10612#discussion_r371752615
 
 

 ##
 File path: release/src/main/scripts/publish_docker_images.sh
 ##
 @@ -24,6 +24,9 @@
 
 set -e
 
+DOCKER_IMAGE_DEFAULT_REPO_ROOT=apache
+DOCKER_IMAGE_DEFAULT_REPO_PREFIX=beam_
 
 Review comment:
   Would it be worth sourcing these from a script?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378168)
Time Spent: 5h 40m  (was: 5.5h)

> Migrate docker images to apache namespace.
> --
>
> Key: BEAM-9063
> URL: https://issues.apache.org/jira/browse/BEAM-9063
> Project: Beam
>  Issue Type: Task
>  Components: beam-community
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> https://hub.docker.com/u/apache



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378184&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378184
 ]

ASF GitHub Bot logged work on BEAM-8972:


Author: ASF GitHub Bot
Created on: 28/Jan/20 12:30
Start Date: 28/Jan/20 12:30
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add 
Jenkins job with Combine test for portable Java
URL: https://github.com/apache/beam/pull/10386#issuecomment-579223188
 
 
   run seed job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378184)
Time Spent: 5h 10m  (was: 5h)

> Add a Jenkins job running Combine load test on Java with Flink in Portability 
> mode
> --
>
> Key: BEAM-8972
> URL: https://issues.apache.org/jira/browse/BEAM-8972
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michał Walenia
>Assignee: Michał Walenia
>Priority: Minor
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8550) @RequiresTimeSortedInput DoFn annotation

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8550?focusedWorklogId=378199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378199
 ]

ASF GitHub Bot logged work on BEAM-8550:


Author: ASF GitHub Bot
Created on: 28/Jan/20 12:51
Start Date: 28/Jan/20 12:51
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #8774: [BEAM-8550] Requires 
time sorted input
URL: https://github.com/apache/beam/pull/8774#issuecomment-579230688
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378199)
Time Spent: 9h 20m  (was: 9h 10m)

> @RequiresTimeSortedInput DoFn annotation
> 
>
> Key: BEAM-8550
> URL: https://issues.apache.org/jira/browse/BEAM-8550
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> Implement new annotation {{@RequiresTimeSortedInput}} for stateful DoFn as 
> described in [design 
> document|https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/edit?usp=sharing].
>  First implementation will assume that:
>   - time is defined by timestamp in associated WindowedValue
>   - allowed lateness is explicitly zero and all late elements are dropped 
> (due to being out of order)
> The above properties are considered temporary and will be resolved by 
> subsequent extensions (backwards compatible).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378206
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Jan/20 12:57
Start Date: 28/Jan/20 12:57
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10627: [BEAM-2535] Support 
outputTimestamp and watermark holds in processing timers.
URL: https://github.com/apache/beam/pull/10627#issuecomment-579232716
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378206)
Time Spent: 13h  (was: 12h 50m)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378207
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Jan/20 12:57
Start Date: 28/Jan/20 12:57
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10627: [BEAM-2535] Support 
outputTimestamp and watermark holds in processing timers.
URL: https://github.com/apache/beam/pull/10627#issuecomment-579232716
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378207)
Time Spent: 13h 10m  (was: 13h)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378208
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Jan/20 12:57
Start Date: 28/Jan/20 12:57
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10627: [BEAM-2535] Support 
outputTimestamp and watermark holds in processing timers.
URL: https://github.com/apache/beam/pull/10627#issuecomment-579232967
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378208)
Time Spent: 13h 20m  (was: 13h 10m)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8972) Add a Jenkins job running Combine load test on Java with Flink in Portability mode

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8972?focusedWorklogId=378212&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378212
 ]

ASF GitHub Bot logged work on BEAM-8972:


Author: ASF GitHub Bot
Created on: 28/Jan/20 13:06
Start Date: 28/Jan/20 13:06
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10386: [BEAM-8972] Add 
Jenkins job with Combine test for portable Java
URL: https://github.com/apache/beam/pull/10386#issuecomment-579236550
 
 
   Run Load Tests Java Combine Portable Flink Batch
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378212)
Time Spent: 5h 20m  (was: 5h 10m)

> Add a Jenkins job running Combine load test on Java with Flink in Portability 
> mode
> --
>
> Key: BEAM-8972
> URL: https://issues.apache.org/jira/browse/BEAM-8972
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Michał Walenia
>Assignee: Michał Walenia
>Priority: Minor
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8298) Implement state caching for side inputs

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8298?focusedWorklogId=378222&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378222
 ]

ASF GitHub Bot logged work on BEAM-8298:


Author: ASF GitHub Bot
Created on: 28/Jan/20 13:21
Start Date: 28/Jan/20 13:21
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #10679: [BEAM-8298] Fully 
specify the necessary details to support side input caching.
URL: https://github.com/apache/beam/pull/10679#issuecomment-579242154
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378222)
Time Spent: 1h  (was: 50m)

> Implement state caching for side inputs
> ---
>
> Key: BEAM-8298
> URL: https://issues.apache.org/jira/browse/BEAM-8298
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-py-harness
>Reporter: Maximilian Michels
>Assignee: Jing Chen
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Caching is currently only implemented for user state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378223&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378223
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 28/Jan/20 13:25
Start Date: 28/Jan/20 13:25
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on issue #10644: [BEAM-7427] 
Refactore JmsCheckpointMark to be usage via Coder
URL: https://github.com/apache/beam/pull/10644#issuecomment-579243908
 
 
   As discussed, I've keep `JmsCheckpointMark` in a dedicated class.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378223)
Time Spent: 8h 10m  (was: 8h)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Mourad
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378225&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378225
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Jan/20 13:26
Start Date: 28/Jan/20 13:26
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10627: [BEAM-2535] Support 
outputTimestamp and watermark holds in processing timers.
URL: https://github.com/apache/beam/pull/10627#issuecomment-579244488
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378225)
Time Spent: 13.5h  (was: 13h 20m)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378227&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378227
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Jan/20 13:26
Start Date: 28/Jan/20 13:26
Worklog Time Spent: 10m 
  Work Description: mwalenia commented on issue #10627: [BEAM-2535] Support 
outputTimestamp and watermark holds in processing timers.
URL: https://github.com/apache/beam/pull/10627#issuecomment-579232967
 
 
   retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378227)
Time Spent: 13h 40m  (was: 13.5h)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378228&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378228
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Jan/20 13:28
Start Date: 28/Jan/20 13:28
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10627: [BEAM-2535] Support 
outputTimestamp and watermark holds in processing timers.
URL: https://github.com/apache/beam/pull/10627#issuecomment-579245093
 
 
   Run Direct ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378228)
Time Spent: 13h 50m  (was: 13h 40m)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9146) [Python] PTransform that integrates Video Intelligence functionality

2020-01-28 Thread Elias Djurfeldt (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025109#comment-17025109
 ] 

Elias Djurfeldt commented on BEAM-9146:
---

Are there any other PTransforms that call external API's in Beam? I'm working 
on implementing this but running into some design considerations regarding 
mocking the video intelligence API for test purposes.

> [Python] PTransform that integrates Video Intelligence functionality
> 
>
> Key: BEAM-9146
> URL: https://issues.apache.org/jira/browse/BEAM-9146
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-py-gcp
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
>
> The goal is to create a PTransform that integrates Google Cloud Video 
> Intelligence functionality [1].
> The transform should be able to take both video GCS location or video data 
> bytes as an input.
> The transform should be put into _sdks/python/apache_beam/io/gcp/ai_ folder.
> [1] https://cloud.google.com/video-intelligence/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378246
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 28/Jan/20 13:50
Start Date: 28/Jan/20 13:50
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10644: [BEAM-7427] 
Refactore JmsCheckpointMark to be usage via Coder
URL: https://github.com/apache/beam/pull/10644#issuecomment-579254727
 
 
   Sorry for the delay, the extra commit with the fixes looks good. I was 
thinking that since the stored messages are not needed to restore the progress 
of the reads on `UnboundedJmsReader` maybe the simplest fix is just to let them 
transient as you proposed.
   About the State changes maybe let's do those in a subsequent PR so we can 
get this fix out more quickly. WDYT If you agree just let the class as it was 
before and then I will merge.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378246)
Time Spent: 8h 20m  (was: 8h 10m)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Mourad
>Priority: Major
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378249&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378249
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 28/Jan/20 13:52
Start Date: 28/Jan/20 13:52
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10644: [BEAM-7427] 
Refactore JmsCheckpointMark to be usage via Coder
URL: https://github.com/apache/beam/pull/10644#issuecomment-579255395
 
 
   Oh you already get rid of state hehe, my bad ok looking again.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378249)
Time Spent: 8.5h  (was: 8h 20m)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Mourad
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378253
 ]

ASF GitHub Bot logged work on BEAM-8618:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:06
Start Date: 28/Jan/20 14:06
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear 
down unused DoFns periodically in Python SDK harness.
URL: https://github.com/apache/beam/pull/10655#discussion_r371817547
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##
 @@ -280,6 +283,7 @@ def get(self, instruction_id, bundle_descriptor_id):
 try:
   # pop() is threadsafe
   processor = self.cached_bundle_processors[bundle_descriptor_id].pop()
+  self.last_access_time[bundle_descriptor_id] = time.time()
 except IndexError:
 
 Review comment:
   This won't update the access time when we first create the processor in the 
except block.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378253)
Time Spent: 40m  (was: 0.5h)

> Tear down unused DoFns periodically in Python SDK harness
> -
>
> Key: BEAM-8618
> URL: https://issues.apache.org/jira/browse/BEAM-8618
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Per the discussion in the ML, detail can be found [1],  the teardown of DoFns 
> should be supported in the portability framework. It happens at two places:
> 1) Upon the control service termination
> 2) Tear down the unused DoFns periodically
> The aim of this JIRA is to add support for tear down the unused DoFns 
> periodically in Python SDK harness.
> [1] 
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378255
 ]

ASF GitHub Bot logged work on BEAM-8618:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:06
Start Date: 28/Jan/20 14:06
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear 
down unused DoFns periodically in Python SDK harness.
URL: https://github.com/apache/beam/pull/10655#discussion_r371817978
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##
 @@ -315,18 +319,47 @@ def release(self, instruction_id):
 """
 descriptor_id, processor = 
self.active_bundle_processors.pop(instruction_id)
 processor.reset()
+self.last_access_time[descriptor_id] = time.time()
 self.cached_bundle_processors[descriptor_id].append(processor)
 
   def shutdown(self):
 """
 Shutdown all ``BundleProcessor``s in the cache.
 """
+if self.periodic_shutdown:
+  self.periodic_shutdown.cancel()
+  self.periodic_shutdown.join()
+  self.periodic_shutdown = None
+
 for instruction_id in self.active_bundle_processors:
   self.active_bundle_processors[instruction_id][1].shutdown()
   del self.active_bundle_processors[instruction_id]
 for cached_bundle_processors in self.cached_bundle_processors.values():
-  while len(cached_bundle_processors) > 0:
-cached_bundle_processors.pop().shutdown()
+  BundleProcessorCache._shutdown_cached_bundle_processors(
+  cached_bundle_processors)
+
+  def _schedule_periodic_shutdown(self):
+def shutdown_inactive_bundle_processors():
+  for descriptor_id, last_access_time in self.last_access_time.items():
+if time.time() - last_access_time > 60:
 
 Review comment:
   We may want to make this configurable.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378255)
Time Spent: 1h  (was: 50m)

> Tear down unused DoFns periodically in Python SDK harness
> -
>
> Key: BEAM-8618
> URL: https://issues.apache.org/jira/browse/BEAM-8618
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Per the discussion in the ML, detail can be found [1],  the teardown of DoFns 
> should be supported in the portability framework. It happens at two places:
> 1) Upon the control service termination
> 2) Tear down the unused DoFns periodically
> The aim of this JIRA is to add support for tear down the unused DoFns 
> periodically in Python SDK harness.
> [1] 
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378257&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378257
 ]

ASF GitHub Bot logged work on BEAM-8618:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:06
Start Date: 28/Jan/20 14:06
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear 
down unused DoFns periodically in Python SDK harness.
URL: https://github.com/apache/beam/pull/10655#discussion_r371819018
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##
 @@ -315,18 +319,47 @@ def release(self, instruction_id):
 """
 descriptor_id, processor = 
self.active_bundle_processors.pop(instruction_id)
 processor.reset()
+self.last_access_time[descriptor_id] = time.time()
 self.cached_bundle_processors[descriptor_id].append(processor)
 
   def shutdown(self):
 """
 Shutdown all ``BundleProcessor``s in the cache.
 """
+if self.periodic_shutdown:
+  self.periodic_shutdown.cancel()
+  self.periodic_shutdown.join()
+  self.periodic_shutdown = None
+
 for instruction_id in self.active_bundle_processors:
   self.active_bundle_processors[instruction_id][1].shutdown()
   del self.active_bundle_processors[instruction_id]
 for cached_bundle_processors in self.cached_bundle_processors.values():
-  while len(cached_bundle_processors) > 0:
-cached_bundle_processors.pop().shutdown()
+  BundleProcessorCache._shutdown_cached_bundle_processors(
+  cached_bundle_processors)
+
+  def _schedule_periodic_shutdown(self):
+def shutdown_inactive_bundle_processors():
+  for descriptor_id, last_access_time in self.last_access_time.items():
+if time.time() - last_access_time > 60:
+  BundleProcessorCache._shutdown_cached_bundle_processors(
+  self.cached_bundle_processors[descriptor_id])
+
+from apache_beam.runners.worker.data_plane import PeriodicThread
 
 Review comment:
   I think we should move this to the import section.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378257)
Time Spent: 1h 10m  (was: 1h)

> Tear down unused DoFns periodically in Python SDK harness
> -
>
> Key: BEAM-8618
> URL: https://issues.apache.org/jira/browse/BEAM-8618
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Per the discussion in the ML, detail can be found [1],  the teardown of DoFns 
> should be supported in the portability framework. It happens at two places:
> 1) Upon the control service termination
> 2) Tear down the unused DoFns periodically
> The aim of this JIRA is to add support for tear down the unused DoFns 
> periodically in Python SDK harness.
> [1] 
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378254
 ]

ASF GitHub Bot logged work on BEAM-8618:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:06
Start Date: 28/Jan/20 14:06
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear 
down unused DoFns periodically in Python SDK harness.
URL: https://github.com/apache/beam/pull/10655#discussion_r371820931
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##
 @@ -315,18 +319,47 @@ def release(self, instruction_id):
 """
 descriptor_id, processor = 
self.active_bundle_processors.pop(instruction_id)
 processor.reset()
+self.last_access_time[descriptor_id] = time.time()
 self.cached_bundle_processors[descriptor_id].append(processor)
 
   def shutdown(self):
 """
 Shutdown all ``BundleProcessor``s in the cache.
 """
+if self.periodic_shutdown:
+  self.periodic_shutdown.cancel()
+  self.periodic_shutdown.join()
+  self.periodic_shutdown = None
+
 for instruction_id in self.active_bundle_processors:
   self.active_bundle_processors[instruction_id][1].shutdown()
   del self.active_bundle_processors[instruction_id]
 for cached_bundle_processors in self.cached_bundle_processors.values():
-  while len(cached_bundle_processors) > 0:
-cached_bundle_processors.pop().shutdown()
+  BundleProcessorCache._shutdown_cached_bundle_processors(
+  cached_bundle_processors)
+
+  def _schedule_periodic_shutdown(self):
+def shutdown_inactive_bundle_processors():
+  for descriptor_id, last_access_time in self.last_access_time.items():
+if time.time() - last_access_time > 60:
+  BundleProcessorCache._shutdown_cached_bundle_processors(
+  self.cached_bundle_processors[descriptor_id])
 
 Review comment:
   Don't we have to remove the bundle processor list from the dictionary? 
Otherwise we may access a cached shutdown bundle processor.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378254)
Time Spent: 50m  (was: 40m)

> Tear down unused DoFns periodically in Python SDK harness
> -
>
> Key: BEAM-8618
> URL: https://issues.apache.org/jira/browse/BEAM-8618
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Per the discussion in the ML, detail can be found [1],  the teardown of DoFns 
> should be supported in the portability framework. It happens at two places:
> 1) Upon the control service termination
> 2) Tear down the unused DoFns periodically
> The aim of this JIRA is to add support for tear down the unused DoFns 
> periodically in Python SDK harness.
> [1] 
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=378256&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378256
 ]

ASF GitHub Bot logged work on BEAM-8618:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:06
Start Date: 28/Jan/20 14:06
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #10655: [BEAM-8618] Tear 
down unused DoFns periodically in Python SDK harness.
URL: https://github.com/apache/beam/pull/10655#discussion_r371818162
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##
 @@ -315,18 +319,47 @@ def release(self, instruction_id):
 """
 descriptor_id, processor = 
self.active_bundle_processors.pop(instruction_id)
 processor.reset()
+self.last_access_time[descriptor_id] = time.time()
 self.cached_bundle_processors[descriptor_id].append(processor)
 
   def shutdown(self):
 """
 Shutdown all ``BundleProcessor``s in the cache.
 """
+if self.periodic_shutdown:
+  self.periodic_shutdown.cancel()
+  self.periodic_shutdown.join()
+  self.periodic_shutdown = None
+
 for instruction_id in self.active_bundle_processors:
   self.active_bundle_processors[instruction_id][1].shutdown()
   del self.active_bundle_processors[instruction_id]
 for cached_bundle_processors in self.cached_bundle_processors.values():
-  while len(cached_bundle_processors) > 0:
-cached_bundle_processors.pop().shutdown()
+  BundleProcessorCache._shutdown_cached_bundle_processors(
+  cached_bundle_processors)
+
+  def _schedule_periodic_shutdown(self):
+def shutdown_inactive_bundle_processors():
+  for descriptor_id, last_access_time in self.last_access_time.items():
+if time.time() - last_access_time > 60:
+  BundleProcessorCache._shutdown_cached_bundle_processors(
+  self.cached_bundle_processors[descriptor_id])
+
+from apache_beam.runners.worker.data_plane import PeriodicThread
+self.periodic_shutdown = PeriodicThread(
+60, shutdown_inactive_bundle_processors)
 
 Review comment:
   Same here. Should be configurable or at least be extracted to a variable.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378256)
Time Spent: 1h  (was: 50m)

> Tear down unused DoFns periodically in Python SDK harness
> -
>
> Key: BEAM-8618
> URL: https://issues.apache.org/jira/browse/BEAM-8618
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Per the discussion in the ML, detail can be found [1],  the teardown of DoFns 
> should be supported in the portability framework. It happens at two places:
> 1) Upon the control service termination
> 2) Tear down the unused DoFns periodically
> The aim of this JIRA is to add support for tear down the unused DoFns 
> periodically in Python SDK harness.
> [1] 
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7427:
---
Fix Version/s: 2.20.0

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7427:
---
Affects Version/s: 2.19.0
   2.13.0
   2.14.0
   2.15.0
   2.16.0
   2.17.0
   2.18.0

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-7427:
--

Assignee: Jean-Baptiste Onofré  (was: Mourad)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378258&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378258
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:15
Start Date: 28/Jan/20 14:15
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on issue #10644: [BEAM-7427] 
Refactore JmsCheckpointMark to be usage via Coder
URL: https://github.com/apache/beam/pull/10644#issuecomment-579265247
 
 
   @iemejia thanks ! I will switch to other IOs improvements ;)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378258)
Time Spent: 8h 40m  (was: 8.5h)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378269
 ]

ASF GitHub Bot logged work on BEAM-9162:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:26
Start Date: 28/Jan/20 14:26
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #10643: [BEAM-9162] Upgrade 
Jackson to version 2.10.2
URL: https://github.com/apache/beam/pull/10643#issuecomment-579270595
 
 
   I want that job too! The challenge is that because of the many existing 
linkage errors, I'd have to compare
   
   - linkage errors in a PR, and
   - linkage errors in origin/master
   
   Like a code coverage report. As I don't know how to do that, I'm still doing 
it `diff` command in my local environment.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378269)
Time Spent: 1h 50m  (was: 1h 40m)

> Upgrade Jackson to version 2.10.2
> -
>
> Key: BEAM-9162
> URL: https://issues.apache.org/jira/browse/BEAM-9162
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Jackson has a new way to deal with [deserialization security 
> issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in 
> 2.10.x so worth the upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378278&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378278
 ]

ASF GitHub Bot logged work on BEAM-9162:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:38
Start Date: 28/Jan/20 14:38
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade 
Jackson to version 2.10.2
URL: https://github.com/apache/beam/pull/10643#issuecomment-579276082
 
 
   Well the manual comparison is not ideal but we can cope with that for the 
moment, what I don't want is to type the command for the 31 modules of this PR 
and then have to change it for other dependency upgrade. I just want some sort 
of `./gradlew :checkJavaLinkage` that works for the whole set of modules of the 
project. Is this 'feasible' with gradlew + Beam?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378278)
Time Spent: 2h  (was: 1h 50m)

> Upgrade Jackson to version 2.10.2
> -
>
> Key: BEAM-9162
> URL: https://issues.apache.org/jira/browse/BEAM-9162
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Jackson has a new way to deal with [deserialization security 
> issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in 
> 2.10.x so worth the upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378280&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378280
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:48
Start Date: 28/Jan/20 14:48
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on issue #10644: [BEAM-7427] 
Refactore JmsCheckpointMark to be usage via Coder
URL: https://github.com/apache/beam/pull/10644#issuecomment-579280837
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378280)
Time Spent: 8h 50m  (was: 8h 40m)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378282&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378282
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:50
Start Date: 28/Jan/20 14:50
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10644: [BEAM-7427] Refactor 
JmsCheckpointMark to use SerializableCoder
URL: https://github.com/apache/beam/pull/10644#issuecomment-579281715
 
 
   Merged manually, thanks again JB!
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378282)
Time Spent: 9h 10m  (was: 9h)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378281
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:50
Start Date: 28/Jan/20 14:50
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10644: [BEAM-7427] 
Refactor JmsCheckpointMark to use SerializableCoder
URL: https://github.com/apache/beam/pull/10644
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378281)
Time Spent: 9h  (was: 8h 50m)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-7427.

Resolution: Fixed

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7427) JmsCheckpointMark Avro Serialization issue with UnboundedSource

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?focusedWorklogId=378284&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378284
 ]

ASF GitHub Bot logged work on BEAM-7427:


Author: ASF GitHub Bot
Created on: 28/Jan/20 14:52
Start Date: 28/Jan/20 14:52
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #8757: [BEAM-7427] Fix 
JmsCheckpointMark Avro Encoding
URL: https://github.com/apache/beam/pull/8757#issuecomment-579282755
 
 
   For info #10644 was merged today. The fix will be part of Beam 2.20.0 since 
the vote for 2.19.0 has already started.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378284)
Time Spent: 9h 20m  (was: 9h 10m)

> JmsCheckpointMark Avro Serialization issue with UnboundedSource
> ---
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-7691) Improve checkpoints documentation

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7691:
---
Summary: Improve checkpoints documentation  (was: 
UnboundedSource.CheckpointMark should mention that implementations should be 
Serializable or have have an associated Coder)

> Improve checkpoints documentation
> -
>
> Key: BEAM-7691
> URL: https://issues.apache.org/jira/browse/BEAM-7691
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Priority: Minor
>  Labels: documentation, javadoc, newbie, starter
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-7691) Improve checkpoints documentation

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7691:
---
Component/s: website

> Improve checkpoints documentation
> -
>
> Key: BEAM-7691
> URL: https://issues.apache.org/jira/browse/BEAM-7691
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, website
>Reporter: Ismaël Mejía
>Priority: Minor
>  Labels: documentation, javadoc, newbie, starter
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-7691) Improve checkpoints documentation

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7691:
---
Description: UnboundedSource.CheckpointMark should mention that 
implementations it should be encodable (have an associated Coder). Also maybe 
it is a good idea to explain a bit more the checkpointing semantics on Beam.

> Improve checkpoints documentation
> -
>
> Key: BEAM-7691
> URL: https://issues.apache.org/jira/browse/BEAM-7691
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, website
>Reporter: Ismaël Mejía
>Priority: Minor
>  Labels: documentation, javadoc, newbie, starter
>
> UnboundedSource.CheckpointMark should mention that implementations it should 
> be encodable (have an associated Coder). Also maybe it is a good idea to 
> explain a bit more the checkpointing semantics on Beam.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378292&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378292
 ]

ASF GitHub Bot logged work on BEAM-9132:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:19
Start Date: 28/Jan/20 15:19
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #10694: [BEAM-9132] Use a unique 
bundle id across all SDK workers bound to the same environment
URL: https://github.com/apache/beam/pull/10694#issuecomment-579296135
 
 
   This does not fix the issue because the id generation scheme is effectively 
the same. I'm still seeing the same errors, but I have a new trace.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378292)
Time Spent: 1h 20m  (was: 1h 10m)

> State request handler is removed prematurely when closing ActiveBundle
> --
>
> Key: BEAM-9132
> URL: https://issues.apache.org/jira/browse/BEAM-9132
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We have observed these errors in a state-intense application: 
> {noformat}
> Error processing instruction 107. Original traceback is
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 780, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 587, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 659, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
>   File "apache_beam/runners/common.py", line 880, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "apache_beam/runners/common.py", line 895, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "redacted.py", line 56, in process
> recent_events_map = load_recent_events_map(recent_events_state)
>   File "redacted.py", line 128, in _load_recent_events_map
> items_in_recent_events_bag = list(recent_events_state.read())
>   File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__
> for elem in self.first:
>   File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__
> self._state_key, self._coder_impl, is_cached=self._is_cached)
>   File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get
> self._materialize_iter(state_key, coder))
>   File "apache_beam/runners/worker/sdk_worker.py", line 723, in 
> _materialize_iter
> self._underlying.get_raw(state_key, continuation_token)
>   File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw
> continuation_token=continuation_token)))
>   File "apache_beam/runners/worker/sdk_worker.py", line 637, in 
> _blocking_request
> raise RuntimeError(response.error)
> RuntimeError: Unknown process bundle instruction id '107'
> {noformat}
> Notice that the error is thrown on the Runner side. It seems to originate 
> from the {{ActiveBundle}} de-registering the state request handler too early 
> when the processing may still be going on in the SDK Harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-7427) JmsCheckpointMark can not be correctly encoded

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7427:
---
Summary: JmsCheckpointMark can not be correctly encoded  (was: 
JmsCheckpointMark Avro Serialization issue with UnboundedSource)

> JmsCheckpointMark can not be correctly encoded
> --
>
> Key: BEAM-7427
> URL: https://issues.apache.org/jira/browse/BEAM-7427
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.12.0, 2.13.0, 2.14.0, 2.15.0, 2.16.0, 2.17.0, 2.18.0, 
> 2.19.0
> Environment: Message Broker : solace
> JMS Client (Over AMQP) : "org.apache.qpid:qpid-jms-client:0.42.0
>Reporter: Mourad
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> I get the following exception when reading from unbounded JMS Source:
>   
> {code:java}
> Caused by: org.apache.avro.SchemaParseException: Illegal character in: this$0
> at org.apache.avro.Schema.validateName(Schema.java:1151)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.(Schema.java:403)
> at org.apache.avro.Schema$Field.(Schema.java:396)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:622)
> at org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:740)
> at org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:604)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:218)
> at org.apache.avro.specific.SpecificData$2.load(SpecificData.java:215)
> at 
> avro.shaded.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> at 
> avro.shaded.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> {code}
>  
> The exception is thrown by Avro when introspecting {{JmsCheckpointMark}} to 
> generate schema.
> JmsIO config :
>  
> {code:java}
> PCollection messages = pipeline.apply("read messages from the 
> events broker", JmsIO.readMessage() 
> .withConnectionFactory(jmsConnectionFactory) .withTopic(options.getTopic()) 
> .withMessageMapper(new DFAMessageMapper()) 
> .withCoder(AvroCoder.of(DFAMessage.class)));
> {code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=378293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378293
 ]

ASF GitHub Bot logged work on BEAM-9175:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:20
Start Date: 28/Jan/20 15:20
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on pull request #10684: [BEAM-9175] 
Introduce an autoformatting tool to Python SDK
URL: https://github.com/apache/beam/pull/10684#discussion_r371867218
 
 

 ##
 File path: sdks/python/apache_beam/typehints/trivial_inference.py
 ##
 @@ -68,24 +68,23 @@ def instance_to_type(o):
 return typehints.Tuple[[instance_to_type(item) for item in o]]
   elif t == list:
 if len(o) > 0:
-  return typehints.List[
-  typehints.Union[[instance_to_type(item) for item in o]]
-  ]
+  return typehints.List[typehints.Union[[
+  instance_to_type(item) for item in o
+  ]]]
 else:
   return typehints.List[typehints.Any]
   elif t == set:
 if len(o) > 0:
-  return typehints.Set[
-  typehints.Union[[instance_to_type(item) for item in o]]
-  ]
+  return typehints.Set[typehints.Union[[
+  instance_to_type(item) for item in o
+  ]]]
 else:
   return typehints.Set[typehints.Any]
   elif t == dict:
 if len(o) > 0:
   return typehints.Dict[
   typehints.Union[[instance_to_type(k) for k, v in o.items()]],
-  typehints.Union[[instance_to_type(v) for k, v in o.items()]],
-  ]
+  typehints.Union[[instance_to_type(v) for k, v in o.items()]], ]
 
 Review comment:
   It's OK now.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378293)
Time Spent: 2h 50m  (was: 2h 40m)

> Introduce an autoformatting tool to Python SDK
> --
>
> Key: BEAM-9175
> URL: https://issues.apache.org/jira/browse/BEAM-9175
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Michał Walenia
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> It seems there are three main options:
>  * black - very simple, but not configurable at all (except for line length), 
> would drastically change code style
>  * yapf - more options to tweak, can omit parts of code
>  * autopep8 - more similar to spotless - only touches code that breaks 
> formatting guidelines, can use pycodestyle and flake8 as configuration
>  The rigidity of Black makes it unusable for Beam.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=378294&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378294
 ]

ASF GitHub Bot logged work on BEAM-9175:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:20
Start Date: 28/Jan/20 15:20
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on pull request #10684: [BEAM-9175] 
Introduce an autoformatting tool to Python SDK
URL: https://github.com/apache/beam/pull/10684#discussion_r371867339
 
 

 ##
 File path: sdks/python/apache_beam/typehints/trivial_inference.py
 ##
 @@ -303,10 +302,8 @@ def infer_return_type(c, input_types, debug=False, 
depth=5):
 elif inspect.isclass(c):
   if c in typehints.DISALLOWED_PRIMITIVE_TYPES:
 return {
-list: typehints.List[Any],
-set: typehints.Set[Any],
-tuple: typehints.Tuple[Any, ...],
-dict: typehints.Dict[Any, Any]
+list: typehints.List[Any], set: typehints.Set[Any], tuple:
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378294)
Time Spent: 3h  (was: 2h 50m)

> Introduce an autoformatting tool to Python SDK
> --
>
> Key: BEAM-9175
> URL: https://issues.apache.org/jira/browse/BEAM-9175
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Michał Walenia
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> It seems there are three main options:
>  * black - very simple, but not configurable at all (except for line length), 
> would drastically change code style
>  * yapf - more options to tweak, can omit parts of code
>  * autopep8 - more similar to spotless - only touches code that breaks 
> formatting guidelines, can use pycodestyle and flake8 as configuration
>  The rigidity of Black makes it unusable for Beam.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9188) Improving speed of splitting for Custom Sources

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9188?focusedWorklogId=378322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378322
 ]

ASF GitHub Bot logged work on BEAM-9188:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:24
Start Date: 28/Jan/20 15:24
Worklog Time Spent: 10m 
  Work Description: stankiewicz commented on pull request #10701: 
[BEAM-9188] CassandraIO split performance improvement - cache size of the table
URL: https://github.com/apache/beam/pull/10701
 
 
   
   Splitting CassandraIO source into multiple sources works fast as it uses one 
connection pool to Cassandra cluster but after that 
dataflow.worker.WorkerCustomSources is calling 
CassandraSource.getEstimatedSizeBytes for each source which setups and tears 
down connection to Cassandra cluster to calculate same size of table. This 
optimization introduces caching of size internally just to avoid additional 
queries. 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStrea

[jira] [Resolved] (BEAM-4409) NoSuchMethodException reading from JmsIO

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-4409.

Fix Version/s: 2.20.0
   Resolution: Fixed

> NoSuchMethodException reading from JmsIO
> 
>
> Key: BEAM-4409
> URL: https://issues.apache.org/jira/browse/BEAM-4409
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Affects Versions: 2.4.0
> Environment: Linux, Java 1.8, Beam 2.4, Direct Runner, ActiveMQ
>Reporter: Edward Pricer
>Priority: Major
> Fix For: 2.20.0
>
>
> Running with the DirectRunner, and reading from a queue with JmsIO as an 
> unbounded source will produce a NoSuchMethodException. This occurs as the 
> UnboundedReadEvaluatorFactory.UnboundedReadEvaluator attempts to clone the 
> JmsCheckpointMark with the default (Avro) coder.
> The following trivial code on the reader side reproduces the error 
> (DirectRunner must be in path). The messages on the queue for this test case 
> were simple TextMessages. I found this exception is triggered more readily 
> when messages are published rapidly (~200/second)
> {code:java}
> Pipeline p = 
> Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
> // read from the queue
> ConnectionFactory factory = new
> ActiveMQConnectionFactory("tcp://localhost:61616");
> PCollection inputStrings = p.apply("Read from queue",
> JmsIO.readMessage() .withConnectionFactory(factory)
> .withQueue("somequeue") .withCoder(StringUtf8Coder.of())
> .withMessageMapper((JmsIO.MessageMapper) message ->
> ((TextMessage) message).getText()));
> // decode 
> PCollection asStrings = inputStrings.apply("Decode Message", 
> ParDo.of(new DoFn() { @ProcessElement public
> void processElement(ProcessContext context) {
> System.out.println(context.element());
> context.output(context.element()); } })); 
> p.run();
> {code}
> Stack trace:
> {code:java}
> Exception in thread "main" java.lang.RuntimeException: 
> java.lang.NoSuchMethodException: javax.jms.Message.() at 
> org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:353) at 
> org.apache.avro.specific.SpecificData.newRecord(SpecificData.java:369) at 
> org.apache.avro.reflect.ReflectData.newRecord(ReflectData.java:901) at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:212)
>  at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>  at 
> org.apache.avro.reflect.ReflectDatumReader.readCollection(ReflectDatumReader.java:219)
>  at 
> org.apache.avro.reflect.ReflectDatumReader.readArray(ReflectDatumReader.java:137)
>  at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:177)
>  at 
> org.apache.avro.reflect.ReflectDatumReader.readField(ReflectDatumReader.java:302)
>  at 
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
>  at 
> org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>  at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153) 
> at 
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145) 
> at org.apache.beam.sdk.coders.AvroCoder.decode(AvroCoder.java:318) at 
> org.apache.beam.sdk.coders.Coder.decode(Coder.java:170) at 
> org.apache.beam.sdk.util.CoderUtils.decodeFromSafeStream(CoderUtils.java:122) 
> at 
> org.apache.beam.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:105) 
> at 
> org.apache.beam.sdk.util.CoderUtils.decodeFromByteArray(CoderUtils.java:99) 
> at org.apache.beam.sdk.util.CoderUtils.clone(CoderUtils.java:148) at 
> org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.getReader(UnboundedReadEvaluatorFactory.java:194)
>  at 
> org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:124)
>  at 
> org.apache.beam.runners.direct.DirectTransformExecutor.processElements(DirectTransformExecutor.java:161)
>  at 
> org.apache.beam.runners.direct.DirectTransformExecutor.run(DirectTransformExecutor.java:125)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748) Caused by: 
> java.lang.NoSuchMethodException: javax.jms.Message.() at 
> java.lang.Class.getConstructor0(Class.java:3082) at 
> java.lang.Class.getDeclaredConstructor(Class.java:2178) at 
> org.apache.avro.specific.SpecificData.newInstance(SpecificData.java:347)
> {code}
>  
> And a more contrived exampl

[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=378323&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378323
 ]

ASF GitHub Bot logged work on BEAM-9175:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:30
Start Date: 28/Jan/20 15:30
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #10684: [BEAM-9175] 
Introduce an autoformatting tool to Python SDK
URL: https://github.com/apache/beam/pull/10684#issuecomment-579301469
 
 
   @robertwb I managed to add all knobs you asked for. I think the code looks 
better now (much less weird lines, that's for sure).
   
   I also added a pre-commit job that runs yapf with --diff option.

 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378323)
Time Spent: 3h 10m  (was: 3h)

> Introduce an autoformatting tool to Python SDK
> --
>
> Key: BEAM-9175
> URL: https://issues.apache.org/jira/browse/BEAM-9175
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Michał Walenia
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> It seems there are three main options:
>  * black - very simple, but not configurable at all (except for line length), 
> would drastically change code style
>  * yapf - more options to tweak, can omit parts of code
>  * autopep8 - more similar to spotless - only touches code that breaks 
> formatting guidelines, can use pycodestyle and flake8 as configuration
>  The rigidity of Black makes it unusable for Beam.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=378326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378326
 ]

ASF GitHub Bot logged work on BEAM-9175:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:33
Start Date: 28/Jan/20 15:33
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #10684: [BEAM-9175] 
Introduce an autoformatting tool to Python SDK
URL: https://github.com/apache/beam/pull/10684#issuecomment-579303250
 
 
   There is still a number of pylint issues `Wrong continued indentation`. Most 
of them appear because of how lambda is formatted. For example:
   
https://github.com/apache/beam/blob/7db61fbf2dd6eac4ffb542e684260edf0d892fea/sdks/python/apache_beam/io/gcp/bigquery_test.py#L908-L910
   
   I don't know yet how to deal with it. Unless there is a knob for this, I'll 
just put a # yapf: disable comments in these places.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378326)
Time Spent: 3h 20m  (was: 3h 10m)

> Introduce an autoformatting tool to Python SDK
> --
>
> Key: BEAM-9175
> URL: https://issues.apache.org/jira/browse/BEAM-9175
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Michał Walenia
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> It seems there are three main options:
>  * black - very simple, but not configurable at all (except for line length), 
> would drastically change code style
>  * yapf - more options to tweak, can omit parts of code
>  * autopep8 - more similar to spotless - only touches code that breaks 
> formatting guidelines, can use pycodestyle and flake8 as configuration
>  The rigidity of Black makes it unusable for Beam.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378332
 ]

ASF GitHub Bot logged work on BEAM-9162:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:47
Start Date: 28/Jan/20 15:47
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #10643: [BEAM-9162] Upgrade 
Jackson to version 2.10.2
URL: https://github.com/apache/beam/pull/10643#issuecomment-579313211
 
 
   Let me think about that this week. 
https://issues.apache.org/jira/browse/BEAM-9206
   
   For this PR, I would only check the modules that use jackson: `find . -name 
'build.gradle' | xargs grep library.java.jackson_`.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378332)
Time Spent: 2h 10m  (was: 2h)

> Upgrade Jackson to version 2.10.2
> -
>
> Key: BEAM-9162
> URL: https://issues.apache.org/jira/browse/BEAM-9162
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Jackson has a new way to deal with [deserialization security 
> issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in 
> 2.10.x so worth the upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9206) Easy way to run checkJavaLinkage?

2020-01-28 Thread Tomo Suzuki (Jira)
Tomo Suzuki created BEAM-9206:
-

 Summary: Easy way to run checkJavaLinkage?
 Key: BEAM-9206
 URL: https://issues.apache.org/jira/browse/BEAM-9206
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Tomo Suzuki
Assignee: Tomo Suzuki


Follow up of iemejia's comment: 
https://github.com/apache/beam/pull/10643#issuecomment-579276082

bq.  I just want some sort of ./gradlew :checkJavaLinkage that works for the 
whole set of modules of the project. Is this 'feasible' with gradlew + Beam?

h1. Considerations

* Something that can run on Jenkins
* Comparison with the result of origin/master
* Simple way to run checkJavaLinkage for all modules




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9008) Add readAll() method to CassandraIO

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9008?focusedWorklogId=378338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378338
 ]

ASF GitHub Bot logged work on BEAM-9008:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:55
Start Date: 28/Jan/20 15:55
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10546: [BEAM-9008] Add 
CassandraIO readAll method
URL: https://github.com/apache/beam/pull/10546#issuecomment-579317145
 
 
   > @iemejia I think that could work, thanks for your patience as I try to 
understand what you're thinking. Some questions:
   
   No, thanks to you who has been the patient one during this discussion.
   
   > 1. If our `ReadFn extends DoFn, A>` and the only way we 
have connection information is from the `Read` passed in to the 
processElement, that means we need to re-establish a DB connection for each  
batch of queries we run?  As in, the connection would be established in the 
`processElement` method and could not be in `setup` method?
   
   Yes exactly this will make the method simpler and the cost of starting a 
connection gets amortized by the processElement producing multiple outputs from 
a single connection.
   
   > 2. How would that work for the end user of a 
`PTransform, PCollection>`?  Here is what I did in the 
test and would wand to document how end users could generate 'queries',
   >
https://github.com/apache/beam/pull/10546/files#diff-8ba4ea3b09d563a67e29ff8584269d35R499
   >Would we instead want to return a `PCollection>` by using 
something like `return CassandraIO.read().withRingRange(new RingRange(start, 
finish))`?  If we do that however, we'd need to do the `withHosts` and all the 
other connection information, no?  The other option is establishing one 
`ReadAll` PTransform that maps over the `Read` input and enriches the db 
connection information?
   
   You have a point here!. We need `class ReadAll extends 
PTransform, PCollection>` and there we read as intended 
with `ReadFn`. You would have to modify however the `expand` of `Read` to do 
`input.apply(Create.of(this)).apply(CassandraIO.readAll())` where `ReadAll` 
should expand into 
`input.apply(ParDo.of(splitFn)).apply(Reshuffle).apply(Read)` users should deal 
with building the PCollection of `Reads` before passing that collection to 
`ReadAll`.
   
   > 3. Originally I had wanted to have the ReadFn operate on a 
_collection_ of 'query' objects to ensure a way to enforce linearizability with 
our queries (mainly so we don't oversaturate a single node/shard).  Currently 
the groupBy function a user passes in operates on the `RingRange` object, would 
we keep it that way and just, under the hood, allow for a single `Read` to 
hold a collection of RingRanges?
   
   If I understand this correctly this is covered by following the Create -> 
Split -> Reshuffle -> Read pattern mentioned above (in the mentioned IOs). So 
Split is the one who will generate a collection of `Read`s for each given 
`RingRange` then we use Reshuffle to guarantee that reads are redistributed and 
finally each read request is read by one worker.
   
   Hope this helps, don't hesitate to ask me more questions if still. I will 
try to answer quickly this time.
   Hope this helps
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378338)
Time Spent: 4h  (was: 3h 50m)

> Add readAll() method to CassandraIO
> ---
>
> Key: BEAM-9008
> URL: https://issues.apache.org/jira/browse/BEAM-9008
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-cassandra
>Affects Versions: 2.16.0
>Reporter: vincent marquez
>Assignee: vincent marquez
>Priority: Minor
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> When querying a large cassandra database, it's often *much* more useful to 
> programatically generate the queries needed to to be run rather than reading 
> all partitions and attempting some filtering.  
> As an example:
> {code:java}
> public class Event { 
>@PartitionKey(0) public UUID accountId;
>@PartitionKey(1)public String yearMonthDay; 
>@ClusteringKey public UUID eventId;  
>//other data...
> }{code}
> If there is ten years worth of data, you may want to only query one year's 
> worth.  Here each token range would represent one 'token' but all events for 
> the day. 
> {code:java}
> Set accounts = getRelevantAccounts();
> Set dateRange = generateDateRang

[jira] [Work logged] (BEAM-9008) Add readAll() method to CassandraIO

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9008?focusedWorklogId=378339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378339
 ]

ASF GitHub Bot logged work on BEAM-9008:


Author: ASF GitHub Bot
Created on: 28/Jan/20 15:58
Start Date: 28/Jan/20 15:58
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10546: [BEAM-9008] Add 
CassandraIO readAll method
URL: https://github.com/apache/beam/pull/10546#issuecomment-579318866
 
 
   Forgot to mention that in the above comment that in the Split function you 
have to split in every case save if the user provided a specific RingRange to 
read from.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378339)
Time Spent: 4h 10m  (was: 4h)

> Add readAll() method to CassandraIO
> ---
>
> Key: BEAM-9008
> URL: https://issues.apache.org/jira/browse/BEAM-9008
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-cassandra
>Affects Versions: 2.16.0
>Reporter: vincent marquez
>Assignee: vincent marquez
>Priority: Minor
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> When querying a large cassandra database, it's often *much* more useful to 
> programatically generate the queries needed to to be run rather than reading 
> all partitions and attempting some filtering.  
> As an example:
> {code:java}
> public class Event { 
>@PartitionKey(0) public UUID accountId;
>@PartitionKey(1)public String yearMonthDay; 
>@ClusteringKey public UUID eventId;  
>//other data...
> }{code}
> If there is ten years worth of data, you may want to only query one year's 
> worth.  Here each token range would represent one 'token' but all events for 
> the day. 
> {code:java}
> Set accounts = getRelevantAccounts();
> Set dateRange = generateDateRange("2018-01-01", "2019-01-01");
> PCollection tokens = generateTokens(accounts, dateRange); 
> {code}
>  
>  I propose an additional _readAll()_ PTransform that can take a PCollection 
> of token ranges and can return a PCollection of what the query would 
> return. 
> *Question: How much code should be in common between both methods?* 
> Currently the read connector already groups all partitions into a List of 
> Token Ranges, so it would be simple to refactor the current read() based 
> method to a 'ParDo' based one and have them both share the same function.  
> Reasons against sharing code between read and readAll
>  * Not having the read based method return a BoundedSource connector would 
> mean losing the ability to know the size of the data returned
>  * Currently the CassandraReader executes all the grouped TokenRange queries 
> *asynchronously* which is (maybe?) fine when all that's happening is 
> splitting up all the partition ranges but terrible for executing potentially 
> millions of queries. 
>  Reasons _for_ sharing code would be simplified code base and that both of 
> the above issues would most likely have a negligable performance impact. 
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9206) Easy way to run checkJavaLinkage?

2020-01-28 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9206:
---
Status: Open  (was: Triage Needed)

> Easy way to run checkJavaLinkage?
> -
>
> Key: BEAM-9206
> URL: https://issues.apache.org/jira/browse/BEAM-9206
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
>
> Follow up of iemejia's comment: 
> https://github.com/apache/beam/pull/10643#issuecomment-579276082
> bq.  I just want some sort of ./gradlew :checkJavaLinkage that works for the 
> whole set of modules of the project. Is this 'feasible' with gradlew + Beam?
> h1. Considerations
> * Something that can run on Jenkins
> * Comparison with the result of origin/master
> * Simple way to run checkJavaLinkage for all modules



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378340
 ]

ASF GitHub Bot logged work on BEAM-9162:


Author: ASF GitHub Bot
Created on: 28/Jan/20 16:00
Start Date: 28/Jan/20 16:00
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10643: [BEAM-9162] Upgrade 
Jackson to version 2.10.2
URL: https://github.com/apache/beam/pull/10643#issuecomment-579320038
 
 
   Hehe so the 31 that I mentioned above, mmm not an easy to sell proposition. 
On the other hand I can help with the jenkins part if you get to do an 
incantation that works locally for all modules.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378340)
Time Spent: 2h 20m  (was: 2h 10m)

> Upgrade Jackson to version 2.10.2
> -
>
> Key: BEAM-9162
> URL: https://issues.apache.org/jira/browse/BEAM-9162
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Jackson has a new way to deal with [deserialization security 
> issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in 
> 2.10.x so worth the upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8298) Implement state caching for side inputs

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8298?focusedWorklogId=378345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378345
 ]

ASF GitHub Bot logged work on BEAM-8298:


Author: ASF GitHub Bot
Created on: 28/Jan/20 16:10
Start Date: 28/Jan/20 16:10
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10679: [BEAM-8298] Fully 
specify the necessary details to support side input caching.
URL: https://github.com/apache/beam/pull/10679#issuecomment-579326907
 
 
   Run Java PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378345)
Time Spent: 1h 10m  (was: 1h)

> Implement state caching for side inputs
> ---
>
> Key: BEAM-8298
> URL: https://issues.apache.org/jira/browse/BEAM-8298
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-py-harness
>Reporter: Maximilian Michels
>Assignee: Jing Chen
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Caching is currently only implemented for user state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8542) Add async write to AWS SNS IO & remove retry logic

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8542?focusedWorklogId=378362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378362
 ]

ASF GitHub Bot logged work on BEAM-8542:


Author: ASF GitHub Bot
Created on: 28/Jan/20 17:28
Start Date: 28/Jan/20 17:28
Worklog Time Spent: 10m 
  Work Description: Akshay-Iyangar commented on issue #10078: [BEAM-8542] 
Change write to async in AWS SNS IO & remove retry logic
URL: https://github.com/apache/beam/pull/10078#issuecomment-579364116
 
 
   HI @aromanenko-dev - I'll be fixing the remaining stuff in this PR and will 
let you know as soon at is ready.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378362)
Time Spent: 4h 40m  (was: 4.5h)

> Add async write to AWS SNS IO & remove retry logic
> --
>
> Key: BEAM-8542
> URL: https://issues.apache.org/jira/browse/BEAM-8542
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-aws
>Reporter: Ajo Thomas
>Assignee: Ajo Thomas
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> - While working with SNS IO for one of my work-related projects, I found that 
> the IO uses synchronous publishes during writes. I had a simple mock pipeline 
> where I was reading from a kinesis stream and publishing it to SNS using 
> Beam's SNS IO. For comparison, I also had a lamdba which did the same using 
> asynchronous publishes but was about 5x faster. Changing the SNS IO to use 
> async publishes would improve publish latencies.
>  - SNS IO also has some retry logic which isn't required as SNS clients can 
> handle retries. The retry logic in the SNS client is user-configurable and 
> therefore, an explicit retry logic in SNS IO is not required
> I have a working version of the IO with these changes, will create a PR 
> linking this ticket to it once I get some feedback here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378366
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 28/Jan/20 17:41
Start Date: 28/Jan/20 17:41
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #10346: [BEAM-7926] 
Data-centric Interactive Part2
URL: https://github.com/apache/beam/pull/10346#issuecomment-579369953
 
 
   Run PythonLint PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378366)
Time Spent: 43h 10m  (was: 43h)

> Show PCollection with Interactive Beam in a data-centric user flow
> --
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 43h 10m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
>  
> {code:java}
> p = beam.Pipeline(InteractiveRunner())
> pcoll = p | 'Transform' >> transform()
> pcoll2 = ...
> pcoll3 = ...{code}
> The use can call a single function and get auto-magical charting of the data.
> e.g.,
> {code:java}
> show(pcoll, pcoll2)
> {code}
> Throughout the process, a pipeline fragment is built to include only 
> transforms necessary to produce the desired pcolls (pcoll and pcoll2) and 
> execute that fragment.
> This makes the Interactive Beam user flow data-centric.
>  
> Detailed 
> [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378367&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378367
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 28/Jan/20 17:42
Start Date: 28/Jan/20 17:42
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #10346: [BEAM-7926] 
Data-centric Interactive Part2
URL: https://github.com/apache/beam/pull/10346#issuecomment-579369985
 
 
   Run PythonLint PreCommit
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378367)
Time Spent: 43h 20m  (was: 43h 10m)

> Show PCollection with Interactive Beam in a data-centric user flow
> --
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 43h 20m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
>  
> {code:java}
> p = beam.Pipeline(InteractiveRunner())
> pcoll = p | 'Transform' >> transform()
> pcoll2 = ...
> pcoll3 = ...{code}
> The use can call a single function and get auto-magical charting of the data.
> e.g.,
> {code:java}
> show(pcoll, pcoll2)
> {code}
> Throughout the process, a pipeline fragment is built to include only 
> transforms necessary to produce the desired pcolls (pcoll and pcoll2) and 
> execute that fragment.
> This makes the Interactive Beam user flow data-centric.
>  
> Detailed 
> [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8298) Implement state caching for side inputs

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8298?focusedWorklogId=378369&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378369
 ]

ASF GitHub Bot logged work on BEAM-8298:


Author: ASF GitHub Bot
Created on: 28/Jan/20 17:48
Start Date: 28/Jan/20 17:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10679: [BEAM-8298] 
Fully specify the necessary details to support side input caching.
URL: https://github.com/apache/beam/pull/10679
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378369)
Time Spent: 1h 20m  (was: 1h 10m)

> Implement state caching for side inputs
> ---
>
> Key: BEAM-8298
> URL: https://issues.apache.org/jira/browse/BEAM-8298
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-py-harness
>Reporter: Maximilian Michels
>Assignee: Jing Chen
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Caching is currently only implemented for user state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378378
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 28/Jan/20 17:54
Start Date: 28/Jan/20 17:54
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10702: [BEAM-5605] 
Migrate splittable DoFn methods to use "new" DoFn style argument providing.
URL: https://github.com/apache/beam/pull/10702
 
 
   Defined a new @Restriction parameter type to pass in the restriction.
   Updated GetSize/GetInitialRestriction/SplitRestriction/NewTracker to take 
these new DoFn style parameters.
   Updated lots of documentation and existing implementations to use the new 
DoFn argument passing.
   Fixed ByteBuddyDoFnInvokerFactory to support return values that aren't 
references.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://build

[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378379
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 28/Jan/20 17:56
Start Date: 28/Jan/20 17:56
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate 
splittable DoFn methods to use "new" DoFn style argument providing.
URL: https://github.com/apache/beam/pull/10702#issuecomment-579376066
 
 
   R: @TheNeuralBit @boyuanzz 
   CC: @iemejia @robertwb 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378379)
Time Spent: 8h 40m  (was: 8.5h)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9207) Create a script to define all variables used by release scripts

2020-01-28 Thread Hannah Jiang (Jira)
Hannah Jiang created BEAM-9207:
--

 Summary: Create a script to define all variables used by release 
scripts
 Key: BEAM-9207
 URL: https://issues.apache.org/jira/browse/BEAM-9207
 Project: Beam
  Issue Type: Task
  Components: dependencies
Reporter: Hannah Jiang


Now we are defining variables with each script and this cause the definitions 
are duplicated at each script. We should have a place which defines all these 
variables and shared by all scripts for release.
* put it to dependencies component, because there is no release component.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378380
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 28/Jan/20 17:57
Start Date: 28/Jan/20 17:57
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10576: [BEAM-5605] Convert 
all BoundedSources to SplittableDoFns when using beam_fn_api experiment.
URL: https://github.com/apache/beam/pull/10576#issuecomment-579376564
 
 
   I'm going to wait on #10702 and update this PR after that goes in since it 
is necessary to support the new DoFn style parameter passing.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378380)
Time Spent: 8h 50m  (was: 8h 40m)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9063) Migrate docker images to apache namespace.

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9063?focusedWorklogId=378381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378381
 ]

ASF GitHub Bot logged work on BEAM-9063:


Author: ASF GitHub Bot
Created on: 28/Jan/20 17:58
Start Date: 28/Jan/20 17:58
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #10612: [DO NOT 
MERGE][BEAM-9063] migrate docker images to apache
URL: https://github.com/apache/beam/pull/10612#discussion_r371964232
 
 

 ##
 File path: release/src/main/scripts/build_release_candidate.sh
 ##
 @@ -45,6 +45,9 @@ PYTHON_ARTIFACTS_DIR=python
 BEAM_ROOT_DIR=beam
 WEBSITE_ROOT_DIR=beam-site
 
+DOCKER_IMAGE_DEFAULT_REPO_ROOT=apache
+DOCKER_IMAGE_DEFAULT_REPO_PREFIX=beam_
 
 Review comment:
   Yeap, that's a great idea and I think it's better to have a script which 
holds all variables and shared by all release scripts, so I haven't used a 
script this time. I created a ticket to fix it. 
[BEAM-9207](https://issues.apache.org/jira/browse/BEAM-9207)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378381)
Time Spent: 5h 50m  (was: 5h 40m)

> Migrate docker images to apache namespace.
> --
>
> Key: BEAM-9063
> URL: https://issues.apache.org/jira/browse/BEAM-9063
> Project: Beam
>  Issue Type: Task
>  Components: beam-community
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> https://hub.docker.com/u/apache



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7746) Add type hints to python code

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=378383&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378383
 ]

ASF GitHub Bot logged work on BEAM-7746:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:08
Start Date: 28/Jan/20 18:08
Worklog Time Spent: 10m 
  Work Description: chadrik commented on issue #10592: [BEAM-7746] 
Introduce a protocol to handle various types of partitioning buffers
URL: https://github.com/apache/beam/pull/10592#issuecomment-579381109
 
 
   All tests have passed.  This is ready to go. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378383)
Time Spent: 58.5h  (was: 58h 20m)

> Add type hints to python code
> -
>
> Key: BEAM-7746
> URL: https://issues.apache.org/jira/browse/BEAM-7746
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 58.5h
>  Remaining Estimate: 0h
>
> As a developer of the beam source code, I would like the code to use pep484 
> type hints so that I can clearly see what types are required, get completion 
> in my IDE, and enforce code correctness via a static analyzer like mypy.
> This may be considered a precursor to BEAM-7060
> Work has been started here:  [https://github.com/apache/beam/pull/9056]
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-6957) Spark Portable Runner: Support metrics

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6957?focusedWorklogId=378384&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378384
 ]

ASF GitHub Bot logged work on BEAM-6957:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:12
Start Date: 28/Jan/20 18:12
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10693: [BEAM-6957] 
Enable Counter/Distribution metrics tests for Portable Spark Runner
URL: https://github.com/apache/beam/pull/10693
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378384)
Time Spent: 2h 20m  (was: 2h 10m)

> Spark Portable Runner: Support metrics
> --
>
> Key: BEAM-6957
> URL: https://issues.apache.org/jira/browse/BEAM-6957
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
> Fix For: 2.13.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9203) Programmatically determine if SQL exception is user error, unsupported, or bug

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9203?focusedWorklogId=378386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378386
 ]

ASF GitHub Bot logged work on BEAM-9203:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:20
Start Date: 28/Jan/20 18:20
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #10699: [BEAM-9203] 
Clarify exceptions in SQL modules
URL: https://github.com/apache/beam/pull/10699#issuecomment-579385899
 
 
   tag @amaliujia so it appears on my pull request list.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378386)
Time Spent: 20m  (was: 10m)

> Programmatically determine if SQL exception is user error, unsupported, or bug
> --
>
> Key: BEAM-9203
> URL: https://issues.apache.org/jira/browse/BEAM-9203
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, dsl-sql-zetasql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now there are a lot exceptions thrown by the Calcite SQL dialect and 
> ZetaSQL dialect of Beam SQL. It is hard to catch just the errors that are 
> user errors, or just the errors that are unsupported operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378387
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:20
Start Date: 28/Jan/20 18:20
Worklog Time Spent: 10m 
  Work Description: rehmanmuradali commented on issue #10627: [BEAM-2535] 
Support outputTimestamp and watermark holds in processing timers.
URL: https://github.com/apache/beam/pull/10627#issuecomment-579386009
 
 
   @reuvenlax really needs your input with the failed test cases as the 
functionality is completed. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378387)
Time Spent: 14h  (was: 13h 50m)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 14h
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2535) Allow explicit output time independent of firing specification for all timers

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2535?focusedWorklogId=378388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378388
 ]

ASF GitHub Bot logged work on BEAM-2535:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:20
Start Date: 28/Jan/20 18:20
Worklog Time Spent: 10m 
  Work Description: rehmanmuradali commented on issue #10627: [BEAM-2535] 
Support outputTimestamp and watermark holds in processing timers.
URL: https://github.com/apache/beam/pull/10627#issuecomment-579386009
 
 
   @reuvenlax really need your input with the failed test cases as the 
functionality is completed. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378388)
Time Spent: 14h 10m  (was: 14h)

> Allow explicit output time independent of firing specification for all timers
> -
>
> Key: BEAM-2535
> URL: https://issues.apache.org/jira/browse/BEAM-2535
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 14h 10m
>  Remaining Estimate: 0h
>
> Today, we have insufficient control over the event time timestamp of elements 
> output from a timer callback.
> 1. For an event time timer, it is the timestamp of the timer itself.
>  2. For a processing time timer, it is the current input watermark at the 
> time of processing.
> But for both of these, we may want to reserve the right to output a 
> particular time, aka set a "watermark hold".
> A naive implementation of a {{TimerWithWatermarkHold}} would work for making 
> sure output is not droppable, but does not fully explain window expiration 
> and late data/timer dropping.
> In the natural interpretation of a timer as a feedback loop on a transform, 
> timers should be viewed as another channel of input, with a watermark, and 
> items on that channel _all need event time timestamps even if they are 
> delivered according to a different time domain_.
> I propose that the specification for when a timer should fire should be 
> separated (with nice defaults) from the specification of the event time of 
> resulting outputs. These timestamps will determine a side channel with a new 
> "timer watermark" that constrains the output watermark.
>  - We still need to fire event time timers according to the input watermark, 
> so that event time timers fire.
>  - Late data dropping and window expiration will be in terms of the minimum 
> of the input watermark and the timer watermark. In this way, whenever a timer 
> is set, the window is not going to be garbage collected.
>  - We will need to make sure we have a way to "wake up" a window once it is 
> expired; this may be as simple as exhausting the timer channel as soon as the 
> input watermark indicates expiration of a window
> This is mostly aimed at end-user timers in a stateful+timely {{DoFn}}. It 
> seems reasonable to use timers as an implementation detail (e.g. in 
> runners-core utilities) without wanting any of this additional machinery. For 
> example, if there is no possibility of output from the timer callback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9162) Upgrade Jackson to version 2.10.2

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9162?focusedWorklogId=378390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378390
 ]

ASF GitHub Bot logged work on BEAM-9162:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:26
Start Date: 28/Jan/20 18:26
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10643: [BEAM-9162] Upgrade 
Jackson to version 2.10.2
URL: https://github.com/apache/beam/pull/10643#issuecomment-579388508
 
 
   If the linkage checker had a way to ignore *pre-existing* linkage failures, 
we could turn it into a test by enumerating all known failures statically. The 
linkage checker would complain if there was a *new* failure that wasn't 
pre-existing or if the pre-existing failure wasn't being reported anymore 
(allowing us to maintain the list over time).
   
   The vendored gRPC 1.26.0 reduced the number of warnings in 
beam-sdks-java-core down to 4. Also, running the linkage checker per module 
would be useful and I can help with the Gradle bit if there was some good way 
to have linkage checker main return non zero status code on linkage errors and 
also if it supported enumerating pre-existing somehow.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378390)
Time Spent: 2.5h  (was: 2h 20m)

> Upgrade Jackson to version 2.10.2
> -
>
> Key: BEAM-9162
> URL: https://issues.apache.org/jira/browse/BEAM-9162
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Jackson has a new way to deal with [deserialization security 
> issues|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.10] in 
> 2.10.x so worth the upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=378391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378391
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:28
Start Date: 28/Jan/20 18:28
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate 
splittable DoFn methods to use "new" DoFn style argument providing.
URL: https://github.com/apache/beam/pull/10702#issuecomment-579389168
 
 
   Run Java_Examples_Dataflow PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378391)
Time Spent: 9h  (was: 8h 50m)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378400
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:51
Start Date: 28/Jan/20 18:51
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on issue #10346: [BEAM-7926] 
Data-centric Interactive Part2
URL: https://github.com/apache/beam/pull/10346#issuecomment-579398909
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378400)
Time Spent: 43h 40m  (was: 43.5h)

> Show PCollection with Interactive Beam in a data-centric user flow
> --
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 43h 40m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
>  
> {code:java}
> p = beam.Pipeline(InteractiveRunner())
> pcoll = p | 'Transform' >> transform()
> pcoll2 = ...
> pcoll3 = ...{code}
> The use can call a single function and get auto-magical charting of the data.
> e.g.,
> {code:java}
> show(pcoll, pcoll2)
> {code}
> Throughout the process, a pipeline fragment is built to include only 
> transforms necessary to produce the desired pcolls (pcoll and pcoll2) and 
> execute that fragment.
> This makes the Interactive Beam user flow data-centric.
>  
> Detailed 
> [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378399
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 28/Jan/20 18:51
Start Date: 28/Jan/20 18:51
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on issue #10346: [BEAM-7926] 
Data-centric Interactive Part2
URL: https://github.com/apache/beam/pull/10346#issuecomment-579398856
 
 
   Run PythonLint PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378399)
Time Spent: 43.5h  (was: 43h 20m)

> Show PCollection with Interactive Beam in a data-centric user flow
> --
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 43.5h
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
>  
> {code:java}
> p = beam.Pipeline(InteractiveRunner())
> pcoll = p | 'Transform' >> transform()
> pcoll2 = ...
> pcoll3 = ...{code}
> The use can call a single function and get auto-magical charting of the data.
> e.g.,
> {code:java}
> show(pcoll, pcoll2)
> {code}
> Throughout the process, a pipeline fragment is built to include only 
> transforms necessary to produce the desired pcolls (pcoll and pcoll2) and 
> execute that fragment.
> This makes the Interactive Beam user flow data-centric.
>  
> Detailed 
> [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9203) Programmatically determine if SQL exception is user error, unsupported, or bug

2020-01-28 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-9203:
---
Status: Open  (was: Triage Needed)

> Programmatically determine if SQL exception is user error, unsupported, or bug
> --
>
> Key: BEAM-9203
> URL: https://issues.apache.org/jira/browse/BEAM-9203
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, dsl-sql-zetasql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now there are a lot exceptions thrown by the Calcite SQL dialect and 
> ZetaSQL dialect of Beam SQL. It is hard to catch just the errors that are 
> user errors, or just the errors that are unsupported operations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7746) Add type hints to python code

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=378404&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378404
 ]

ASF GitHub Bot logged work on BEAM-7746:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:01
Start Date: 28/Jan/20 19:01
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #10592: [BEAM-7746] 
Introduce a protocol to handle various types of partitioning buffers
URL: https://github.com/apache/beam/pull/10592
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378404)
Time Spent: 58h 40m  (was: 58.5h)

> Add type hints to python code
> -
>
> Key: BEAM-7746
> URL: https://issues.apache.org/jira/browse/BEAM-7746
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 58h 40m
>  Remaining Estimate: 0h
>
> As a developer of the beam source code, I would like the code to use pep484 
> type hints so that I can clearly see what types are required, get completion 
> in my IDE, and enforce code correctness via a static analyzer like mypy.
> This may be considered a precursor to BEAM-7060
> Work has been started here:  [https://github.com/apache/beam/pull/9056]
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378410&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378410
 ]

ASF GitHub Bot logged work on BEAM-9132:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:10
Start Date: 28/Jan/20 19:10
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #10694: [BEAM-9132] 
Avoid logging misleading error messages during pipeline failure
URL: https://github.com/apache/beam/pull/10694#discussion_r372000633
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java
 ##
 @@ -464,6 +474,8 @@ ServerInfo getServerInfo() {
 }
 
 public void close() {
 
 Review comment:
   Isn't close only called from unref? If so, how does this change the 
behavior? (Possibly some more explanation needs to be added.)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378410)
Time Spent: 1.5h  (was: 1h 20m)

> State request handler is removed prematurely when closing ActiveBundle
> --
>
> Key: BEAM-9132
> URL: https://issues.apache.org/jira/browse/BEAM-9132
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We have observed these errors in a state-intense application: 
> {noformat}
> Error processing instruction 107. Original traceback is
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 780, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 587, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 659, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
>   File "apache_beam/runners/common.py", line 880, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "apache_beam/runners/common.py", line 895, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "redacted.py", line 56, in process
> recent_events_map = load_recent_events_map(recent_events_state)
>   File "redacted.py", line 128, in _load_recent_events_map
> items_in_recent_events_bag = list(recent_events_state.read())
>   File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__
> for elem in self.first:
>   File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__
> self._state_key, self._coder_impl, is_cached=self._is_cached)
>   File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get
> self._materialize_iter(state_key, coder))
>   File "apache_beam/runners/worker/sdk_worker.py", line 723, in 
> _materialize_iter
> self._underlying.get_raw(state_key, continuation_token)
>   File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw
> continuation_token=continuation_token)))
>   File "apache_beam/runners/worker/sdk_worker.py", line 637, in 
> _blocking_request
> raise RuntimeError(response.error)
> RuntimeError: Unknown process bundle instruction id '107'
> {noformat}
> Notice that the error is thrown on the Runner side. It seems to originate 
> from the {{ActiveBundle}} de-registering the state request handler too early 
> when the processing may still be going on in the SDK Harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378411&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378411
 ]

ASF GitHub Bot logged work on BEAM-9132:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:12
Start Date: 28/Jan/20 19:12
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #10694: [BEAM-9132] Avoid 
logging misleading error messages during pipeline failure
URL: https://github.com/apache/beam/pull/10694#discussion_r372001232
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java
 ##
 @@ -464,6 +474,8 @@ ServerInfo getServerInfo() {
 }
 
 public void close() {
 
 Review comment:
   It's now also called from here: 
https://github.com/apache/beam/pull/10694/files#diff-e80c769f0011537cc2b60d3e7898cf5aR260
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378411)
Time Spent: 1h 40m  (was: 1.5h)

> State request handler is removed prematurely when closing ActiveBundle
> --
>
> Key: BEAM-9132
> URL: https://issues.apache.org/jira/browse/BEAM-9132
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We have observed these errors in a state-intense application: 
> {noformat}
> Error processing instruction 107. Original traceback is
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 780, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 587, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 659, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
>   File "apache_beam/runners/common.py", line 880, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "apache_beam/runners/common.py", line 895, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "redacted.py", line 56, in process
> recent_events_map = load_recent_events_map(recent_events_state)
>   File "redacted.py", line 128, in _load_recent_events_map
> items_in_recent_events_bag = list(recent_events_state.read())
>   File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__
> for elem in self.first:
>   File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__
> self._state_key, self._coder_impl, is_cached=self._is_cached)
>   File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get
> self._materialize_iter(state_key, coder))
>   File "apache_beam/runners/worker/sdk_worker.py", line 723, in 
> _materialize_iter
> self._underlying.get_raw(state_key, continuation_token)
>   File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw
> continuation_token=continuation_token)))
>   File "apache_beam/runners/worker/sdk_worker.py", line 637, in 
> _blocking_request
> raise RuntimeError(response.error)
> RuntimeError: Unknown process bundle instruction id '107'
> {noformat}
> Notice that the error is thrown on the Runner side. It seems to originate 
> from the {{ActiveBundle}} de-registering the state request handler too early 
> when the processing may still be going on in the SDK Harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378414&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378414
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:13
Start Date: 28/Jan/20 19:13
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on issue #10346: [BEAM-7926] 
Data-centric Interactive Part2
URL: https://github.com/apache/beam/pull/10346#issuecomment-579408205
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378414)
Time Spent: 44h  (was: 43h 50m)

> Show PCollection with Interactive Beam in a data-centric user flow
> --
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 44h
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
>  
> {code:java}
> p = beam.Pipeline(InteractiveRunner())
> pcoll = p | 'Transform' >> transform()
> pcoll2 = ...
> pcoll3 = ...{code}
> The use can call a single function and get auto-magical charting of the data.
> e.g.,
> {code:java}
> show(pcoll, pcoll2)
> {code}
> Throughout the process, a pipeline fragment is built to include only 
> transforms necessary to produce the desired pcolls (pcoll and pcoll2) and 
> execute that fragment.
> This makes the Interactive Beam user flow data-centric.
>  
> Detailed 
> [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378413&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378413
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:13
Start Date: 28/Jan/20 19:13
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on issue #10346: [BEAM-7926] 
Data-centric Interactive Part2
URL: https://github.com/apache/beam/pull/10346#issuecomment-579408104
 
 
   Retest this please.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378413)
Time Spent: 43h 50m  (was: 43h 40m)

> Show PCollection with Interactive Beam in a data-centric user flow
> --
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 43h 50m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
>  
> {code:java}
> p = beam.Pipeline(InteractiveRunner())
> pcoll = p | 'Transform' >> transform()
> pcoll2 = ...
> pcoll3 = ...{code}
> The use can call a single function and get auto-magical charting of the data.
> e.g.,
> {code:java}
> show(pcoll, pcoll2)
> {code}
> Throughout the process, a pipeline fragment is built to include only 
> transforms necessary to produce the desired pcolls (pcoll and pcoll2) and 
> execute that fragment.
> This makes the Interactive Beam user flow data-centric.
>  
> Detailed 
> [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378412&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378412
 ]

ASF GitHub Bot logged work on BEAM-9132:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:13
Start Date: 28/Jan/20 19:13
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #10694: [BEAM-9132] 
Avoid logging misleading error messages during pipeline failure
URL: https://github.com/apache/beam/pull/10694#discussion_r372001730
 
 

 ##
 File path: 
runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactoryTest.java
 ##
 @@ -360,6 +360,19 @@ public void closesEnvironmentOnCleanup() throws Exception 
{
 verify(remoteEnvironment).close();
   }
 
+  @Test
+  public void closesEnvironmentOnCleanupWithPendingRefs() throws Exception {
+try (DefaultJobBundleFactory bundleFactory =
+createDefaultJobBundleFactory(envFactoryProviderMap)) {
+  DefaultJobBundleFactory.SimpleStageBundleFactory stageBundleFactory =
+  (DefaultJobBundleFactory.SimpleStageBundleFactory)
+  bundleFactory.forStage(getExecutableStage(environment));
+  // The client is still being used, e.g. when the pipeline fails and is 
shut down
+  stageBundleFactory.currentClient.wrappedClient.ref();
 
 Review comment:
   What would cause this extra ref in the actual operator lifecycle?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378412)
Time Spent: 1h 50m  (was: 1h 40m)

> State request handler is removed prematurely when closing ActiveBundle
> --
>
> Key: BEAM-9132
> URL: https://issues.apache.org/jira/browse/BEAM-9132
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We have observed these errors in a state-intense application: 
> {noformat}
> Error processing instruction 107. Original traceback is
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 780, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 587, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 659, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
>   File "apache_beam/runners/common.py", line 880, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "apache_beam/runners/common.py", line 895, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "redacted.py", line 56, in process
> recent_events_map = load_recent_events_map(recent_events_state)
>   File "redacted.py", line 128, in _load_recent_events_map
> items_in_recent_events_bag = list(recent_events_state.read())
>   File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__
> for elem in self.first:
>   File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__
> self._state_key, self._coder_impl, is_cached=self._is_cached)
>   File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get
> self._materialize_iter(state_key, coder))
>   File "apache_beam/runners/worker/sdk_worker.py", line 723, in 
> _materialize_iter
> self._underlying.get_raw(state_key, continuation_token)
>   File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw
> continuation_token=continuation_token)))
>   File "apache_beam/runners/worker/sdk_worker.py", line 637, in 
> _blocking_request
> raise RuntimeError(response.error)
> RuntimeError: Unknown process bundle instruction id '107'
> {noformat}
> Notice that the error is thrown on the Runner side. It seems to originate 
> from the {{ActiveBundle}} de-registering the state request handler too early 
> when the processing may still be going on in the SDK Harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Show PCollection with Interactive Beam in a data-centric user flow

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=378415&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378415
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:13
Start Date: 28/Jan/20 19:13
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on issue #10346: [BEAM-7926] 
Data-centric Interactive Part2
URL: https://github.com/apache/beam/pull/10346#issuecomment-579408305
 
 
   Run PythonLint PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378415)
Time Spent: 44h 10m  (was: 44h)

> Show PCollection with Interactive Beam in a data-centric user flow
> --
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 44h 10m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
>  
> {code:java}
> p = beam.Pipeline(InteractiveRunner())
> pcoll = p | 'Transform' >> transform()
> pcoll2 = ...
> pcoll3 = ...{code}
> The use can call a single function and get auto-magical charting of the data.
> e.g.,
> {code:java}
> show(pcoll, pcoll2)
> {code}
> Throughout the process, a pipeline fragment is built to include only 
> transforms necessary to produce the desired pcolls (pcoll and pcoll2) and 
> execute that fragment.
> This makes the Interactive Beam user flow data-centric.
>  
> Detailed 
> [design|https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit#heading=h.v6k2o3roarzz].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9208) Add support for mapping columns to pubsub message attributes in flat schemas DDL

2020-01-28 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-9208:
---

 Summary: Add support for mapping columns to pubsub message 
attributes in flat schemas DDL
 Key: BEAM-9208
 URL: https://issues.apache.org/jira/browse/BEAM-9208
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Brian Hulette
Assignee: Brian Hulette


Context: 
https://lists.apache.org/thread.html/bf4c37f21bda194d7f8c40f6e7b9a776262415755cc1658412af3c76%40%3Cdev.beam.apache.org%3E

The syntax should look something like this (proposed by [~alexvanboxel]):
{{
CREATE TABLE people (
my_timestamp TIMESTAMP *OPTION(ref="pubsub:event_timestamp)*,
my_id VARCHAR *OPTION(ref="pubsub:attributes['id_name']")*,
name VARCHAR,
age INTEGER
  )
  TYPE 'pubsub'
  LOCATION 'projects/my-project/topics/my-topic'
}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9132) State request handler is removed prematurely when closing ActiveBundle

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9132?focusedWorklogId=378417&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378417
 ]

ASF GitHub Bot logged work on BEAM-9132:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:18
Start Date: 28/Jan/20 19:18
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #10694: [BEAM-9132] 
Avoid logging misleading error messages during pipeline failure
URL: https://github.com/apache/beam/pull/10694#discussion_r372004299
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java
 ##
 @@ -255,6 +255,14 @@ public void close() throws Exception {
 // Clear the cache. This closes all active environments.
 // note this may cause open calls to be cancelled by the peer
 for (LoadingCache environmentCache : 
environmentCaches) {
+  for (WrappedSdkHarnessClient client : environmentCache.asMap().values()) 
{
+try {
+  client.close();
 
 Review comment:
   Worth mentioning that this is added to close the environments irrespective 
of open bundles, since this will occur only during shutdown?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378417)
Time Spent: 2h  (was: 1h 50m)

> State request handler is removed prematurely when closing ActiveBundle
> --
>
> Key: BEAM-9132
> URL: https://issues.apache.org/jira/browse/BEAM-9132
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We have observed these errors in a state-intense application: 
> {noformat}
> Error processing instruction 107. Original traceback is
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 780, in 
> apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 587, in 
> apache_beam.runners.common.PerWindowInvoker.invoke_process
>   File "apache_beam/runners/common.py", line 659, in 
> apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
>   File "apache_beam/runners/common.py", line 880, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "apache_beam/runners/common.py", line 895, in 
> apache_beam.runners.common._OutputProcessor.process_outputs
>   File "redacted.py", line 56, in process
> recent_events_map = load_recent_events_map(recent_events_state)
>   File "redacted.py", line 128, in _load_recent_events_map
> items_in_recent_events_bag = list(recent_events_state.read())
>   File "apache_beam/runners/worker/bundle_processor.py", line 335, in __iter__
> for elem in self.first:
>   File "apache_beam/runners/worker/bundle_processor.py", line 214, in __iter__
> self._state_key, self._coder_impl, is_cached=self._is_cached)
>   File "apache_beam/runners/worker/sdk_worker.py", line 692, in blocking_get
> self._materialize_iter(state_key, coder))
>   File "apache_beam/runners/worker/sdk_worker.py", line 723, in 
> _materialize_iter
> self._underlying.get_raw(state_key, continuation_token)
>   File "apache_beam/runners/worker/sdk_worker.py", line 603, in get_raw
> continuation_token=continuation_token)))
>   File "apache_beam/runners/worker/sdk_worker.py", line 637, in 
> _blocking_request
> raise RuntimeError(response.error)
> RuntimeError: Unknown process bundle instruction id '107'
> {noformat}
> Notice that the error is thrown on the Runner side. It seems to originate 
> from the {{ActiveBundle}} de-registering the state request handler too early 
> when the processing may still be going on in the SDK Harness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-6703) Support Java 11 in Jenkins

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6703?focusedWorklogId=378418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378418
 ]

ASF GitHub Bot logged work on BEAM-6703:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:18
Start Date: 28/Jan/20 19:18
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #10689: [BEAM-6703] 
Make Dataflow ValidatesRunner test use Java 11 in test execution
URL: https://github.com/apache/beam/pull/10689#discussion_r372004381
 
 

 ##
 File path: 
.test-infra/jenkins/job_PostCommit_Java_ValidatesRunner_Dataflow_Java11.groovy
 ##
 @@ -20,26 +20,40 @@ import CommonJobProperties as commonJobProperties
 import PostcommitJobBuilder
 
 
-PostcommitJobBuilder.postCommitJob('beam_PostCommit_Java11_ValidatesRunner_Dataflow',
+PostcommitJobBuilder.postCommitJob('beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11',
   'Run Dataflow ValidatesRunner Java 11', 'Google Cloud Dataflow Runner 
ValidatesRunner Tests On Java 11', this) {
 
 Review comment:
   I suggest use trigger phrase
   "Run beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11"
   it is same in terms of being descriptive, but much easier to figure out.
   
   Or "Run Google Cloud Dataflow Runner ValidatesRunner Tests On Java 11".
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378418)
Time Spent: 16.5h  (was: 16h 20m)

> Support Java 11 in Jenkins
> --
>
> Key: BEAM-6703
> URL: https://issues.apache.org/jira/browse/BEAM-6703
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow, runner-direct
>Reporter: Michał Walenia
>Assignee: Michał Walenia
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 16.5h
>  Remaining Estimate: 0h
>
> In this issue I'll create a Jenkins job that compiles Dataflow and Direct 
> runners with tests using Java 8 and runs Validates Runner suites with Java 11 
> Runtime.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9209) Add support for mapping columns to pubsub message event_timestamp when using flat schemas DDL

2020-01-28 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-9209:
---

 Summary: Add support for mapping columns to pubsub message 
event_timestamp when using flat schemas DDL
 Key: BEAM-9209
 URL: https://issues.apache.org/jira/browse/BEAM-9209
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Brian Hulette
Assignee: Brian Hulette


Context: 
https://lists.apache.org/thread.html/bf4c37f21bda194d7f8c40f6e7b9a776262415755cc1658412af3c76%40%3Cdev.beam.apache.org%3E

The syntax should look something like this (proposed by [~alexvanboxel]):
{code}
CREATE TABLE people (
my_timestamp TIMESTAMP *OPTION(ref="pubsub:event_timestamp)*,
my_id VARCHAR *OPTION(ref="pubsub:attributes['id_name']")*,
name VARCHAR,
age INTEGER
  )
  TYPE 'pubsub'
  LOCATION 'projects/my-project/topics/my-topic'
{code}

This jira pertains specifically to the timestamp portion of this syntax.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-6703) Support Java 11 in Jenkins

2020-01-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6703?focusedWorklogId=378420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-378420
 ]

ASF GitHub Bot logged work on BEAM-6703:


Author: ASF GitHub Bot
Created on: 28/Jan/20 19:20
Start Date: 28/Jan/20 19:20
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #10689: [BEAM-6703] Make 
Dataflow ValidatesRunner test use Java 11 in test execution
URL: https://github.com/apache/beam/pull/10689#issuecomment-579411180
 
 
   R: @markflyhigh, @lukecwik 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 378420)
Time Spent: 16h 40m  (was: 16.5h)

> Support Java 11 in Jenkins
> --
>
> Key: BEAM-6703
> URL: https://issues.apache.org/jira/browse/BEAM-6703
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow, runner-direct
>Reporter: Michał Walenia
>Assignee: Michał Walenia
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 16h 40m
>  Remaining Estimate: 0h
>
> In this issue I'll create a Jenkins job that compiles Dataflow and Direct 
> runners with tests using Java 8 and runs Validates Runner suites with Java 11 
> Runtime.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


  1   2   3   >