[jira] [Created] (BEAM-7666) Pipe memory thrashing signal to Dataflow

2019-07-01 Thread Dustin Rhodes (JIRA)
Dustin Rhodes created BEAM-7666:
---

 Summary: Pipe memory thrashing signal to Dataflow
 Key: BEAM-7666
 URL: https://issues.apache.org/jira/browse/BEAM-7666
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Dustin Rhodes


For autoscaling, we would like to know whether the user worker is spending too 
much time garbage collecting.  Pipe this signal through counters to Dataflow.
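
As a rough illustration of the kind of signal described above, the sketch below 
samples the JVM's standard GarbageCollectorMXBean counters and reports the 
fraction of wall-clock time spent in GC since the previous sample. The class 
name GcThrashingMonitor, its methods, and the threshold are hypothetical and are 
not taken from the Beam or Dataflow worker code; how the value would actually be 
exported as a counter is not shown here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not Beam code): estimates how much of the recent
// wall-clock time the JVM spent in garbage collection.
final class GcThrashingMonitor {
    private final double threshold;   // e.g. 0.5 means "more than 50% of time in GC"
    private long lastGcMillis;
    private long lastWallMillis;

    GcThrashingMonitor(double threshold) {
        this.threshold = threshold;
        this.lastGcMillis = totalGcMillis();
        this.lastWallMillis = System.nanoTime() / 1_000_000;
    }

    // Sums the accumulated collection time over all collectors in this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = bean.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Fraction of the interval spent in GC, clamped to [0, 1].
    static double gcFraction(long gcDeltaMillis, long wallDeltaMillis) {
        if (wallDeltaMillis <= 0) {
            return 0.0;
        }
        double fraction = (double) gcDeltaMillis / wallDeltaMillis;
        return Math.min(1.0, Math.max(0.0, fraction));
    }

    // Takes a new sample and reports whether GC time exceeded the threshold
    // since the previous sample.
    synchronized boolean sampleIsThrashing() {
        long gcNow = totalGcMillis();
        long wallNow = System.nanoTime() / 1_000_000;
        double fraction = gcFraction(gcNow - lastGcMillis, wallNow - lastWallMillis);
        lastGcMillis = gcNow;
        lastWallMillis = wallNow;
        return fraction > threshold;
    }

    public static void main(String[] args) {
        GcThrashingMonitor monitor = new GcThrashingMonitor(0.5);
        System.out.println("thrashing: " + monitor.sampleIsThrashing());
    }
}
```

A caller would invoke sampleIsThrashing() periodically and forward the result 
(or the raw fraction) through whatever counter mechanism the worker uses.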



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-7666) Pipe memory thrashing signal to Dataflow

2019-07-01 Thread Dustin Rhodes (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Rhodes updated BEAM-7666:

Priority: Critical  (was: Major)

> Pipe memory thrashing signal to Dataflow
> 
>
> Key: BEAM-7666
> URL: https://issues.apache.org/jira/browse/BEAM-7666
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Dustin Rhodes
>Priority: Critical
>
> For autoscaling, we would like to know whether the user worker is spending too 
> much time garbage collecting.  Pipe this signal through counters to Dataflow.





[jira] [Assigned] (BEAM-7666) Pipe memory thrashing signal to Dataflow

2019-07-01 Thread Dustin Rhodes (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Rhodes reassigned BEAM-7666:
---

Assignee: Dustin Rhodes

> Pipe memory thrashing signal to Dataflow
> 
>
> Key: BEAM-7666
> URL: https://issues.apache.org/jira/browse/BEAM-7666
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Dustin Rhodes
>Assignee: Dustin Rhodes
>Priority: Critical
>
> For autoscaling, we would like to know whether the user worker is spending too 
> much time garbage collecting.  Pipe this signal through counters to Dataflow.





[jira] [Closed] (BEAM-6540) Autoscaling should be aware of Streaming RPC Quota

2019-02-08 Thread Dustin Rhodes (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Rhodes closed BEAM-6540.
---
Resolution: Fixed

>  Autoscaling should be aware of Streaming RPC Quota
> ---
>
> Key: BEAM-6540
> URL: https://issues.apache.org/jira/browse/BEAM-6540
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Dustin Rhodes
>Assignee: Tyler Akidau
>Priority: Major
>  Labels: triaged
> Fix For: 2.11.0
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Streaming Windmill Service introduces quota for the shared windmill workers.  
> Autoscaling needs to be aware of throttling due to this quota in order to not 
> upscale.  This PR adds in that reporting.
>  
> It also introduces the flag --EnableStreamingEngine.





[jira] [Closed] (BEAM-6571) Flag for streaming engine

2019-02-08 Thread Dustin Rhodes (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Rhodes closed BEAM-6571.
---
   Resolution: Fixed
Fix Version/s: 2.11.0

> Flag for streaming engine
> -
>
> Key: BEAM-6571
> URL: https://issues.apache.org/jira/browse/BEAM-6571
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Dustin Rhodes
>Assignee: Dustin Rhodes
>Priority: Major
>  Labels: triaged
> Fix For: 2.11.0
>
>
> Adds the --enableStreamingEngine flag for Java and Python





[jira] [Comment Edited] (BEAM-6561) WorkProgressUpdater synchronizes on java.util.concurrent classes

2019-02-01 Thread Dustin Rhodes (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758558#comment-16758558
 ] 

Dustin Rhodes edited comment on BEAM-6561 at 2/1/19 6:51 PM:
-

whoops meant to put this on 6562

 


was (Author: dustin12):
I believe this is because there is synchronization on batches, a 
ConcurrentLinkedDeque, to enforce the atomicity of a sequence of method calls 
on it.  As far as I'm aware, this pattern is error-prone (or at least confusing) 
because ConcurrentLinkedDeque makes no guarantee that it uses itself as the 
lock for its internal operations, so there is no guarantee that the 
synchronized blocks run atomically.

Is the best practice here to switch to a plain, non-concurrent deque and do all 
the synchronization on it manually, so that we are sure the synchronized blocks 
are atomic?  Is there an easy way (a Gradle command) to run Findbugs and confirm 
that this is the bug it's finding (although I'm pretty sure it is)?

> WorkProgressUpdater synchronizes on java.util.concurrent classes
> 
>
> Key: BEAM-6561
> URL: https://issues.apache.org/jira/browse/BEAM-6561
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Priority: Major
>
> Findbugs caught this. There seems to be synchronization on variables where 
> the classes themselves have their own mechanisms. If intended, it should be 
> made more clear by simple mutexes.





[jira] [Commented] (BEAM-6562) GrpcWindmillServer synchronizes on java.util.concurrent classes

2019-02-01 Thread Dustin Rhodes (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758573#comment-16758573
 ] 

Dustin Rhodes commented on BEAM-6562:
-

I believe this is because there is synchronization on batches, a 
ConcurrentLinkedDeque, to enforce the atomicity of a sequence of method calls 
on it.  As far as I'm aware, this pattern is error-prone (or at least confusing) 
because ConcurrentLinkedDeque makes no guarantee that it uses itself as the 
lock for its internal operations, so there is no guarantee that the 
synchronized blocks run atomically.

Is the best practice here to switch to a plain, non-concurrent deque and do all 
the synchronization on it manually, so that we are sure the synchronized blocks 
are atomic?  Is there an easy way (a Gradle command) to run Findbugs and confirm 
that this is the bug it's finding (although I'm pretty sure it is)?
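
The alternative raised in the comment above, a plain deque guarded by one 
explicit lock, could look roughly like the sketch below. This is only an 
illustration under stated assumptions: BatchQueue, add, drain, and size are 
hypothetical names, and none of this is the actual GrpcWindmillServer code. 
ArrayDeque is used here as one possible non-concurrent deque.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical wrapper (not Beam code): every access to the deque goes
// through the same private lock, so multi-step operations are atomic with
// respect to all other operations on the queue.
final class BatchQueue<T> {
    private final Object lock = new Object();
    private final Deque<T> batches = new ArrayDeque<>(); // not thread-safe on its own

    void add(T item) {
        synchronized (lock) {
            batches.addLast(item);
        }
    }

    // Removes and returns up to max items from the head as one atomic step.
    List<T> drain(int max) {
        synchronized (lock) {
            List<T> out = new ArrayList<>();
            while (out.size() < max && !batches.isEmpty()) {
                out.add(batches.pollFirst());
            }
            return out;
        }
    }

    int size() {
        synchronized (lock) {
            return batches.size();
        }
    }

    public static void main(String[] args) {
        BatchQueue<String> queue = new BatchQueue<>();
        queue.add("a");
        queue.add("b");
        System.out.println(queue.drain(1));
    }
}
```

Because the lock object is private, no other code can acquire it, and no 
caller can reach the deque without going through a synchronized method.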

> GrpcWindmillServer synchronizes on java.util.concurrent classes
> ---
>
> Key: BEAM-6562
> URL: https://issues.apache.org/jira/browse/BEAM-6562
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Priority: Major
>
> Findbugs caught this. There seems to be synchronization on variables where 
> the classes themselves have their own mechanisms. If intended, it should be 
> made more clear by simple mutexes.





[jira] [Commented] (BEAM-6561) WorkProgressUpdater synchronizes on java.util.concurrent classes

2019-02-01 Thread Dustin Rhodes (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758558#comment-16758558
 ] 

Dustin Rhodes commented on BEAM-6561:
-

I believe this is because there is synchronization on batches, a 
ConcurrentLinkedDeque, to enforce the atomicity of a sequence of method calls 
on it.  As far as I'm aware, this pattern is error-prone (or at least confusing) 
because ConcurrentLinkedDeque makes no guarantee that it uses itself as the 
lock for its internal operations, so there is no guarantee that the 
synchronized blocks run atomically.

Is the best practice here to switch to a plain, non-concurrent deque and do all 
the synchronization on it manually, so that we are sure the synchronized blocks 
are atomic?  Is there an easy way (a Gradle command) to run Findbugs and confirm 
that this is the bug it's finding (although I'm pretty sure it is)?

> WorkProgressUpdater synchronizes on java.util.concurrent classes
> 
>
> Key: BEAM-6561
> URL: https://issues.apache.org/jira/browse/BEAM-6561
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Priority: Major
>
> Findbugs caught this. There seems to be synchronization on variables where 
> the classes themselves have their own mechanisms. If intended, it should be 
> made more clear by simple mutexes.





[jira] [Assigned] (BEAM-6571) Flag for streaming engine

2019-01-31 Thread Dustin Rhodes (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Rhodes reassigned BEAM-6571:
---

Assignee: Dustin Rhodes  (was: Tyler Akidau)

> Flag for streaming engine
> -
>
> Key: BEAM-6571
> URL: https://issues.apache.org/jira/browse/BEAM-6571
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Dustin Rhodes
>Assignee: Dustin Rhodes
>Priority: Major
>
> Adds the --enableStreamingEngine flag for Java and Python





[jira] [Created] (BEAM-6571) Flag for streaming engine

2019-01-31 Thread Dustin Rhodes (JIRA)
Dustin Rhodes created BEAM-6571:
---

 Summary: Flag for streaming engine
 Key: BEAM-6571
 URL: https://issues.apache.org/jira/browse/BEAM-6571
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Dustin Rhodes
Assignee: Tyler Akidau


Adds the --enableStreamingEngine flag for Java and Python





[jira] [Closed] (BEAM-6190) "Processing stuck" messages should be visible on Pantheon

2019-01-29 Thread Dustin Rhodes (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Rhodes closed BEAM-6190.
---

Verified it is now reported in prod.

> "Processing stuck" messages should be visible on Pantheon
> -
>
> Key: BEAM-6190
> URL: https://issues.apache.org/jira/browse/BEAM-6190
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Affects Versions: 2.8.0
> Environment: Running on Google Cloud Dataflow
>Reporter: Dustin Rhodes
>Assignee: Dustin Rhodes
>Priority: Minor
> Fix For: Not applicable
>
>   Original Estimate: 24h
>  Time Spent: 1h 40m
>  Remaining Estimate: 22h 20m
>
> When user processing results in an exception, it is clearly visible on the 
> Pantheon landing page for a streaming Dataflow job. But when user processing 
> becomes stuck, there is no indication there, even though the worker logs it. 
> Most users don't check worker logs, and the logs are not convenient to check. 
> Ideally a stuck worker would result in a visible error on the Pantheon 
> landing page.





[jira] [Resolved] (BEAM-6190) "Processing stuck" messages should be visible on Pantheon

2019-01-29 Thread Dustin Rhodes (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Rhodes resolved BEAM-6190.
-
Resolution: Fixed

> "Processing stuck" messages should be visible on Pantheon
> -
>
> Key: BEAM-6190
> URL: https://issues.apache.org/jira/browse/BEAM-6190
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Affects Versions: 2.8.0
> Environment: Running on Google Cloud Dataflow
>Reporter: Dustin Rhodes
>Assignee: Dustin Rhodes
>Priority: Minor
> Fix For: Not applicable
>
>   Original Estimate: 24h
>  Time Spent: 1h 40m
>  Remaining Estimate: 22h 20m
>
> When user processing results in an exception, it is clearly visible on the 
> Pantheon landing page for a streaming Dataflow job. But when user processing 
> becomes stuck, there is no indication there, even though the worker logs it. 
> Most users don't check worker logs, and the logs are not convenient to check. 
> Ideally a stuck worker would result in a visible error on the Pantheon 
> landing page.





[jira] [Created] (BEAM-6540) Autoscaling should be aware of Streaming RPC Quota

2019-01-29 Thread Dustin Rhodes (JIRA)
Dustin Rhodes created BEAM-6540:
---

 Summary:  Autoscaling should be aware of Streaming RPC Quota
 Key: BEAM-6540
 URL: https://issues.apache.org/jira/browse/BEAM-6540
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Affects Versions: 2.11.0
Reporter: Dustin Rhodes
Assignee: Tyler Akidau
 Fix For: 2.11.0


Streaming Windmill Service introduces quota for the shared windmill workers.  
Autoscaling needs to be aware of throttling due to this quota in order to not 
upscale.  This PR adds in that reporting.

 

It also introduces the flag --EnableStreamingEngine.





[jira] [Assigned] (BEAM-6190) "Processing stuck" messages should be visible on Pantheon

2018-12-06 Thread Dustin Rhodes (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dustin Rhodes reassigned BEAM-6190:
---

Assignee: Dustin Rhodes  (was: Tyler Akidau)

> "Processing stuck" messages should be visible on Pantheon
> -
>
> Key: BEAM-6190
> URL: https://issues.apache.org/jira/browse/BEAM-6190
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Affects Versions: 2.8.0
> Environment: Running on Google Cloud Dataflow
>Reporter: Dustin Rhodes
>Assignee: Dustin Rhodes
>Priority: Minor
> Fix For: Not applicable
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When user processing results in an exception, it is clearly visible on the 
> Pantheon landing page for a streaming Dataflow job. But when user processing 
> becomes stuck, there is no indication there, even though the worker logs it. 
> Most users don't check worker logs, and the logs are not convenient to check. 
> Ideally a stuck worker would result in a visible error on the Pantheon 
> landing page.





[jira] [Created] (BEAM-6190) "Processing stuck" messages should be visible on Pantheon

2018-12-06 Thread Dustin Rhodes (JIRA)
Dustin Rhodes created BEAM-6190:
---

 Summary: "Processing stuck" messages should be visible on Pantheon
 Key: BEAM-6190
 URL: https://issues.apache.org/jira/browse/BEAM-6190
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Affects Versions: 2.8.0
 Environment: Running on Google Cloud Dataflow
Reporter: Dustin Rhodes
Assignee: Tyler Akidau
 Fix For: Not applicable


When user processing results in an exception, it is clearly visible on the 
Pantheon landing page for a streaming Dataflow job. But when user processing 
becomes stuck, there is no indication there, even though the worker logs it. 
Most users don't check worker logs, and the logs are not convenient to check. 
Ideally a stuck worker would result in a visible error on the Pantheon 
landing page.


