[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-02-18 Thread Apurva Mehta (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta updated KAFKA-9430:

Priority: Major  (was: Blocker)

> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Affects Versions: 2.5.0
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
>
> Right now, we use _endOffsets_ of the source topic for the computation. For 
> "optimized" changelogs, this will be wrong, strictly speaking, but it's an 
> over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one.
> The current implementation is technically correct, within the documented 
> behavior that the result is an "estimate", but I marked it as a blocker to be 
> sure that we revisit it after ongoing work to refactor the task management in 
> Streams is complete. If it becomes straightforward to tighten up the 
> estimate, we should go ahead and do it. Otherwise, we can downgrade the 
> priority of the ticket.
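
The ordering argument in the description can be illustrated with a small sketch. This is not the Kafka Streams implementation: the class LagEstimateSketch, its methods, the replica names, and the offsets are all hypothetical, and it assumes lag is computed as a shared upper bound minus each replica's restored offset, clamped at zero. Under that assumption, swapping a tight bound for the larger source-topic end offset inflates every lag by the same amount, so the replica chosen as freshest should not change.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; none of these names come from Kafka Streams.
public class LagEstimateSketch {

    // Lag of one replica against a given upper bound, clamped at zero.
    public static long lag(long upperBound, long restoredOffset) {
        return Math.max(0L, upperBound - restoredOffset);
    }

    // Picks the replica with the smallest lag when all replicas share one upper bound.
    public static Optional<String> freshest(Map<String, Long> restoredOffsets, long upperBound) {
        return restoredOffsets.entrySet().stream()
                .min(Comparator.comparingLong(
                        (Map.Entry<String, Long> e) -> lag(upperBound, e.getValue())))
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Assumed restored offsets for two replicas of the same store partition.
        Map<String, Long> restored = new LinkedHashMap<>();
        restored.put("replica-a", 90L);
        restored.put("replica-b", 70L);

        long tightBound = 95L;       // assumed: offset the store actually needs to reach
        long sourceEndOffset = 120L; // assumed: end offset of the source topic (over-estimate)

        // The absolute lag differs between the two bounds...
        System.out.println(lag(tightBound, restored.get("replica-a")));      // 5
        System.out.println(lag(sourceEndOffset, restored.get("replica-a"))); // 30

        // ...but the relative ordering, and so the freshest replica, is the same.
        System.out.println(freshest(restored, tightBound).orElse("none"));      // replica-a
        System.out.println(freshest(restored, sourceEndOffset).orElse("none")); // replica-a
    }
}
```

The sketch only covers the comparison between replicas; it says nothing about how a tighter bound would actually be obtained, which is the open question in this ticket.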



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-01-16 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-9430:

Parent: (was: KAFKA-6144)
Issue Type: Improvement  (was: Sub-task)

> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Affects Versions: 2.5.0
> Reporter: Vinoth Chandar
> Priority: Blocker
>
> Right now, we use _endOffsets_ of the source topic for the computation. For 
> "optimized" changelogs, this will be wrong, strictly speaking, but it's an 
> over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one.
> The current implementation is technically correct, within the documented 
> behavior that the result is an "estimate", but I marked it as a blocker to be 
> sure that we revisit it after ongoing work to refactor the task management in 
> Streams is complete. If it becomes straightforward to tighten up the 
> estimate, we should go ahead and do it. Otherwise, we can downgrade the 
> priority of the ticket.





[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-01-16 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-9430:

Description: 
Right now, we use _endOffsets_ of the source topic for the computation. For 
"optimized" changelogs, this will be wrong, strictly speaking, but it's an 
over-estimate (which seems better than an under-estimate), and it's also still 
an apples-to-apples comparison, since all replicas would use the same upper 
bound to compute their lags, so the "pick the freshest" replica is still going 
to pick the right one.

The current implementation is technically correct, within the documented 
behavior that the result is an "estimate", but I marked it as a blocker to be 
sure that we revisit it after ongoing work to refactor the task management in 
Streams is complete. If it becomes straightforward to tighten up the estimate, 
we should go ahead and do it. Otherwise, we can downgrade the priority of the 
ticket.

  was:
Right now, we use _endOffsets_ of the source topic for the computation. Since 
the source topics can also contain events produced by users, this is an 
over-estimate.

 

From John:

For "optimized" changelogs, this will be wrong, strictly speaking, but it's an 
over-estimate (which seems better than an under-estimate), and it's also still 
an apples-to-apples comparison, since all replicas would use the same upper 
bound to compute their lags, so the "pick the freshest" replica is still going 
to pick the right one.


> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Affects Versions: 2.5.0
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Blocker
>
> Right now, we use _endOffsets_ of the source topic for the computation. For 
> "optimized" changelogs, this will be wrong, strictly speaking, but it's an 
> over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one.
> The current implementation is technically correct, within the documented 
> behavior that the result is an "estimate", but I marked it as a blocker to be 
> sure that we revisit it after ongoing work to refactor the task management in 
> Streams is complete. If it becomes straightforward to tighten up the 
> estimate, we should go ahead and do it. Otherwise, we can downgrade the 
> priority of the ticket.





[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-01-16 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-9430:

Priority: Blocker  (was: Major)

> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Affects Versions: 2.5.0
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Blocker
>
> Right now, we use _endOffsets_ of the source topic for the computation. Since 
> the source topics can also contain events produced by users, this is an 
> over-estimate.
>  
> From John:
> For "optimized" changelogs, this will be wrong, strictly speaking, but it's 
> an over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one.





[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-01-16 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-9430:

Affects Version/s: 2.5.0

> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Affects Versions: 2.5.0
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
>
> Right now, we use _endOffsets_ of the source topic for the computation. Since 
> the source topics can also contain events produced by users, this is an 
> over-estimate.
>  
> From John:
> For "optimized" changelogs, this will be wrong, strictly speaking, but it's 
> an over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one.





[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-01-16 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-9430:

Fix Version/s: (was: 2.5.0)

> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
>
> Right now, we use _endOffsets_ of the source topic for the computation. Since 
> the source topics can also contain events produced by users, this is an 
> over-estimate.
>  
> From John:
> For "optimized" changelogs, this will be wrong, strictly speaking, but it's 
> an over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one.





[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-01-16 Thread John Roesler (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler updated KAFKA-9430:

Description: 
Right now, we use _endOffsets_ of the source topic for the computation. Since 
the source topics can also contain events produced by users, this is an 
over-estimate.

 

From John:

For "optimized" changelogs, this will be wrong, strictly speaking, but it's an 
over-estimate (which seems better than an under-estimate), and it's also still 
an apples-to-apples comparison, since all replicas would use the same upper 
bound to compute their lags, so the "pick the freshest" replica is still going 
to pick the right one.

  was:
Right now, we use _endOffsets_ of the source topic for the computation. Since 
the source topics can also contain events produced by users, this is an 
over-estimate.

 

From John:

For "optimized" changelogs, this will be wrong, strictly speaking, but it's an 
over-estimate (which seems better than an under-estimate), and it's also still 
an apples-to-apples comparison, since all replicas would use the same upper 
bound to compute their lags, so the "pick the freshest" replica is still going 
to pick the right one. We can add a new 2.5 blocker ticket to really fix it, 
and not worry about it until after this KSQL stuff is done.

 

For active: we need to use consumed offsets and not the end of the source topic.


> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
> Fix For: 2.5.0
>
>
> Right now, we use _endOffsets_ of the source topic for the computation. Since 
> the source topics can also contain events produced by users, this is an 
> over-estimate.
>  
> From John:
> For "optimized" changelogs, this will be wrong, strictly speaking, but it's 
> an over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one.





[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-01-14 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated KAFKA-9430:
--
Component/s: streams

> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Sub-task
> Components: streams
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
> Fix For: 2.5.0
>
>
> Right now, we use _endOffsets_ of the source topic for the computation. Since 
> the source topics can also contain events produced by users, this is an 
> over-estimate.
>  
> From John:
> For "optimized" changelogs, this will be wrong, strictly speaking, but it's 
> an over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one. We can add a new 2.5 blocker ticket to 
> really fix it, and not worry about it until after this KSQL stuff is done.
>  
> For active: we need to use consumed offsets and not the end of the source topic.





[jira] [Updated] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

2020-01-14 Thread Vinoth Chandar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Chandar updated KAFKA-9430:
--
Fix Version/s: 2.5.0

> Tighten up lag estimates when source topic optimization is on 
> --
>
> Key: KAFKA-9430
> URL: https://issues.apache.org/jira/browse/KAFKA-9430
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Vinoth Chandar
> Assignee: Vinoth Chandar
> Priority: Major
> Fix For: 2.5.0
>
>
> Right now, we use _endOffsets_ of the source topic for the computation. Since 
> the source topics can also contain events produced by users, this is an 
> over-estimate.
>  
> From John:
> For "optimized" changelogs, this will be wrong, strictly speaking, but it's 
> an over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one. We can add a new 2.5 blocker ticket to 
> really fix it, and not worry about it until after this KSQL stuff is done.
>  
> For active: we need to use consumed offsets and not the end of the source topic.


