[jira] [Commented] (KAFKA-6455) Improve timestamp propagation at DSL level

2019-05-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838180#comment-16838180
 ] 

ASF GitHub Bot commented on KAFKA-6455:
---

guozhangwang commented on pull request #6645: KAFKA-6455: Session Aggregation 
should use window-end-time as record timestamp
URL: https://github.com/apache/kafka/pull/6645
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve timestamp propagation at DSL level
> --
>
> Key: KAFKA-6455
> URL: https://issues.apache.org/jira/browse/KAFKA-6455
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 1.0.0
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Major
>  Labels: needs-kip
>
> At DSL level, we inherit the timestamp propagation "contract" from the 
> Processor API. This contract in not optimal at DSL level, and we should 
> define a DSL level contract that matches the semantics of the corresponding 
> DSL operator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7934) Optimize restore for windowed and session stores

2019-05-12 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838094#comment-16838094
 ] 

Matthias J. Sax commented on KAFKA-7934:


I don't thinks it's necessary to find the exact timestamp. 
`offsetForTimestamps` gives us a lower bound and we can just start reading from 
there.

> Optimize restore for windowed and session stores
> 
>
> Key: KAFKA-7934
> URL: https://issues.apache.org/jira/browse/KAFKA-7934
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>
> During state restore of window/session stores, the changelog topic is scanned 
> from the oldest entries to the newest entry. This happen on a 
> record-per-record basis or in record batches.
> During this process, new segments are created while time advances (base on 
> the record timestamp of the record that are restored). However, depending on 
> the retention time, we might expire segments during restore process later 
> again. This is wasteful. Because retention time is based on the largest 
> timestamp per partition, it is possible to compute a bound for live and 
> expired segment upfront (assuming that we know the largest timestamp). This 
> way, during restore, we could avoid creating segments that are expired later 
> anyway and skip over all corresponding records.
> The problem is, that we don't know the largest timestamp per partition. Maybe 
> the broker timestamp index could help to provide an approximation for this 
> value.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-6521) Store record timestamps in KTable stores

2019-05-12 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-6521.

   Resolution: Fixed
Fix Version/s: 2.3.0

Part of KIP-258: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-258%3A+Allow+to+Store+Record+Timestamps+in+RocksDB]

> Store record timestamps in KTable stores
> 
>
> Key: KAFKA-6521
> URL: https://issues.apache.org/jira/browse/KAFKA-6521
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Major
> Fix For: 2.3.0
>
>
> Currently, KTables store plain key-value pairs. However, it is desirable to 
> also store a timestamp for the record.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6521) Store record timestamps in KTable stores

2019-05-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838032#comment-16838032
 ] 

ASF GitHub Bot commented on KAFKA-6521:
---

mjsax commented on pull request #6667: KAFKA-6521: Use timestamped stores for 
KTables
URL: https://github.com/apache/kafka/pull/6667
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Store record timestamps in KTable stores
> 
>
> Key: KAFKA-6521
> URL: https://issues.apache.org/jira/browse/KAFKA-6521
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Major
>
> Currently, KTables store plain key-value pairs. However, it is desirable to 
> also store a timestamp for the record.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8315) Cannot pass Materialized into a join operation - hence cant set retention period independent of grace

2019-05-12 Thread Andrew (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838012#comment-16838012
 ] 

Andrew commented on KAFKA-8315:
---

[~ableegoldman] Right, I see what you mean, this loop only goes until the head 
is found 
[https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/RecordQueue.java#L158.]
 We do have out of order data in our real data streams, however, it looks like 
you are right that it shouldn't affect my {{join-example}} demo, which 
reproduces the issue with only ordered data. Any further ideas on why the 
{{join-example}} doesnt work?

> Cannot pass Materialized into a join operation - hence cant set retention 
> period independent of grace
> -
>
> Key: KAFKA-8315
> URL: https://issues.apache.org/jira/browse/KAFKA-8315
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Andrew
>Assignee: John Roesler
>Priority: Major
> Attachments: code.java
>
>
> The documentation says to use `Materialized` not `JoinWindows.until()` 
> ([https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-]),
>  but there is no where to pass a `Materialized` instance to the join 
> operation, only to the group operation is supported it seems.
>  
> Slack conversation here : 
> [https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300]
> [Additional]
> From what I understand, the retention period should be independent of the 
> grace period, so I think this is more than a documentation fix (see comments 
> below)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)