[ 
https://issues.apache.org/jira/browse/KAFKA-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Kreps resolved KAFKA-1403.
------------------------------
    Resolution: Won't Fix

> Adding timestamp to kafka index structure
> -----------------------------------------
>
>                 Key: KAFKA-1403
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1403
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 0.8.1
>            Reporter: Xinyao Hu
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Right now, Kafka does not store a timestamp per message. It assumes that
> all the messages in the same file have the same timestamp, namely the
> mtime of the file. This makes it inefficient to scan all the messages within
> a time window, which is a valid use case in a lot of real-time data analysis.
> One workaround is to roll a new file at short intervals.
> However, this results in a large number of open files (KAFKA-1404), which
> eventually crashed the servers.
> My guess is that this was not implemented for efficiency reasons. It would
> cost an additional four bytes per message, which might be pinned in memory
> for fast access. There might be some simple perf optimizations, such as
> differential encoding plus variable-length encoding, which should bring the
> cost down to 1-2 bytes per message on average.
> Let me know if this makes sense.
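
For illustration only, the "differential encoding + var length encoding"
idea from the description could be sketched roughly as below. This is not
Kafka code and not a proposed patch. The function names are hypothetical,
timestamps are assumed to be non-decreasing integers in milliseconds, and
the variable-length format is the common base-128 varint (7 payload bits
per byte, high bit as a continuation flag).

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint, low-order group first."""
    if n < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def encode_timestamps(timestamps) -> bytes:
    """Delta-encode timestamps (assumed non-decreasing), then varint each delta.

    The first timestamp is stored as a delta from 0, i.e. in full.
    A decreasing timestamp yields a negative delta and raises ValueError.
    """
    out = bytearray()
    prev = 0
    for ts in timestamps:
        out += encode_varint(ts - prev)
        prev = ts
    return bytes(out)


def decode_timestamps(data: bytes) -> list:
    """Inverse of encode_timestamps: read varint deltas and accumulate them."""
    timestamps = []
    prev = 0
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7               # more bytes belong to this delta
        else:
            prev += value            # delta complete: rebuild absolute value
            timestamps.append(prev)
            value = 0
            shift = 0
    return timestamps
```

Under these assumptions, a gap of less than 128 ms between consecutive
messages costs one byte and a gap of less than 16384 ms costs two, which is
in line with the 1-2 byte estimate above. Only the first timestamp in a
file pays the full cost.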



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
