Frank Varnavas created KAFKA-1339:
-------------------------------------
Summary: Time based offset retrieval seems broken
Key: KAFKA-1339
URL: https://issues.apache.org/jira/browse/KAFKA-1339
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8.1
Environment: Linux
Reporter: Frank Varnavas
Priority: Minor
The Kafka PartitionOffsetRequest takes a time parameter. Time-based lookup seems broken to me.
There are two magic values:
-2 returns the oldest available offset
-1 returns the newest available offset
Otherwise the value is the time since the epoch in milliseconds
(System.currentTimeMillis()).
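As a minimal sketch of how the time values for a test like this can be built: the class, constant, and method names below are made up for illustration and are not part of the Kafka API. Only the values -1 and -2 and the millisecond epoch convention come from the description above.

```java
import java.util.concurrent.TimeUnit;

class OffsetRequestTimes {
    // Magic values described in this report; the constant names are illustrative.
    static final long NEWEST = -1L;
    static final long OLDEST = -2L;

    // Time parameter for a point "hoursAgo" hours before nowMillis,
    // expressed in milliseconds since the epoch.
    static long hoursBefore(long nowMillis, int hoursAgo) {
        return nowMillis - TimeUnit.HOURS.toMillis(hoursAgo);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // One candidate time value per hour, going back one day.
        for (int h = 0; h <= 23; h++) {
            System.out.println(h + " -> time=" + hoursBefore(now, h));
        }
    }
}
```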
The granularity is limited to that of the log segment files.
These are the log segments for the partition I tested.
The time now is about 17:07.
The time shown is the last-modify time.
The file name contains the segment's starting offset.
You can see that the current segment started at about 13:40.
1073742047 Mar 24 02:52 00000000000004740823.log
1073759588 Mar 24 11:25 00000000000004831581.log
1073782532 Mar 24 16:31 00000000000004916313.log
1073741985 Mar 25 09:11 00000000000005066939.log
1073743756 Mar 25 13:39 00000000000005158529.log
778424349 Mar 25 17:07 00000000000005214225.log
The list below shows the returned offset for each input time = (current time - N
hours), for N in [0..23].
Even 1 second before the current time returns the previous segment, even
though that segment ended about 3.5 hours earlier.
I think the result is off by 1 log segment, i.e. the results for 1-3 hours back
should have been 5214225, and those for 4-7 hours back should have been 5158529.
0 -> 5214225
1 -> 5158529
2 -> 5158529
3 -> 5158529
4 -> 5066939
5 -> 5066939
6 -> 5066939
7 -> 5066939
8 -> 4973490
9 -> 4973490
10 -> 4973490
11 -> 4973490
12 -> 4973490
13 -> 4973490
14 -> 4973490
15 -> 4973490
16 -> 4916313
17 -> 4916313
18 -> 4916313
19 -> 4916313
20 -> 4916313
21 -> 4916313
22 -> 4916313
23 -> 4916313
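To make the off-by-one reading concrete, here is a minimal sketch that compares two hypothetical lookup rules against the segment listing above. It is not Kafka's code. The class and method names are made up, and the first rule is only an assumption about what could produce the observed results. Times are the last-modify times from the listing, converted to minutes since Mar 24 00:00.

```java
class SegmentLookup {
    // Starting offsets from the file names in the listing, oldest first.
    static final long[] BASE_OFFSET = {
        4740823L, 4831581L, 4916313L, 5066939L, 5158529L, 5214225L
    };
    // Last-modify times from the listing, in minutes since Mar 24 00:00.
    static final int[] LAST_MODIFIED = {
        172, 685, 991, 1991, 2259, 2467
    };
    static final int NOW = 2467; // Mar 25 17:07

    // Assumed rule: newest segment whose last-modify time is <= target.
    static long newestModifiedAtOrBefore(int target) {
        for (int i = BASE_OFFSET.length - 1; i >= 0; i--) {
            if (LAST_MODIFIED[i] <= target) return BASE_OFFSET[i];
        }
        return -1L; // target is older than every segment
    }

    // Expected rule: oldest segment last modified at or after target,
    // i.e. the segment that was being written at that time.
    static long segmentCoveringTime(int target) {
        for (int i = 0; i < BASE_OFFSET.length; i++) {
            if (LAST_MODIFIED[i] >= target) return BASE_OFFSET[i];
        }
        return -1L; // target is newer than every segment
    }

    public static void main(String[] args) {
        // Hours 0..7 only: from hour 8 on, the reported results mention
        // offset 4973490, which does not appear in the listing.
        for (int h = 0; h <= 7; h++) {
            int target = NOW - 60 * h;
            System.out.println(h + " -> " + newestModifiedAtOrBefore(target)
                + " (assumed rule), " + segmentCoveringTime(target) + " (expected)");
        }
    }
}
```

Under these assumptions the first rule reproduces the reported results for hours 0 through 7, and the second gives the values I expected: 5214225 for hours 0-3 and 5158529 for hours 4-7.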