[jira] [Commented] (DRILL-5977) predicate pushdown support kafkaMsgOffset

Abhishek Ravi (JIRA) Sun, 01 Apr 2018 22:43:27 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421951#comment-16421951
 ]


Abhishek Ravi commented on DRILL-5977:
--------------------------------------

Thank you for the review, [~akumarb2010]. Yes, you are absolutely right. As an 
initial approach to this problem, I plan to do the following after 
obtaining the *top-level predicates* of an expression:
 # Check that a condition on {{kafkaMsgTimestamp}} / {{kafkaMsgOffset}} exists.
 # Check that no {{OR}} joins the top-level predicates.

Apply filter pushdown only when both checks succeed. Does this sound good?
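
For illustration only, the two checks above could be sketched roughly as follows. This is Python rather than Drill's Java, and {{Comparison}}, {{BoolExpr}}, {{top_level_predicates}} and {{can_push_down}} are hypothetical names, not part of Drill's API. The sketch assumes that "top-level predicates" means the conjuncts obtained by flattening nested {{AND}} nodes, and it conservatively rejects the filter if any of those conjuncts is an {{OR}}.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Pseudo-columns named in this thread as pushdown candidates.
PUSHDOWN_COLUMNS = {"kafkaMsgOffset", "kafkaMsgTimestamp"}


@dataclass(frozen=True)
class Comparison:
    """Hypothetical leaf predicate, e.g. kafkaMsgOffset > 12345."""
    column: str
    op: str
    value: int


@dataclass(frozen=True)
class BoolExpr:
    """Hypothetical boolean node; op is "AND" or "OR"."""
    op: str
    args: Tuple["Expr", ...]


Expr = Union[Comparison, BoolExpr]


def top_level_predicates(expr: Expr) -> List[Expr]:
    """Flatten nested ANDs at the root into a list of conjuncts.

    Anything that is not an AND (a comparison or an OR) is returned
    as a single top-level predicate.
    """
    if isinstance(expr, BoolExpr) and expr.op == "AND":
        conjuncts: List[Expr] = []
        for arg in expr.args:
            conjuncts.extend(top_level_predicates(arg))
        return conjuncts
    return [expr]


def can_push_down(expr: Expr) -> bool:
    """Return True only when both checks from the comment succeed."""
    predicates = top_level_predicates(expr)

    # Check 2: no OR among the top-level predicates (assumed reading).
    if any(isinstance(p, BoolExpr) and p.op == "OR" for p in predicates):
        return False

    # Check 1: at least one condition on an offset/timestamp pseudo-column.
    return any(
        isinstance(p, Comparison) and p.column in PUSHDOWN_COLUMNS
        for p in predicates
    )
```

Under these assumptions, {{kafkaMsgOffset > 12345 AND userId = 7}} qualifies for pushdown, while {{kafkaMsgOffset > 12345 OR userId = 7}} does not, because the {{OR}} branch could match messages outside the offset range.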

> predicate pushdown support kafkaMsgOffset
> -----------------------------------------
>
>                 Key: DRILL-5977
>                 URL: https://issues.apache.org/jira/browse/DRILL-5977
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: B Anil Kumar
>            Assignee: Bhallamudi Venkata Siva Kamesh
>            Priority: Major
>             Fix For: 1.14.0
>
>
> As part of Kafka storage plugin review, below is the suggestion from Paul.
> {noformat}
> Does it make sense to provide a way to select a range of messages: a starting 
> point or a count? Perhaps I want to run my query every five minutes, scanning 
> only those messages since the previous scan. Or, I want to limit my take to, 
> say, the next 1000 messages. Could we use a pseudo-column such as 
> "kafkaMsgOffset" for that purpose? Maybe
> SELECT * FROM <some topic> WHERE kafkaMsgOffset > 12345
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
