[ 
https://issues.apache.org/jira/browse/FLUME-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15359170#comment-15359170
 ] 

Attila Simon commented on FLUME-2938:
-------------------------------------

I'm a big fan of the "do one thing well" design. My concern is that we should 
avoid duplicating work, so it would be good to know what additional 
functionality this source would provide. 

On the other hand, Flume is for streaming data, and the source you describe 
would need a scheduler. Is this really functionality Flume should provide? 
Or would a small tweak to Sqoop be enough, if any change is needed at all?

> JDBC Source
> -----------
>
>                 Key: FLUME-2938
>                 URL: https://issues.apache.org/jira/browse/FLUME-2938
>             Project: Flume
>          Issue Type: New Feature
>          Components: Sinks+Sources
>    Affects Versions: v1.8.0
>            Reporter: Lior Zeno
>             Fix For: v1.8.0
>
>
> The idea is to allow migrating data from SQL stores to NoSQL stores or HDFS 
> for archiving purposes.
> This source will take a statement to execute and a scheduling policy. It will 
> be able to fetch timestamped data by performing range queries on a 
> configurable field (this also works for data with an incremental id). For 
> fault tolerance, the last fetched value can be checkpointed to a file.
> Dealing with large datasets can be done via the fetch_size parameter. (Ref: 
> https://docs.oracle.com/cd/A87860_01/doc/java.817/a83724/resltse5.htm)
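
For illustration only, the polling scheme described above (a range query on a 
configurable field, a file checkpoint of the last fetched value, and a fetch 
size hint) might be sketched roughly as follows. This is not code from Flume 
or from any patch on this issue: the class name IncrementalJdbcPoller, its 
methods, and the single-line checkpoint file format are all hypothetical, and 
the tracked field is assumed to be a numeric, monotonically increasing column. 
Only the java.sql calls are standard JDBC.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from Flume. */
class IncrementalJdbcPoller {
    private final Path checkpointFile; // stores the last fetched value as text
    private final String table;        // assumed to come from trusted configuration
    private final String column;       // the configurable, increasing field
    private final int fetchSize;       // JDBC hint: rows fetched per round trip

    IncrementalJdbcPoller(Path checkpointFile, String table, String column, int fetchSize) {
        this.checkpointFile = checkpointFile;
        this.table = table;
        this.column = column;
        this.fetchSize = fetchSize;
    }

    /** Returns the checkpointed value, or defaultValue if no checkpoint file exists. */
    long loadCheckpoint(long defaultValue) {
        if (!Files.exists(checkpointFile)) {
            return defaultValue;
        }
        try {
            byte[] raw = Files.readAllBytes(checkpointFile);
            return Long.parseLong(new String(raw, StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Overwrites the checkpoint file with the given value. */
    void saveCheckpoint(long value) {
        try {
            Files.write(checkpointFile, Long.toString(value).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Range query: only rows whose tracked column is past the checkpoint. */
    String buildQuery() {
        return "SELECT * FROM " + table + " WHERE " + column + " > ? ORDER BY " + column;
    }

    /** One poll: reads rows newer than the checkpoint, then advances the checkpoint. */
    List<Long> pollOnce(Connection conn) throws SQLException {
        long last = loadCheckpoint(0L);
        List<Long> fetched = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(buildQuery())) {
            ps.setFetchSize(fetchSize); // a hint; drivers may ignore it
            ps.setLong(1, last);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    last = rs.getLong(column);
                    fetched.add(last); // a real source would build an event here
                }
            }
        }
        if (!fetched.isEmpty()) {
            saveCheckpoint(last);
        }
        return fetched;
    }
}
```

A scheduler would call pollOnce at the configured interval. In an actual Flume 
source, the checkpoint would presumably be written only after the events have 
been committed to the channel, so that a crash leads to re-reading rows rather 
than losing them.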



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
