[jira] [Commented] (KYLIN-1351) Support common RDBMS as data source in Kylin

Luke Han (JIRA) Tue, 02 Feb 2016 22:53:30 -0800

    [ 
https://issues.apache.org/jira/browse/KYLIN-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129917#comment-15129917
 ]


Luke Han commented on KYLIN-1351:
---------------------------------

Copied from the mailing list, with a link to the old JIRA.

----------------
Hi Edward, 
     Thanks for raising this discussion. Reading data from RDBMSs is tricky, 
and we need a very clear design and architecture before implementing it.

     There was an earlier thread/JIRA about reading data from Oracle directly, 
but it was finally dropped, since many tools can already extract data from 
Oracle and load it into Hive.

     The concern here is that most RDBMSs are not yet optimized for direct 
reads by a distributed system, for example, hundreds of Hadoop nodes reading 
data directly from MySQL, Oracle, or others. The network is also a concern.

     From the beginning, we decided to use Hive as the protocol between 
upstream and Kylin. This has been a good model so far, since users can leverage 
any ETL tool to land source data in Hive and then build cubes on it. Even if 
Kylin supports reading data from RDBMSs, what about transform? What about 
load? It would bring the ETL parts into Kylin's scope, which is not a good 
idea, I think.
     
      But reading from RDBMSs is a valid way to extend the input sources 
beyond Hive, which is all we have today; not only RDBMSs, but also SparkSQL, 
Impala, Drill, and other SQL-on-Hadoop engines. 
      How about building a light tool for this requirement? It could be an 
extension tool for users to leverage.

      Thanks.
Luke


> Support common RDBMS as data source in Kylin
> --------------------------------------------
>
>                 Key: KYLIN-1351
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1351
>             Project: Kylin
>          Issue Type: New Feature
>            Reporter: Shaofeng SHI
>            Assignee: Edward Zhang
>              Labels: newbie
>
> From v2.0, Kylin's plug-in architecture makes it possible to have multiple 
> data sources, cube engines and storages. Some users have asked whether 
> Kylin supports source data fed from an RDBMS such as Oracle or MySQL; now it 
> is possible to do that. Tools like Apache Sqoop can easily export data from 
> an RDBMS to HDFS, which would help Kylin get the data and then build it into 
> cubes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
