zkytech commented on PR #4423:
URL: https://github.com/apache/zeppelin/pull/4423#issuecomment-1193082009

   This makes it easier to join tables across different databases. For example, 
to join `mongodb`, `hive`, and `mysql`, assume that:
   
   - hive has a user_info table with fields: `user_name`, `occupation_code`, 
`region_code`
   - mongo has an occupation_info table with fields: `occupation_code`, 
`occupation_name`
   - mysql has a region_info table with fields: `region_code`, `region_name`
   
   I want to get these fields: `user_name`, `occupation_name`, `region_name`.
   Without cross-datasource queries, I need to write Spark Scala code to load 
the mongo/mysql tables into Spark DataFrames, and doing that every time is 
tedious.
   
   With cross-datasource queries, I can easily get these fields from mongodb 
and mysql with a single SQL query.
   
   ```sql
   
   select
      t1.user_name, t2.occupation_name, t3.region_name
   from
      hive_db.user_info as t1
   left join
      mongodb.user_db.occupation_info as t2
   on
      t1.occupation_code = t2.occupation_code
   left join
      mysql.another_userinfo_db.region_info as t3
   on
      t1.region_code = t3.region_code
   ``` 
   This is a lightweight replacement for Presto that is easier for Zeppelin 
users to use.
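
To make the expected result of the query above concrete, here is a small self-contained sketch in Python. It is not how this PR implements cross-datasource queries: it uses SQLite's `ATTACH DATABASE` as a stand-in for three separate sources, the database aliases are two-part rather than the three-part names in the query above, and the sample rows are invented purely for illustration.

```python
import sqlite3


def cross_source_join():
    """Run the three-way left join against three attached in-memory databases."""
    conn = sqlite3.connect(":memory:")

    # Each ATTACH of ':memory:' creates a separate in-memory database.
    # The aliases are hypothetical stand-ins for the hive, mongodb and mysql sources.
    for alias in ("hive_db", "mongo_db", "mysql_db"):
        conn.execute(f"ATTACH DATABASE ':memory:' AS {alias}")

    conn.execute(
        "CREATE TABLE hive_db.user_info "
        "(user_name TEXT, occupation_code TEXT, region_code TEXT)"
    )
    conn.execute(
        "CREATE TABLE mongo_db.occupation_info "
        "(occupation_code TEXT, occupation_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE mysql_db.region_info (region_code TEXT, region_name TEXT)"
    )

    # Invented sample rows; 'bob' has a region_code with no matching region.
    conn.executemany(
        "INSERT INTO hive_db.user_info VALUES (?, ?, ?)",
        [("alice", "o1", "r1"), ("bob", "o2", "r9")],
    )
    conn.executemany(
        "INSERT INTO mongo_db.occupation_info VALUES (?, ?)",
        [("o1", "engineer"), ("o2", "teacher")],
    )
    conn.executemany(
        "INSERT INTO mysql_db.region_info VALUES (?, ?)",
        [("r1", "north")],
    )

    # Same shape as the query in the comment: one statement spanning all sources.
    rows = conn.execute(
        """
        select t1.user_name, t2.occupation_name, t3.region_name
        from hive_db.user_info as t1
        left join mongo_db.occupation_info as t2
          on t1.occupation_code = t2.occupation_code
        left join mysql_db.region_info as t3
          on t1.region_code = t3.region_code
        order by t1.user_name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in cross_source_join():
        print(row)
```

Because both joins are left joins, every user from the first source is kept, and a missing match in another source shows up as a null (`None` in Python) rather than dropping the row.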
   
   
   


