zkytech commented on PR #4423:
URL: https://github.com/apache/zeppelin/pull/4423#issuecomment-1193082009
>
The goal is to make it easier to join tables across different databases. For
example, consider a join across `mongodb`, `hive`, and `mysql`, assuming that:
- hive has a `user_info` table with fields: `user_name`, `occupation_code`, `region_code`
- mongo has an `occupation_info` table with fields: `occupation_code`, `occupation_name`
- mysql has a `region_info` table with fields: `region_code`, `region_name`
I want a result with these fields: `user_name`, `occupation_name`,
`region_name`.
Without cross-datasource queries, I need to write Spark Scala code to load the
mongo and mysql tables into Spark DataFrames, and repeating that for every
query is tedious.
With cross-datasource queries, I can pull these fields from mongodb and mysql
with a single SQL query:
```sql
select
    t1.user_name, t2.occupation_name, t3.region_name
from
    hive_db.user_info as t1
left join
    mongodb.user_db.occupation_info as t2
on
    t1.occupation_code = t2.occupation_code
left join
    mysql.another_userinfo_db.region_info as t3
on
    t1.region_code = t3.region_code
```
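For comparison, the manual approach mentioned above might look roughly like the following. This is only a sketch under assumptions that are not part of this PR: it assumes Spark with Hive support, the MySQL JDBC driver, and the MongoDB Spark Connector 10.x on the classpath. The host names, the `reader` user, and the `MYSQL_PASSWORD` environment variable are hypothetical placeholders.

```scala
import org.apache.spark.sql.SparkSession

// In a Zeppelin %spark paragraph `spark` already exists; it is built here
// only so that the sketch is self-contained.
val spark = SparkSession.builder()
  .appName("manual-cross-source-join")
  .enableHiveSupport()
  .getOrCreate()

// The Hive table can be read directly through the metastore.
val userInfo = spark.table("hive_db.user_info")

// MongoDB collection (connector 10.x options; the URI is a placeholder).
val occupationInfo = spark.read
  .format("mongodb")
  .option("connection.uri", "mongodb://mongo-host:27017")
  .option("database", "user_db")
  .option("collection", "occupation_info")
  .load()

// MySQL table over JDBC (URL and credentials are placeholders).
val regionInfo = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://mysql-host:3306/another_userinfo_db")
  .option("dbtable", "region_info")
  .option("user", "reader")
  .option("password", sys.env.getOrElse("MYSQL_PASSWORD", ""))
  .load()

// Same left joins as the SQL query above, expressed on DataFrames.
val result = userInfo
  .join(occupationInfo, Seq("occupation_code"), "left")
  .join(regionInfo, Seq("region_code"), "left")
  .select("user_name", "occupation_name", "region_name")

result.show()
```

Each additional source needs its own reader configuration like this, which is the boilerplate the single SQL query avoids.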
This is a lightweight replacement for Presto that is easier for Zeppelin
users to use.