mach-kernel opened a new pull request, #1333:
URL: https://github.com/apache/datafusion-ballista/pull/1333

   ### Which issue does this PR close?
   TBD
   
   ### Rationale for this change
   A Ballista scheduler may have tables in its catalog that the client is
   unaware of. Currently, `show tables` lists only the tables registered in
   the client's local session context. This change lets the client plan
   queries against schemas that exist in the Ballista cluster but not on the
   client.
   
   ### What changes are included in this PR?
   
   ###### ballista
   - New protos to represent the hierarchy of catalog -> schema -> table and 
the custom table provider.
   - Scheduler RPC that packs the catalog names into protos
   - `SessionContextExt::remote*` calls the scheduler and loads the returned
     tables using `MemoryCatalogProvider`
   - A custom `RemoteTableProvider` that encapsulates names (catalog, schema, 
table), and a concrete Arrow schema
     - `RemoteTableProvider::scan` always returns an error: the provider exists
       only for planning and should never actually be scanned.
   - `BallistaLogicalExtensionCodec` treats `RemoteTableProvider` as a special
     case
     - Encode packs the provider into the table provider proto representation
     - Decode builds a concrete table reference and attempts to resolve it in
       the remote context
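
   The encode/decode round trip described above can be sketched with plain
   standard-library types. This is an illustration only, not the code in this
   PR: the real change uses protobuf messages and DataFusion's table provider
   machinery, while `RemoteTableRef`, `Catalogs`, `decode`, and
   `sample_catalogs` below are hypothetical names, and the column types are
   assumed for the example.

   ```rust
   use std::collections::BTreeMap;

   /// Hypothetical stand-in for an Arrow schema: (column name, type name) pairs.
   type ColumnSchema = Vec<(String, String)>;

   /// Hypothetical catalog hierarchy: catalog -> schema -> table -> columns.
   type Catalogs = BTreeMap<String, BTreeMap<String, BTreeMap<String, ColumnSchema>>>;

   /// Hypothetical placeholder table: names plus a concrete schema, no data.
   #[derive(Debug, Clone, PartialEq)]
   struct RemoteTableRef {
       catalog: String,
       schema: String,
       table: String,
       columns: ColumnSchema,
   }

   impl RemoteTableRef {
       /// The placeholder is only useful for planning, so scanning always fails.
       fn scan(&self) -> Result<Vec<Vec<String>>, String> {
           Err(format!(
               "remote table {}.{}.{} cannot be scanned on the client",
               self.catalog, self.schema, self.table
           ))
       }

       /// "Encode": keep only the fully qualified name (a dotted string here,
       /// standing in for the proto representation).
       fn encode(&self) -> String {
           format!("{}.{}.{}", self.catalog, self.schema, self.table)
       }
   }

   /// "Decode": rebuild the table reference and resolve it in the remote catalog.
   fn decode(encoded: &str, catalogs: &Catalogs) -> Result<RemoteTableRef, String> {
       let parts: Vec<&str> = encoded.split('.').collect();
       let (catalog, schema, table) = match parts.as_slice() {
           [c, s, t] => (*c, *s, *t),
           _ => return Err(format!("expected catalog.schema.table, got {}", encoded)),
       };
       let columns = catalogs
           .get(catalog)
           .and_then(|schemas| schemas.get(schema))
           .and_then(|tables| tables.get(table))
           .ok_or_else(|| format!("table {} not found in remote catalog", encoded))?;
       Ok(RemoteTableRef {
           catalog: catalog.to_string(),
           schema: schema.to_string(),
           table: table.to_string(),
           columns: columns.clone(),
       })
   }

   /// Sample data; the column types are assumptions made for this sketch.
   fn sample_catalogs() -> Catalogs {
       let columns = vec![
           ("id".to_string(), "Int64".to_string()),
           ("title".to_string(), "Utf8".to_string()),
       ];
       let mut tables = BTreeMap::new();
       tables.insert("wiki_a_potion".to_string(), columns);
       let mut schemas = BTreeMap::new();
       schemas.insert("public".to_string(), tables);
       let mut catalogs = BTreeMap::new();
       catalogs.insert("spice".to_string(), schemas);
       catalogs
   }
   ```

   In this sketch, `decode("spice.public.wiki_a_potion", &sample_catalogs())`
   resolves to a table with two columns, calling `scan` on the result returns
   an error, and decoding a name that is missing from the catalog fails instead
   of producing a placeholder.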
   
   ###### ballista-python
   - Add `ballista-python` to the workspace so that it consumes the new client changes
   
   ### Are there any user-facing changes?
   
   `ballista-cli` can now run queries against cluster tables!
   
   ```bash
   Ballista CLI v49.0.0
   ❯ show tables;
   +---------------+--------------------+---------------+------------+
   | table_catalog | table_schema       | table_name    | table_type |
   +---------------+--------------------+---------------+------------+
   | spice         | runtime            | task_history  | BASE TABLE |
   | spice         | public             | wiki_a_potion | BASE TABLE |
   | spice         | information_schema | tables        | VIEW       |
   | spice         | information_schema | views         | VIEW       |
   | spice         | information_schema | columns       | VIEW       |
   | spice         | information_schema | df_settings   | VIEW       |
   | spice         | information_schema | schemata      | VIEW       |
   | spice         | information_schema | routines      | VIEW       |
   | spice         | information_schema | parameters    | VIEW       |
   | datafusion    | information_schema | tables        | VIEW       |
   | datafusion    | information_schema | views         | VIEW       |
   | datafusion    | information_schema | columns       | VIEW       |
   | datafusion    | information_schema | df_settings   | VIEW       |
   | datafusion    | information_schema | schemata      | VIEW       |
   | datafusion    | information_schema | routines      | VIEW       |
   | datafusion    | information_schema | parameters    | VIEW       |
   +---------------+--------------------+---------------+------------+
   18446744073709551615 row(s) fetched.
   Elapsed 0.016 seconds.
   
   ❯ select id, title from spice.public.wiki_a_potion limit 2;
   +----------+----------+
   | id       | title    |
   +----------+----------+
   | 4549720  | Al Salvi |
   | 36963017 | Al Samha |
   +----------+----------+
   18446744073709551615 row(s) fetched.
   Elapsed 0.777 seconds.
   ```
   
   And with the Python client, in notebooks too!!
   
   <img width="1084" height="834" alt="image" src="https://github.com/user-attachments/assets/939e863e-a9e8-4f41-96c4-3386be9efc85" />
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

