Hello Igniters,

The Ignite Spark SQL interface currently takes only a “table name” as a
parameter, which it uses to populate a Spark dataset with data from the
underlying Ignite SQL table of that name.

To do this it loops through each cache and finds the first one with the
given table name [1]. This causes issues if multiple tables with the same
name are registered in different caches, since only one of those caches is
then reachable from Spark. Is the right thing to do here:

1. Simply not support such a scenario and note in the Spark documentation
that table names must be unique?
2. Pass an extra parameter through the Ignite Spark data source which
optionally specifies the cache name?
3. Support namespacing in the existing table name parameter, i.e.
“cacheName.tableName”?
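To illustrate option 3, the parsing could look something like the sketch
below. This is purely hypothetical: TableNameParser and the returned tuple
shape are illustrative names, not existing Ignite API.

```scala
// Hypothetical sketch for option 3: split an optional "cacheName." prefix
// off the table-name parameter passed to the Spark data source.
object TableNameParser {
  // Returns (optional cache name, table name). A parameter with no dot is
  // treated as a bare table name; "employeeCache.EMPLOYEE" selects the
  // EMPLOYEE table within the employeeCache cache.
  def parse(param: String): (Option[String], String) =
    param.split("\\.", 2) match {
      case Array(cache, table) => (Some(cache), table)
      case Array(table)        => (None, table)
    }
}
```

When no cache prefix is given, the existing first-match lookup could remain
as a backwards-compatible fallback.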

Thanks,
Stuart.

[1]
https://github.com/apache/ignite/blob/ca973ad99c6112160a305df05be9458e29f88307/modules/spark/src/main/scala/org/apache/ignite/spark/impl/package.scala#L119
