Re: [DISCUSS] Identifiers with multi-catalog support

2019-01-22 Thread Ryan Blue
heah From: Felix Cheung Date: Sunday, January 20, 2019 at 4:48 PM To: "rb...@netflix.com", Spark Dev List <dev@spark.apache.org> Subject: Re: [DISCUSS] Identifiers with multi-catalog support +1 I like Rya

Re: [DISCUSS] Identifiers with multi-catalog support

2019-01-22 Thread Matt Cheah
ed to play nice with column identifier as well. From: Ryan Blue Sent: Thursday, January 17, 2019 11:38 AM To: Spark Dev List Subject: Re: [DISCUSS] Identifiers with multi-catalog support Any discussion on how Spark should manage identifiers when multiple catalogs are supported?

Re: [DISCUSS] Identifiers with multi-catalog support

2019-01-20 Thread Felix Cheung
identifier as well. From: Ryan Blue Sent: Thursday, January 17, 2019 11:38 AM To: Spark Dev List Subject: Re: [DISCUSS] Identifiers with multi-catalog support Any discussion on how Spark should manage identifiers when multiple catalogs are supported? I know

Re: [DISCUSS] Identifiers with multi-catalog support

2019-01-17 Thread Ryan Blue
Any discussion on how Spark should manage identifiers when multiple catalogs are supported? I know this is an area where a lot of people are interested in making progress, and it is a blocker for both multi-catalog support and CTAS in DSv2. On Sun, Jan 13, 2019 at 2:22 PM Ryan Blue wrote: > I

Re: [DISCUSS] Identifiers with multi-catalog support

2019-01-13 Thread Ryan Blue
I think that the solution to this problem is to mix the two approaches by supporting 3 identifier parts: catalog, namespace, and name, where namespace can be an n-part identifier:
    type Namespace = Seq[String]
    case class CatalogIdentifier(space: Namespace, name: String)
This allows catalogs to
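For readers skimming the archive, here is a minimal Scala sketch of how the catalog / namespace / name split from the message above might be applied to a dotted name. Only `Namespace` and `CatalogIdentifier` come from the message; the `IdentifierSketch` wrapper, the `parse` helper, and the rule that a leading part counts as a catalog only when it matches a known catalog name are assumptions made for illustration, not something the thread specifies.

```scala
// Sketch only. Namespace and CatalogIdentifier mirror the message;
// everything else here is a hypothetical helper for illustration.
object IdentifierSketch {
  // An n-part namespace, e.g. Seq("db", "schema").
  type Namespace = Seq[String]

  // Identifier within a single catalog: namespace plus object name.
  case class CatalogIdentifier(space: Namespace, name: String)

  // Hypothetical: split name parts into (optional catalog, identifier).
  // Assumption: the first part is a catalog only if it is a known catalog name
  // and at least one more part follows it.
  def parse(parts: Seq[String], catalogs: Set[String]): (Option[String], CatalogIdentifier) = {
    require(parts.nonEmpty, "identifier must have at least one part")
    val (catalog, rest) =
      if (parts.size > 1 && catalogs.contains(parts.head)) (Some(parts.head), parts.tail)
      else (None, parts)
    // The last part is the name; whatever precedes it is the namespace.
    (catalog, CatalogIdentifier(rest.init, rest.last))
  }
}
```

Under these assumptions, `IdentifierSketch.parse(Seq("pg", "db", "schema", "t"), Set("pg"))` would yield catalog `pg`, namespace `db.schema`, and name `t`, while a bare `Seq("t")` would yield no catalog and an empty namespace.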

Re: [DISCUSS] Identifiers with multi-catalog support

2019-01-13 Thread Reynold Xin
Thanks for writing this up. Just to show why option 1 is not sufficient. MySQL and Postgres are the two most popular open source database systems, and both support database → schema → table 3-part identification, so Spark supporting only 2-part names passed to the data source (option 1) isn't

[DISCUSS] Identifiers with multi-catalog support

2019-01-13 Thread Ryan Blue
In the DSv2 sync up, we tried to discuss the Table metadata proposal but were sidetracked on its use of TableIdentifier. There were good points about how Spark should identify tables, views, functions, etc., and I want to start a discussion here. Identifiers are orthogonal to the TableCatalog