Adding dev@ back now.

-Rui

On Wed, Aug 15, 2018 at 1:01 PM Andrew Pilloud <[email protected]> wrote:
> Did we drop the dev list from this on purpose? (I haven't added it back,
> but we probably should.)
>
> I'm in favor of sticking with the simple 'CREATE TABLE' and 'CREATE
> SCHEMA' if there is only to be one option. Sticking with those names
> minimizes both our deviation from other implementations and user surprise.
> There is a lot of value in making sure our common operations closely map to
> the equivalent common operations in other SQL dialects. I think users will
> be more confused to find that 'CREATE TABLE' doesn't exist than to learn
> that it might not always create a table. This minimizes the overhead of
> learning our dialect of SQL and maximizes the odds that a user will be able
> to guess at the syntax of something and have it work. (For example, a user
> guessing at the syntax of CREATE TABLE would have a better experience with
> the error being "field LOCATION not specified" rather than "operation
> CREATE TABLE not found".)
>
> If the goal is clarity of the operation, how about 'REGISTER EXTERNAL DATA
> SOURCE' and 'REGISTER EXTERNAL DATA SOURCE PROVIDER'? Those names remove
> the ambiguity around the operation creating and the data source being a
> table.
>
> Andrew
>
> On Wed, Aug 15, 2018 at 10:54 AM Anton Kedin <[email protected]> wrote:
>
>> My preference is to make `EXTERNAL` mandatory and only support `CREATE
>> EXTERNAL TABLE` for existing semantics. My main reasons are:
>> - user friendliness, matching expectations, readability. Current `CREATE
>> TABLE` is basically a `CREATE EXTERNAL TABLE`. It is confusing to users
>> familiar with SQL who expect that `CREATE TABLE` will actually create a
>> table;
>> - forward-compatibility. We could potentially support non-external
>> `CREATE TABLE` at some point in the future, whatever semantics it might
>> have.
>> It will be wrong to use the same syntax for external and non-external
>> CREATEs;
>>
>> I agree that typing an extra word each time is not ideal, but my opinion is
>> on the side that readability of code (including SQL) is important (how much
>> time you spend reading / understanding code vs writing it) and we should
>> try to improve it if we can. In case of DDL every non-trivial statement
>> will already have a ton of unavoidable words (field names, types, location,
>> options) so I would argue that adding one extra word would not noticeably
>> reduce your happiness of writing it :) But it would improve readability and
>> reduce ambiguity, which I think is worth it.
>>
>> I think that making it optional only introduces more confusion (e.g.
>> what's the difference between the two DDL statements without reading the
>> doc?) and would make the situation worse.
>>
>> Regards,
>> Anton
>>
>> On Wed, Aug 15, 2018 at 10:24 AM Mingmin Xu <[email protected]> wrote:
>>
>>> I prefer `CREATE EXTERNAL TABLE`. My question is, do you plan to
>>> support both `CREATE TABLE` and `CREATE EXTERNAL TABLE`, by making
>>> `EXTERNAL` optional?
>>>
>>> On Wed, Aug 15, 2018 at 10:01 AM, Andrew Pilloud <[email protected]>
>>> wrote:
>>>
>>>> I think 'CREATE EXTERNAL TABLE' might make things a bit clearer from a
>>>> documentation perspective, but I'd be really unhappy if I had to type out
>>>> 'EXTERNAL' every time. (I have the same concern with 'CREATE EXTERNAL
>>>> SCHEMA'.)
>>>>
>>>> Andrew
>>>>
>>>> On Tue, Aug 14, 2018 at 12:38 PM Rui Wang <[email protected]> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I know you are probably using CREATE TABLE. Can I get your thoughts
>>>>> on this?
>>>>>
>>>>> -Rui
>>>>>
>>>>>
>>>>> On Tue, Aug 14, 2018 at 10:22 AM Rui Wang <[email protected]> wrote:
>>>>>
>>>>>> Thanks Mikhail! "Import" is an alternative option. It might be better.
>>>>>>
>>>>>> "create external" is widely used by different systems with a
>>>>>> similar meaning, so "create" is usually OK for external data sources.
>>>>>>
>>>>>> -Rui
>>>>>>
>>>>>> On Tue, Aug 14, 2018 at 9:38 AM Mikhail Gryzykhin <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> The idea of clarification sounds good to me. I'd appreciate that
>>>>>>> present, when I was triaging post-commit tests.
>>>>>>>
>>>>>>> Do we have any terms that specify a connection to an external table?
>>>>>>> The word "CREATE" triggers this reaction in my brain that there will
>>>>>>> be a new table created. Adding "EXTERNAL" would already add a
>>>>>>> distinction, but adding something more explicit for the task might be
>>>>>>> even better.
>>>>>>>
>>>>>>> --Mikhail
>>>>>>>
>>>>>>> Have feedback <http://go/migryz-feedback>?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Aug 13, 2018 at 2:40 PM Rafael Fernandez <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Strictly speaking, they are not necessarily tables either. We could
>>>>>>>> also introduce something like CREATE EXTERNAL DATA SOURCE (a-la T-SQL
>>>>>>>> <https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-data-source-transact-sql?view=sql-server-2017>),
>>>>>>>> if it's somehow advantageous for us to leverage access patterns or
>>>>>>>> restrict DML statements.
>>>>>>>>
>>>>>>>> I think your idea of CREATE EXTERNAL TABLE is practical :)
>>>>>>>>
>>>>>>>> On Mon, Aug 13, 2018 at 2:12 PM Rui Wang <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Community,
>>>>>>>>>
>>>>>>>>> BeamSQL allows CREATE TABLE
>>>>>>>>> <https://beam.apache.org/documentation/dsls/sql/create-table/>
>>>>>>>>> statements to register virtual tables from external storage systems
>>>>>>>>> (e.g. BigQuery).
>>>>>>>>>
>>>>>>>>> BeamSQL is not a storage system, so any table registered by a
>>>>>>>>> "CREATE TABLE" statement is essentially equivalent to one registered by
>>>>>>>>> "CREATE EXTERNAL TABLE", which requires the user to provide a
>>>>>>>>> LOCATION, and BeamSQL will register the table outside of the current
>>>>>>>>> execution environment based on LOCATION.
>>>>>>>>>
>>>>>>>>> So I propose to add the EXTERNAL keyword to "CREATE TABLE" in BeamSQL
>>>>>>>>> to help users understand they are registering tables, and that BeamSQL
>>>>>>>>> does not create non-existent tables by running CREATE TABLE (at least
>>>>>>>>> on some storage systems, if not all).
>>>>>>>>>
>>>>>>>>> We can make the EXTERNAL keyword either required or optional.
>>>>>>>>>
>>>>>>>>> If we make the EXTERNAL keyword required:
>>>>>>>>>
>>>>>>>>> Pros:
>>>>>>>>> a. We can get rid of the table-registering semantics on CREATE
>>>>>>>>> TABLE.
>>>>>>>>> b. We keep room to add CREATE TABLE back in the future if we want
>>>>>>>>> CREATE TABLE to create, rather than only register, tables in
>>>>>>>>> BeamSQL.
>>>>>>>>>
>>>>>>>>> Cons:
>>>>>>>>> 1. CREATE TABLE syntax will not be supported, so existing BeamSQL
>>>>>>>>> pipelines that use CREATE TABLE will require changes.
>>>>>>>>> 2. It's required to type the tedious EXTERNAL keyword every time,
>>>>>>>>> especially in SQL Shell.
>>>>>>>>>
>>>>>>>>> If we make the EXTERNAL keyword optional, the pros and cons above
>>>>>>>>> are reversed.
>>>>>>>>>
>>>>>>>>> Any thoughts on adding the EXTERNAL keyword, and on making it
>>>>>>>>> required or optional?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Rui
>>>>>>>>>
>>>
>>> --
>>> ----
>>> Mingmin
>>>
>>
