A few questions/comments inline.

On Thu, Jul 30, 2015 at 2:53 PM, mehant baid <baid.meh...@gmail.com> wrote:

>  Based on the discussion in the hangout I wanted to start a thread around
> Drop table support.
>
> A couple of high-level points about what is planned to be supported:
>
> 1. In the first iteration Drop table will only support dropping tables in
> the file system and not dropping tables in Hive/HBase or other storage
> plugins.
> 2. Since Drop table is potentially "risky" we want to be pessimistic about
> dropping tables.
>
> There are two broad scenarios while dealing with Drop table - Security
> enabled and Security Disabled. In both cases we would like to follow the
> below workflow
>
> 1. Check if the table being dropped can be consumed by Drill.
>
[Neeraja] I am assuming that if security is enabled, this is done with the
impersonated user identity. Is this accurate?

>     * Meaning do all the files in the directories conform to a format that
> Drill can read (parquet, json, csv etc). Jacques pointed out that there is
> a bug in this logic: if one of the files in the directory conforms to a
> format that Drill can read, we create a DrillTable and error out if we
> encounter other files we cannot read.
>
[Neeraja] What does it mean to create DrillTable here?

>     * The above point can in the worst case entail reading the entire file
> system, if a user issues a drop table command on the root of the file
> system. But it's more likely that we will soon encounter a file that Drill
> cannot read and abort the Drop with an error.
>     * Another minor clarification is we consider only those directories to
> be consumable by Drill if they contain file formats that are homogeneous and
> can be read by Drill. For example, we should fail if a user is trying to
> delete a directory that contains both JSON and Parquet files.
>
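For illustration, the check described in step 1 could be sketched roughly as
below. This is a minimal Python sketch, not Drill's implementation: the
function name single_readable_format and the extension-to-format map are
hypothetical, and real format detection in Drill goes through its format
plugins rather than file extensions. The point is only that the walk can abort
on the first unreadable or mismatched file instead of scanning everything.

```python
import os

# Hypothetical extension-to-format map, for illustration only.
# Drill's actual format detection is plugin-based and not reproduced here.
READABLE_FORMATS = {".parquet": "parquet", ".json": "json", ".csv": "csv"}


def single_readable_format(table_dir):
    """Return the one format shared by every file under table_dir.

    Raises ValueError on the first file that is unreadable or whose format
    differs from the files seen so far, so the walk stops early.
    """
    seen = None
    for root, _dirs, files in os.walk(table_dir):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            fmt = READABLE_FORMATS.get(ext)
            if fmt is None:
                raise ValueError("unreadable file: " + os.path.join(root, name))
            if seen is None:
                seen = fmt
            elif fmt != seen:
                raise ValueError("mixed formats: %s and %s" % (seen, fmt))
    if seen is None:
        raise ValueError("no files found under " + table_dir)
    return seen
```

With this shape, a directory holding both JSON and Parquet files is rejected
as soon as the second format is seen.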

> 2. Once we have confirmed that the table requested to be dropped contains
> homogeneous files which can be read by Drill, we delve into the file
> permissions.
>     * If security is enabled, we impersonate the user issuing the command
> and drop the directory (succeeds if FS allows and user has correct
> permissions).
>     * If security is not enabled, we only drop the directory if all the
> files are owned by the user Drillbit is running as (being pessimistic about
> drop). We should collect this information when checking for homogeneous
> files.
>
[Neeraja] Why do we need this check? How is this different from the
impersonated user scenario?
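
For reference, the two branches of step 2 as proposed can be sketched as
below. This is a minimal Python sketch under stated assumptions, not Drill
code: may_drop and drillbit_uid are hypothetical names, and numeric owner ids
from os.stat stand in for whatever owner information the underlying file
system reports.

```python
import os


def may_drop(table_dir, security_enabled, drillbit_uid):
    """Decide whether the drop should go ahead under the proposed rules.

    Security enabled: no extra check here; the delete is attempted as the
    impersonated user and the file system accepts or rejects it.
    Security disabled: every file must be owned by the Drillbit user.
    """
    if security_enabled:
        return True
    for root, _dirs, files in os.walk(table_dir):
        for name in files:
            owner = os.stat(os.path.join(root, name)).st_uid
            if owner != drillbit_uid:
                return False  # pessimistic: one foreign file blocks the drop
    return True
```

In the secure case the file system is the only gate; in the non-secure case
the ownership scan is an extra guard applied before any delete is issued.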

>
> Open Questions:
>
> Views: How do we handle views that were created on top of the dropped
> table? Following are a couple of scenarios we might want to explore
>     * Views are treated as a different entity and it's useful for the user
> to have a view definition still in place as the dropped table will be
> replaced with a new set of files with the exact schema and existing view
> definition suffices. AFAIK, Oracle and SQL Server have this model and don't
> drop the views if the base table is dropped.
>     * Once the table is dropped, the view definition is no longer needed
> and hence should be dropped automatically. We can probably punt on this
> till we have dotdrill files. With dotdrill files we can maintain some
> information to indicate the views on this table and can drop the views
> implicitly. But given that some of the popular databases don't do this, we
> might want to conform to the standard behavior.
>
[Neeraja] Agree with the recommendation here. It seems we can go with a
simpler approach here, i.e., treat views as a different entity.

Also, will there be any mechanism to recover after an accidental drop?

> Thanks
> Mehant
>
