Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread John Omernik
Ah, good points.  I think this also factors into the Workspace Security
topic I bumped up.  Having the proper tools to holistically manage the
data environment that Drill presents to the user is, I think, important
for any admin.

On Wed, May 25, 2016 at 9:34 AM, Andries Engelbrecht <
aengelbre...@maprtech.com> wrote:

> It is an interesting idea, but it may warrant more discussion in the
> context of overall Drill metadata management.
>
> For example, how will it affect other storage plugins (SPs) that are not
> DFS? How will it be represented/managed in INFORMATION_SCHEMA when tools
> are used to work with Drill metadata?
>
> I agree that this is a good idea, but we need to take all the aspects into
> consideration, as Drill is a very powerful tool for data discovery, and we
> need to consider the overall ecosystem.
>
> --Andries


Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Andries Engelbrecht
It is an interesting idea, but it may warrant more discussion in the context
of overall Drill metadata management.

For example, how will it affect other storage plugins (SPs) that are not DFS?
How will it be represented/managed in INFORMATION_SCHEMA when tools are used to
work with Drill metadata?

I agree that this is a good idea, but we need to take all the aspects into
consideration, as Drill is a very powerful tool for data discovery, and we need
to consider the overall ecosystem.

--Andries





Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Charles Givre
+2
I really like this idea.  
—C

> On May 25, 2016, at 08:52, Jim Scott  wrote:
> 
> +1



Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Jim Scott
+1




-- 
*Jim Scott*
Director, Enterprise Strategy & Architecture
+1 (347) 746-9281
@kingmesal 





Discussion - "Hidden" Workspaces

2016-05-25 Thread John Omernik
Before opening a JIRA on this, I was curious what the community thinks.
I'd like to have a workspace setting that marks a workspace as "hidden".
(It would default to false when not specified, so that existing workspace
definitions do not break.)

For example:

"workspaces": {
  "dev": {
    "location": "/mydev",
    "writable": true,
    "defaultInputFormat": null,
    "hidden": true
  }
}

This would have the effect that when running "show schemas" this workspace
would not show up in the list.

Reasoning:  When organizing a large enterprise data
lake/ocean/cistern/swamp, a limited set of "functional" options presented to
the user is better than "all" the options.  For example, as an administrator,
I may want to define workspaces that help clarify ETL processes or service
loads.  Users who HAVE filesystem access CAN access them, but they will
rarely want to; instead, they would focus on cleaned/enriched data.
Similarly, my users would rarely use the "cp" plugin, but I don't want to
eliminate it.  Basically, a hidden workspace doesn't appear in "show
schemas", but it can still be used both directly in queries and through the
"use" command.
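
To make the intended behavior concrete, here is a minimal Python sketch of
the semantics described above. It is a toy model under stated assumptions,
not Drill code: the Workspace class, the parse_workspaces, show_schemas and
resolve functions, and the "clean" workspace are all hypothetical names made
up for illustration, and "hidden" is the proposed flag, not an existing Drill
option. The point is only that listing filters on the flag while lookup
ignores it.

```python
# Toy model of the proposed "hidden" flag. Hypothetical names throughout;
# this is not how Drill is implemented.
from dataclasses import dataclass


@dataclass(frozen=True)
class Workspace:
    location: str
    writable: bool = False
    hidden: bool = False  # proposed flag; defaults to False when omitted


def parse_workspaces(plugin_name, config):
    """Build {schema_name: Workspace} from a plugin's "workspaces" mapping."""
    return {
        f"{plugin_name}.{name}": Workspace(
            location=ws["location"],
            writable=ws.get("writable", False),
            hidden=ws.get("hidden", False),
        )
        for name, ws in config["workspaces"].items()
    }


def show_schemas(schemas):
    """Listing path: hidden workspaces are left out of the result."""
    return sorted(name for name, ws in schemas.items() if not ws.hidden)


def resolve(schemas, name):
    """Lookup path (USE or a fully qualified query): the flag is ignored."""
    return schemas[name]


config = {
    "workspaces": {
        "dev": {"location": "/mydev", "writable": True, "hidden": True},
        "clean": {"location": "/clean"},  # hypothetical visible workspace
    }
}
schemas = parse_workspaces("dfs", config)
print(show_schemas(schemas))                 # dfs.dev is not listed
print(resolve(schemas, "dfs.dev").location)  # but it still resolves
```

Because "hidden" only affects the listing path, it is a presentation
setting rather than an access control: anyone who knows the schema name can
still reach it, subject to the usual filesystem permissions.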

Another example: I create home schemas based on each user's home directory.
Users will know their home schema is there and can easily access it, but
listing it in "show schemas" doesn't provide value and just clutters the
data returned in the response.  I want to give my users a clean interface
and a depiction of the valuable schemas via workspaces, and I believe this
small flag would be a low-impact way to do that.

I would love discussion on this.  If others would find this valuable, I will
happily open a JIRA.

John