Re: [DISCUSS] case insensitive storage plugin and workspaces names

2018-06-13 Thread Abhishek Girish
The issue is that for those customers who do have such storage plugin
names, it's too late to rename after an offline upgrade: there is no
easy way to access the storage plugin configurations once Drillbits are
down (because Drillbit start-up fails). It might be okay if admins
perform a rolling upgrade (newer Drillbits would fail, but the older
Drillbits could still be used to update the storage plugin config), but
that's not fully supported.
Ideally, we'd find a way to not fail startup and instead disable the
plugins that have issues. If that's a complex, separate task, then for
now we should clearly document that this is a breaking change, so that
users fix their plugins before they upgrade.
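
The "disable instead of fail" idea could look roughly like the
following. This is a minimal Python sketch for illustration only
(Drill itself is Java, and partition_plugins is an invented name, not
Drill code): on startup, keep the first plugin config for each
lower-cased name and disable later case-variants instead of aborting.

```python
def partition_plugins(configs):
    """Group stored plugin configs by lower-cased name. The first config
    seen for a given name stays enabled; later case-variants are marked
    disabled instead of failing Drillbit startup."""
    enabled, disabled = {}, []
    for name, config in configs.items():
        key = name.lower()
        if key in enabled:
            disabled.append(name)  # duplicate name in a different case
        else:
            enabled[key] = (name, config)
    return enabled, disabled

# Example: "dfs" and "DFS" collide; the later one is disabled, not fatal.
plugins = {
    "dfs": {"type": "file", "enabled": True},
    "DFS": {"type": "file", "enabled": True},
    "s3": {"type": "file", "enabled": True},
}
enabled, disabled = partition_plugins(plugins)
```

An admin could then fix the disabled entries from a running cluster
rather than from dead Drillbits.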


Re: [DISCUSS] case insensitive storage plugin and workspaces names

2018-06-13 Thread Arina Yelchiyeva
From the Drill code, workspaces are already case insensitive (though
the documentation states the opposite). Since there have been no
complaints from users so far, I believe there are not many (if any) who
use the same name in different cases.
As for users who already have duplicate storage plugin names: after the
change, Drill startup will fail with an appropriate error message and
they will have to rename those storage plugins.

Kind regards,
Arina



Re: [DISCUSS] case insensitive storage plugin and workspaces names

2018-06-12 Thread Abhishek Girish
Paul, I think this proposal was specific to storage plugin and workspace
*names*, not to the whole of Drill.

I agree it makes sense to make these names case insensitive, to improve
the user experience. The only impact to current users I can think of is
if someone created two storage plugins dfs and DFS, or configured
workspaces tmp and TMP; in that case, they'd need to rename them. One
thing I'm not clear on is how we'll handle upgrades in these cases.


Re: [DISCUSS] case insensitive storage plugin and workspaces names

2018-06-12 Thread Paul Rogers
Hi All,

As it turns out, this topic has been discussed, in depth, previously. Can't 
recall if it was on this list, or in a JIRA.

We face a number of constraints:

* As was noted, for some data sources, the data source itself has case 
insensitive names. (Windows file systems, RDBMSs, etc.)
* In other cases, the data source itself has case sensitive names. (HDFS file 
system, Linux file systems, JSON, etc.)
* SQL is defined to be case insensitive.
* We now have several years of user queries, in production, based on the 
current semantics.

Given all this, it is very likely that simply shifting to case-sensitive will 
break existing applications.

Perhaps a more subtle solution is to make the case-sensitivity a property of 
the symbol that is carried through the query pipeline as another piece of 
metadata.

Thus, a workspace that corresponds to a DB schema would be labeled as case 
insensitive. A workspace that corresponds to an HDFS directory would be case 
sensitive. Names defined within Drill (as part of an AS clause), would follow 
SQL rules and be case insensitive.

I believe that, if we sit down and work out exactly what users would expect, 
and what is required to handle both case sensitive and case insensitive names, 
we'll end up with a solution not far from the above -- out of simple necessity.

Thanks,
- Paul
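
The per-symbol case-sensitivity metadata described above could be
sketched like this (a hypothetical Python illustration, not Drill's
implementation; Symbol and matches are invented names): the sensitivity
travels with the name through the query pipeline, and comparison
consults it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Symbol:
    """A name plus its case-sensitivity, carried as metadata."""
    name: str
    case_sensitive: bool

    def matches(self, other: str) -> bool:
        if self.case_sensitive:
            return self.name == other              # e.g. an HDFS directory
        return self.name.lower() == other.lower()  # e.g. a DB schema or AS alias

hdfs_ws = Symbol("TPCH", case_sensitive=True)    # HDFS-backed workspace
db_schema = Symbol("Sales", case_sensitive=False)  # RDBMS-backed schema
assert not hdfs_ws.matches("tpch")   # distinct directory on HDFS
assert db_schema.matches("SALES")    # SQL-style insensitive match
```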

 


Re: [DISCUSS] case insensitive storage plugin and workspaces names

2018-06-12 Thread Aman Sinha
Yes, that seems OK to me, since the plugin name and workspace are
logical entities and don't correspond to a path.
There could be compatibility issues if certain users have relied on
case-sensitive names, but those would be temporary.

Aman



Re: [DISCUSS] case insensitive storage plugin and workspaces names

2018-06-12 Thread Arina Yelchiyeva
To make it clear, we have three notions here: storage plugin name,
workspace (schema) name, and table name (dfs.root.`/tmp/t`).
My suggestion is the following:
Storage plugin names should be case insensitive (DFS vs dfs,
INFORMATION_SCHEMA vs information_schema).
Workspace (schema) names should be case insensitive (ROOT vs root, TMP
vs tmp). Even if a user has two directories, /TMP and /tmp, they can
still create two workspaces, just not both named tmp (for example, tmp
and tmp_u).
Table name case sensitivity is treated per plugin. For example, for
system plugins (information_schema, sys), table names (views, tables)
should be case insensitive. Currently, sys table names are case
insensitive while information_schema table names are case sensitive;
that needs to be synchronized. For file system plugins, table names
must be case sensitive, since a table name maps to a directory or file
name, whose case sensitivity depends on the file system.

Kind regards,
Arina
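
These per-plugin rules could be sketched as a small policy table
(hypothetical Python, for illustration only; the policy values follow
the proposal above, not current Drill code, and the function names are
invented):

```python
# Table-name case-sensitivity policy per plugin, per the proposal:
# system plugins resolve table names case-insensitively, file-system
# plugins preserve case because names map to directory/file paths.
TABLE_NAME_CASE_SENSITIVE = {
    "sys": False,
    "information_schema": False,
    "dfs": True,
}

def resolve_table(plugin, tables, requested):
    """Return the stored table name matching `requested`, or None."""
    if TABLE_NAME_CASE_SENSITIVE.get(plugin, True):
        return requested if requested in tables else None
    lookup = {t.lower(): t for t in tables}
    return lookup.get(requested.lower())

assert resolve_table("sys", ["version"], "VERSION") == "version"
assert resolve_table("dfs", ["TPCH"], "tpch") is None  # distinct path
```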



Re: [DISCUSS] case insensitive storage plugin and workspaces names

2018-06-12 Thread Aman Sinha
Drill depends on the underlying file system's case sensitivity. On
HDFS one can create /tmp/TPCH (with 'hadoop fs -mkdir /tmp/TPCH') and
/tmp/tpch, which are separate directories. These could be set as
workspaces in Drill's storage plugin configuration, and we would want
the ability to query both. If we change the current behavior, we would
need some way, perhaps using back-quotes (`), to still support that.

RDBMSs have vendor-specific behavior. In MySQL [1] the database and
schema names are case-sensitive on Linux and case-insensitive on
Windows, whereas Postgres converts database and schema names to lower
case by default, but double-quotes can be used to make them
case-sensitive [2].

[1] https://dev.mysql.com/doc/refman/8.0/en/identifier-case-sensitivity.html
[2]
http://www.postgresqlforbeginners.com/2010/11/gotcha-case-sensitivity.html
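
The back-quote suggestion, combined with Postgres-style folding, could
be sketched like this (a hypothetical Python illustration, not Drill's
actual parser; normalize_identifier is an invented name): unquoted
identifiers fold to one case, quoted identifiers keep their exact case,
so /tmp/TPCH and /tmp/tpch stay reachable.

```python
def normalize_identifier(token):
    """Postgres-like folding adapted to Drill's back-quote syntax:
    unquoted identifiers fold to lower case; back-quoted identifiers
    preserve their exact case."""
    if len(token) >= 2 and token[0] == "`" and token[-1] == "`":
        return token[1:-1]    # quoted: preserve case
    return token.lower()      # unquoted: fold to lower case

assert normalize_identifier("TPCH") == "tpch"    # folds like Postgres
assert normalize_identifier("`TPCH`") == "TPCH"  # quoting opts out
```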





[DISCUSS] case insensitive storage plugin and workspaces names

2018-06-12 Thread Arina Yelchiyeva
Hi all,

Currently Drill treats storage plugin and workspace names as
case-sensitive [1].
Storage plugin and workspace names are defined by the user, so we allow
creating both plugin DFS and dfs, and both workspace tmp and TMP.
I suggest moving to a case-insensitive approach and disallowing two
plugins / workspaces with the same name in different case, for at least
the following reasons:
1. RDBMS schema and table names are usually case insensitive, and many
users are used to that behavior;
2. in Drill the INFORMATION_SCHEMA schema is in upper case and sys is
in lower case, which I personally find extremely inconvenient.

We should also consider making table names case insensitive for system
schemas (info, sys).

Any thoughts?

[1] https://drill.apache.org/docs/lexical-structure/


Kind regards,
Arina


Re: Visibility of Workspaces or Views

2017-07-13 Thread Andries Engelbrecht
Normally, workspaces in the Drill DFS plugin are tied to a directory in
the underlying DFS.

If the user/group that logged in does not have read/write/execute
permissions on the directory, the workspace shouldn’t show up in SHOW
SCHEMAS, nor should the user be able to select it in Drill.

Did a quick test and the “enduser” (not part of analyst group and not the 
analyst user) was not able to see masterviews workspace tied to directory 
views_master in the DFS. The masterviews workspace did not show with show 
schemas.

drwxr-x---. 2 analyst analyst 6 Jul  6 20:45 views_master

Plugin info

"masterviews": {
  "location": "/data/views_master",
  "writable": true,
  "defaultInputFormat": null
},


If I try to use it as the “enduser”

use dfs.masterviews;

Error: [MapR][DrillJDBCDriver](500165) Query execution error: 
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: 
Schema [dfs.masterviews] is not valid with respect to either root schema or 
current default schema.

Current default schema:  No default schema selected

Also if I try to use a view in the directory directly from Drill it reports 
back table not found.



Similarly, file permissions will limit access to use a view, but in my
experience the views do show up for the user (which causes other
issues). However, if the directory permissions are set correctly, the
workspace does not come up for a user or group without the correct
permissions. I would still recommend setting the view file permissions
correctly as well.

In this case I used POSIX file/dir permissions, but some more advanced DFS 
systems can handle ACEs, etc which makes it much more workable in large complex 
user/schema environments.

See if this solves your issue.

--Andries
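
The behavior described above can be sketched as a permission check (a
hypothetical Python illustration using POSIX permissions via os.access;
the function and parameter names are invented, and Drill's real
implementation is not shown): a workspace is listed only if its
directory is readable and traversable by the querying user.

```python
import os

def visible_workspaces(workspaces):
    """Return the names of workspaces whose backing directory the
    current user can read and traverse, emulating which workspaces
    would appear in SHOW SCHEMAS."""
    return sorted(
        name for name, conf in workspaces.items()
        if os.access(conf["location"], os.R_OK | os.X_OK)
    )
```

A workspace whose directory is missing or unreadable simply never
appears, matching the "masterviews did not show with show schemas"
observation.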






Re: Visibility of Workspaces or Views

2017-07-13 Thread Paul Rogers
Hi Francis,

I don’t believe that Drill currently has a way to do this: workspaces are 
global resources shared across all users. The admin (only) can edit all plugin 
and workspace definitions.

We’d need some form of security tags on each definition, and a way to map users 
to tags to make this work, but Drill does not yet have this feature.

That said, I believe that, while the workspace itself is public, the files in 
the workspace can be restricted using file permissions when you enable 
impersonation in Drill. [1]

- Paul

[1] https://drill.apache.org/docs/securing-drill-introduction/




Visibility of Workspaces or Views

2017-07-13 Thread Francis McGregor-Macdonald
Hi,

I have a situation where I would like to restrict access to workspaces
based on the user. I have an instance where I would like to allow some
third-party access to a subset of views. I can't find a standard method
here.

The only similar issue I could find was this:
https://issues.apache.org/jira/browse/DRILL-3467

Is there a standard practice here to limit workspaces for users?

Thanks,
Francis


Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread John Omernik
Ah good points.  I think this also factors into the Workspace Security
topic I bumped up.  Ensuring we have the proper tools to holistically manage
our data environment, as Drill presents it to the user, is important for any
admin.

On Wed, May 25, 2016 at 9:34 AM, Andries Engelbrecht <
aengelbre...@maprtech.com> wrote:

> It is an interesting idea, but may warrant more discussion in the overall
> Drill metadata management.
>
> For example, how will it affect other storage plugins (SPs) that are not DFS?
> How will it be represented/managed in INFORMATION_SCHEMA when tools are
> used to work with Drill metadata?
>
> I support that this is a good idea, but we need to take all the aspects
> into consideration, as Drill is a very powerful tool for data discovery,
> and we need to consider the overall ecosystem.
>
> --Andries
>
>
> > On May 25, 2016, at 5:05 AM, John Omernik <j...@omernik.com> wrote:
> >
> > Prior to opening a JIRA on this, I was curious what the community
> thought.
> >  I'd like to have a setting for workspaces that would indicate "hidden".
> > (Defaulting to false if not specified to not break any already
> implemented
> > workspace definitions)
> >
> > For example:
> >
> > "workspaces": {
> >   "dev": {
> >   "location": "/mydev",
> >   "writable": true,
> >   "defaultInputFormat": null,
> >   "hidden": true
> >  }
> > }
> >
> > This would have the effect that when running "show schemas" this
> workspace
> > would not show up in the list.
> >
> > Reasoning:  When organizing a large enterprise data
> > lake/ocean/cistern/swamp, limited "functional" options provided to the
> user
> > are better than "all" the options.   For example, as an administrator, I
> > may want to define workspaces to help clarify ETL processes, or service
> > loads that if the user HAS filesystem access they CAN access, however,
> they
> > will never want to; instead, the user would focus on cleaned/enriched
> > data.  My users would rarely use the "cp" plugin, however, I don't want
> to
> > eliminate it.  Basically, it doesn't show in show schema, but it can
> still
> > be used both directly in queries, and through the use command.
> >
> > Another example: I create home schemas based on a home directory of every
> > user.  Users will know it's there, and can easily access it, however,
> > showing up in "show schemas" doesn't provide value, and just clutters the
> > data returned in the response.  I want to attempt to provide a clean
> > interface and depiction of valuable schemas to my user via workspaces,
> and
> > this small flag, I believe would be a low impact way to do that.
> >
> > I would love discussion on this, if others would find this valuable, I
> will
> > happily make a JIRA.
> >
> > John
>
>


Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Andries Engelbrecht
It is an interesting idea, but may warrant more discussion in the overall Drill 
metadata management.

For example, how will it affect other storage plugins (SPs) that are not DFS?
How will it be represented/managed in INFORMATION_SCHEMA when tools are used to 
work with Drill metadata?

I support that this is a good idea, but we need to take all the aspects into 
consideration, as Drill is a very powerful tool for data discovery, and we need 
to consider the overall ecosystem.

--Andries


> On May 25, 2016, at 5:05 AM, John Omernik <j...@omernik.com> wrote:
> 
> Prior to opening a JIRA on this, I was curious what the community thought.
>  I'd like to have a setting for workspaces that would indicate "hidden".
> (Defaulting to false if not specified to not break any already implemented
> workspace definitions)
> 
> For example:
> 
> "workspaces": {
>   "dev": {
>   "location": "/mydev",
>   "writable": true,
>   "defaultInputFormat": null,
>   "hidden": true
>  }
> }
> 
> This would have the effect that when running "show schemas" this workspace
> would not show up in the list.
> 
> Reasoning:  When organizing a large enterprise data
> lake/ocean/cistern/swamp, limited "functional" options provided to the user
> are better than "all" the options.   For example, as an administrator, I
> may want to define workspaces to help clarify ETL processes, or service
> loads that if the user HAS filesystem access they CAN access, however, they
> will never want to; instead, the user would focus on cleaned/enriched
> data.  My users would rarely use the "cp" plugin, however, I don't want to
> eliminate it.  Basically, it doesn't show in show schema, but it can still
> be used both directly in queries, and through the use command.
> 
> Another example: I create home schemas based on a home directory of every
> user.  Users will know it's there, and can easily access it, however,
> showing up in "show schemas" doesn't provide value, and just clutters the
> data returned in the response.  I want to attempt to provide a clean
> interface and depiction of valuable schemas to my user via workspaces, and
> this small flag, I believe would be a low impact way to do that.
> 
> I would love discussion on this, if others would find this valuable, I will
> happily make a JIRA.
> 
> John



Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Charles Givre
+2
I really like this idea.  
—C

> On May 25, 2016, at 08:52, Jim Scott <jsc...@maprtech.com> wrote:
> 
> +1
> 
> On Wed, May 25, 2016 at 7:05 AM, John Omernik <j...@omernik.com> wrote:
> 
>> Prior to opening a JIRA on this, I was curious what the community thought.
>>  I'd like to have a setting for workspaces that would indicate "hidden".
>> (Defaulting to false if not specified to not break any already implemented
>> workspace definitions)
>> 
>> For example:
>> 
>> "workspaces": {
>>   "dev": {
>>   "location": "/mydev",
>>   "writable": true,
>>   "defaultInputFormat": null,
>>   "hidden": true
>>  }
>> }
>> 
>> This would have the effect that when running "show schemas" this workspace
>> would not show up in the list.
>> 
>> Reasoning:  When organizing a large enterprise data
>> lake/ocean/cistern/swamp, limited "functional" options provided to the user
>> are better than "all" the options.   For example, as an administrator, I
>> may want to define workspaces to help clarify ETL processes, or service
>> loads that if the user HAS filesystem access they CAN access, however, they
>> will never want to; instead, the user would focus on cleaned/enriched
>> data.  My users would rarely use the "cp" plugin, however, I don't want to
>> eliminate it.  Basically, it doesn't show in show schema, but it can still
>> be used both directly in queries, and through the use command.
>> 
>> Another example: I create home schemas based on a home directory of every
>> user.  Users will know it's there, and can easily access it, however,
>> showing up in "show schemas" doesn't provide value, and just clutters the
>> data returned in the response.  I want to attempt to provide a clean
>> interface and depiction of valuable schemas to my user via workspaces, and
>> this small flag, I believe would be a low impact way to do that.
>> 
>> I would love discussion on this, if others would find this valuable, I will
>> happily make a JIRA.
>> 
>> John
>> 
> 
> 
> 
> -- 
> *Jim Scott*
> Director, Enterprise Strategy & Architecture
> +1 (347) 746-9281
> @kingmesal <https://twitter.com/kingmesal>
> 
> <http://www.mapr.com/>
> [image: MapR Technologies] <http://www.mapr.com>
> 
> Now Available - Free Hadoop On-Demand Training
> <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>



Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Jim Scott
+1

On Wed, May 25, 2016 at 7:05 AM, John Omernik <j...@omernik.com> wrote:

> Prior to opening a JIRA on this, I was curious what the community thought.
>   I'd like to have a setting for workspaces that would indicate "hidden".
>  (Defaulting to false if not specified to not break any already implemented
> workspace definitions)
>
> For example:
>
> "workspaces": {
>"dev": {
>"location": "/mydev",
>"writable": true,
>"defaultInputFormat": null,
>"hidden": true
>   }
>  }
>
> This would have the effect that when running "show schemas" this workspace
> would not show up in the list.
>
> Reasoning:  When organizing a large enterprise data
> lake/ocean/cistern/swamp, limited "functional" options provided to the user
> are better than "all" the options.   For example, as an administrator, I
> may want to define workspaces to help clarify ETL processes, or service
> loads that if the user HAS filesystem access they CAN access, however, they
> will never want to; instead, the user would focus on cleaned/enriched
> data.  My users would rarely use the "cp" plugin, however, I don't want to
> eliminate it.  Basically, it doesn't show in show schema, but it can still
> be used both directly in queries, and through the use command.
>
> Another example: I create home schemas based on a home directory of every
> user.  Users will know it's there, and can easily access it, however,
> showing up in "show schemas" doesn't provide value, and just clutters the
> data returned in the response.  I want to attempt to provide a clean
> interface and depiction of valuable schemas to my user via workspaces, and
> this small flag, I believe would be a low impact way to do that.
>
> I would love discussion on this, if others would find this valuable, I will
> happily make a JIRA.
>
> John
>



-- 
*Jim Scott*
Director, Enterprise Strategy & Architecture
+1 (347) 746-9281
@kingmesal <https://twitter.com/kingmesal>

<http://www.mapr.com/>
[image: MapR Technologies] <http://www.mapr.com>

Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>


Discussion - "Hidden" Workspaces

2016-05-25 Thread John Omernik
Prior to opening a JIRA on this, I was curious what the community thought.
I'd like to have a setting for workspaces that would indicate "hidden"
(defaulting to false if not specified, so as not to break any already
implemented workspace definitions).

For example:

"workspaces": {
  "dev": {
    "location": "/mydev",
    "writable": true,
    "defaultInputFormat": null,
    "hidden": true
  }
}

This would have the effect that when running "show schemas" this workspace
would not show up in the list.

Reasoning:  When organizing a large enterprise data
lake/ocean/cistern/swamp, limited "functional" options provided to the user
are better than "all" the options.   For example, as an administrator, I
may want to define workspaces to help clarify ETL processes, or service
loads that if the user HAS filesystem access they CAN access, however, they
will never want to; instead, the user would focus on cleaned/enriched
data.  My users would rarely use the "cp" plugin, however, I don't want to
eliminate it.  Basically, it doesn't show in show schema, but it can still
be used both directly in queries, and through the use command.

Another example: I create home schemas based on a home directory of every
user.  Users will know it's there, and can easily access it, however,
showing up in "show schemas" doesn't provide value, and just clutters the
data returned in the response.  I want to attempt to provide a clean
interface and depiction of valuable schemas to my user via workspaces, and
this small flag, I believe would be a low impact way to do that.

I would love discussion on this, if others would find this valuable, I will
happily make a JIRA.

John
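
As a model of the proposed semantics (the "hidden" key is John's suggestion
here, not an existing Drill option), a minimal sketch of how SHOW SCHEMAS
filtering could treat the flag:

```python
# Hypothetical workspace configs; "hidden" is the proposed new key.
workspaces = {
    "root": {"location": "/", "writable": False},
    "dev": {"location": "/mydev", "writable": True,
            "defaultInputFormat": None, "hidden": True},
}

def visible_schemas(ws):
    # "hidden" defaults to False, so existing workspace definitions are
    # unaffected; hidden workspaces would stay queryable and usable via
    # USE, they are only omitted from listings.
    return sorted(n for n, cfg in ws.items() if not cfg.get("hidden", False))

print(visible_schemas(workspaces))  # prints ['root']
```

The default-to-false behavior is what keeps the change backward compatible:
any workspace definition that never mentions the key lists exactly as today.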


Re: workspaces

2016-05-13 Thread Andries Engelbrecht
Alternatively you can just create the storage plugin json file, delete the old 
one and post the new one using the REST API.

See
https://drill.apache.org/docs/rest-api/#storage 
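
A sketch of that delete-then-repost flow in Python (endpoint paths follow the
REST docs linked above; the host, port, and sample config are assumptions, and
the HTTP calls themselves require a running Drillbit):

```python
import json
import urllib.request

BASE = "http://localhost:8047"  # assumed Drillbit web address

def plugin_body(name, config):
    # The storage endpoint takes the full plugin definition; partial
    # updates are not supported, so always send the whole config.
    return json.dumps({"name": name, "config": config}).encode()

def replace_plugin(name, config):
    # Delete the old definition, then POST the new one.
    for method, url, data in (
        ("DELETE", f"{BASE}/storage/{name}.json", None),
        ("POST", f"{BASE}/storage/{name}.json", plugin_body(name, config)),
    ):
        req = urllib.request.Request(
            url, data=data, method=method,
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)

# Building the request body is side-effect free and can be inspected:
cfg = {"type": "file", "enabled": True, "connection": "file:///",
       "workspaces": {"tmp": {"location": "/tmp", "writable": True}}}
print(json.loads(plugin_body("dfs", cfg))["name"])  # prints dfs
```

As noted later in this thread, the REST API has no notion of user privileges,
so anything with network access to the port can make these calls.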



> On May 13, 2016, at 10:05 AM, Vince Gonzalez <vince.gonza...@gmail.com> wrote:
> 
> I have used a pipeline involving jq and curl to modify storage plugins via
> the rest interface.  The one below adds a workspace to the dfs plugin:
> 
> curl -s localhost:8047/storage.json | jq '.[] | select(.name == "dfs")
> | .config.workspaces |= . + { "nypdmvc": { "location":
> "/Users/vince/data/nyc/nypdmvc", "writable": true,
> "defaultInputFormat": null}  }' | curl -s -X POST -H "Content-Type:
> application/json" -d @- http://localhost:8047/storage/dfs.json 
> 
> Note that this won't work as is if you have authentication enabled.
> 
> On Friday, May 13, 2016, Odin Guillermo Caudillo Gallegos <
> odin.guille...@gmail.com> wrote:
> 
>> I am restricted from configuring it via the web console, so is there a
>> way to configure them from the terminal?
>> In embedded mode I just create the files in the /tmp/ directory from the
>> terminal; in the drill-override.conf file I also use another path for the
>> plugins (with sys.store.provider.local.path).
>> 
>> Thanks.
>> 
>> 2016-05-13 11:33 GMT-05:00 Andries Engelbrecht <aengelbre...@maprtech.com>:
>> 
>>> You should start drill in distributed mode first and then configure the
>>> storage plugins.
>>> If you configure the storage plugins in embedded mode the information is
>>> stored in the tmp space instead of registered with ZK for the cluster to
>>> use.
>>> 
>>> --Andries
>>> 
>>>> On May 13, 2016, at 9:08 AM, Odin Guillermo Caudillo Gallegos <
>>> odin.guille...@gmail.com> wrote:
>>>> 
>>>> The plugins are working fine in embedded mode, but when I start the
>>>> drillbit on each server and connect via drill-conf I don't see them.
>>>> Do I need to configure another parameter apart from the ZooKeeper
>>>> servers in the drill-override.conf file?
>>>> 
>>>> 2016-05-13 11:01 GMT-05:00 Andries Engelbrecht <
>>> aengelbre...@maprtech.com>:
>>>> 
>>>>> If Drill was correctly installed in distributed mode the storage
>> plugin
>>>>> and workspaces will be used by the Drill cluster.
>>>>> 
>>>>> Make sure the plugin and workspace was correctly configured and
>>> accepted.
>>>>> 
>>>>> Are you using the WebUI or REST to configure the storage plugins?
>>>>> 
>>>>> --Andries
>>>>> 
>>>>>> On May 13, 2016, at 8:48 AM, Odin Guillermo Caudillo Gallegos <
>>>>> odin.guille...@gmail.com> wrote:
>>>>>> 
>>>>>> Is there a way to configure workspaces on a distributed installation?
>>>>>> Because I only see the default plugin configuration but not the one
>>>>>> that I created.
>>>>>> 
>>>>>> Thanks
>>>>> 
>>>>> 
>>> 
>>> 
>> 
> 
> 
> --



Re: workspaces

2016-05-13 Thread Odin Guillermo Caudillo Gallegos
OK, I'll give it a try, thanks!

2016-05-13 12:05 GMT-05:00 Vince Gonzalez <vince.gonza...@gmail.com>:

> I have used a pipeline involving jq and curl to modify storage plugins via
> the rest interface.  The one below adds a workspace to the dfs plugin:
>
> curl -s localhost:8047/storage.json | jq '.[] | select(.name == "dfs")
> | .config.workspaces |= . + { "nypdmvc": { "location":
> "/Users/vince/data/nyc/nypdmvc", "writable": true,
> "defaultInputFormat": null}  }' | curl -s -X POST -H "Content-Type:
> application/json" -d @- http://localhost:8047/storage/dfs.json
>
> Note that this won't work as is if you have authentication enabled.
>
> On Friday, May 13, 2016, Odin Guillermo Caudillo Gallegos <
> odin.guille...@gmail.com> wrote:
>
> > I am restricted from configuring it via the web console, so is there a
> > way to configure them from the terminal?
> > In embedded mode I just create the files in the /tmp/ directory from the
> > terminal; in the drill-override.conf file I also use another path for the
> > plugins (with sys.store.provider.local.path).
> >
> > Thanks.
> >
> > 2016-05-13 11:33 GMT-05:00 Andries Engelbrecht <aengelbre...@maprtech.com>:
> >
> > > You should start drill in distributed mode first and then configure the
> > > storage plugins.
> > > If you configure the storage plugins in embedded mode the information
> is
> > > stored in the tmp space instead of registered with ZK for the cluster
> to
> > > use.
> > >
> > > --Andries
> > >
> > > > On May 13, 2016, at 9:08 AM, Odin Guillermo Caudillo Gallegos <
> > > odin.guille...@gmail.com> wrote:
> > > >
> > > > The plugins are working fine in embedded mode, but when I start the
> > > > drillbit on each server and connect via drill-conf I don't see them.
> > > > Do I need to configure another parameter apart from the ZooKeeper
> > > > servers in the drill-override.conf file?
> > > >
> > > > 2016-05-13 11:01 GMT-05:00 Andries Engelbrecht <
> > > aengelbre...@maprtech.com>:
> > > >
> > > >> If Drill was correctly installed in distributed mode the storage
> > plugin
> > > >> and workspaces will be used by the Drill cluster.
> > > >>
> > > >> Make sure the plugin and workspace was correctly configured and
> > > accepted.
> > > >>
> > > >> Are you using the WebUI or REST to configure the storage plugins?
> > > >>
> > > >> --Andries
> > > >>
> > > >>> On May 13, 2016, at 8:48 AM, Odin Guillermo Caudillo Gallegos <
> > > >> odin.guille...@gmail.com> wrote:
> > > >>>
> > > >>> Is there a way to configure workspaces on a distributed
> installation?
> > > >>> Because I only see the default plugin configuration but not the
> > > >>> one that I created.
> > > >>>
> > > >>> Thanks
> > > >>
> > > >>
> > >
> > >
> >
>
>
> --
>


Re: workspaces

2016-05-13 Thread Vince Gonzalez
I have used a pipeline involving jq and curl to modify storage plugins via
the rest interface.  The one below adds a workspace to the dfs plugin:

curl -s localhost:8047/storage.json | jq '.[] | select(.name == "dfs")
| .config.workspaces |= . + { "nypdmvc": { "location":
"/Users/vince/data/nyc/nypdmvc", "writable": true,
"defaultInputFormat": null}  }' | curl -s -X POST -H "Content-Type:
application/json" -d @- http://localhost:8047/storage/dfs.json

Note that this won't work as is if you have authentication enabled.
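
For anyone without jq, the same read-modify-write transformation can be
sketched in Python (the workspace values mirror Vince's example; posting the
result back still requires a running Drillbit and, as noted, no
authentication):

```python
import json

def add_workspace(plugins, plugin_name, ws_name, location, writable=True):
    # Mirror of the jq filter: pick one plugin out of the /storage.json
    # list and merge a new workspace into its config.
    plugin = next(p for p in plugins if p["name"] == plugin_name)
    plugin["config"].setdefault("workspaces", {})[ws_name] = {
        "location": location,
        "writable": writable,
        "defaultInputFormat": None,
    }
    return plugin  # POST this whole object to /storage/<name>.json

# Stand-in for the GET /storage.json response:
plugins = [{"name": "dfs", "config": {"type": "file", "workspaces": {}}}]
updated = add_workspace(plugins, "dfs", "nypdmvc",
                        "/Users/vince/data/nyc/nypdmvc")
print(sorted(updated["config"]["workspaces"]))  # prints ['nypdmvc']
```

Note that, as with the jq version, the whole plugin object is resubmitted;
the REST endpoint does not accept partial updates.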

On Friday, May 13, 2016, Odin Guillermo Caudillo Gallegos <
odin.guille...@gmail.com> wrote:

> I am restricted from configuring it via the web console, so is there a
> way to configure them from the terminal?
> In embedded mode I just create the files in the /tmp/ directory from the
> terminal; in the drill-override.conf file I also use another path for the
> plugins (with sys.store.provider.local.path).
>
> Thanks.
>
> 2016-05-13 11:33 GMT-05:00 Andries Engelbrecht <aengelbre...@maprtech.com>:
>
> > You should start drill in distributed mode first and then configure the
> > storage plugins.
> > If you configure the storage plugins in embedded mode the information is
> > stored in the tmp space instead of registered with ZK for the cluster to
> > use.
> >
> > --Andries
> >
> > > On May 13, 2016, at 9:08 AM, Odin Guillermo Caudillo Gallegos <
> > odin.guille...@gmail.com> wrote:
> > >
> > > The plugins are working fine in embedded mode, but when I start the
> > > drillbit on each server and connect via drill-conf I don't see them.
> > > Do I need to configure another parameter apart from the ZooKeeper
> > > servers in the drill-override.conf file?
> > >
> > > 2016-05-13 11:01 GMT-05:00 Andries Engelbrecht <
> > aengelbre...@maprtech.com>:
> > >
> > >> If Drill was correctly installed in distributed mode the storage
> plugin
> > >> and workspaces will be used by the Drill cluster.
> > >>
> > >> Make sure the plugin and workspace was correctly configured and
> > accepted.
> > >>
> > >> Are you using the WebUI or REST to configure the storage plugins?
> > >>
> > >> --Andries
> > >>
> > >>> On May 13, 2016, at 8:48 AM, Odin Guillermo Caudillo Gallegos <
> > >> odin.guille...@gmail.com> wrote:
> > >>>
> > >>> Is there a way to configure workspaces on a distributed installation?
> > >>> Because I only see the default plugin configuration but not the one
> > >>> that I created.
> > >>>
> > >>> Thanks
> > >>
> > >>
> >
> >
>


--


Re: workspaces

2016-05-13 Thread Odin Guillermo Caudillo Gallegos
I am restricted from configuring it via the web console, so is there a
way to configure them from the terminal?
In embedded mode I just create the files in the /tmp/ directory from the
terminal; in the drill-override.conf file I also use another path for the
plugins (with sys.store.provider.local.path).

Thanks.

2016-05-13 11:33 GMT-05:00 Andries Engelbrecht <aengelbre...@maprtech.com>:

> You should start drill in distributed mode first and then configure the
> storage plugins.
> If you configure the storage plugins in embedded mode the information is
> stored in the tmp space instead of registered with ZK for the cluster to
> use.
>
> --Andries
>
> > On May 13, 2016, at 9:08 AM, Odin Guillermo Caudillo Gallegos <
> odin.guille...@gmail.com> wrote:
> >
> > The plugins are working fine in embedded mode, but when I start the
> > drillbit on each server and connect via drill-conf I don't see them.
> > Do I need to configure another parameter apart from the ZooKeeper servers
> > in the drill-override.conf file?
> >
> > 2016-05-13 11:01 GMT-05:00 Andries Engelbrecht <
> aengelbre...@maprtech.com>:
> >
> >> If Drill was correctly installed in distributed mode the storage plugin
> >> and workspaces will be used by the Drill cluster.
> >>
> >> Make sure the plugin and workspace was correctly configured and
> accepted.
> >>
> >> Are you using the WebUI or REST to configure the storage plugins?
> >>
> >> --Andries
> >>
> >>> On May 13, 2016, at 8:48 AM, Odin Guillermo Caudillo Gallegos <
> >> odin.guille...@gmail.com> wrote:
> >>>
> >>> Is there a way to configure workspaces on a distributed installation?
> >>> Because I only see the default plugin configuration but not the one
> >>> that I created.
> >>>
> >>> Thanks
> >>
> >>
>
>


Re: workspaces

2016-05-13 Thread Abdel Hakim Deneche
I believe Drill stores storage plugin configurations in different places when
running in embedded mode vs. distributed mode: embedded mode uses the local
disk, while distributed mode uses ZooKeeper.
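
For reference, a minimal drill-override.conf sketch for distributed mode (the
cluster ID and ZooKeeper hosts are placeholders, not from this thread). With
this in place, plugin configs created through a running Drillbit are registered
in ZooKeeper; sys.store.provider.local.path only affects the local store that
embedded mode uses:

```hocon
drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "zk1:2181,zk2:2181,zk3:2181"
}
```

Every Drillbit in the cluster should point at the same ZooKeeper quorum so
they all see the same storage plugin definitions.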

On Fri, May 13, 2016 at 9:08 AM, Odin Guillermo Caudillo Gallegos <
odin.guille...@gmail.com> wrote:

> The plugins are working fine in embedded mode, but when I start the
> drillbit on each server and connect via drill-conf I don't see them.
> Do I need to configure another parameter apart from the ZooKeeper servers
> in the drill-override.conf file?
>
> 2016-05-13 11:01 GMT-05:00 Andries Engelbrecht <aengelbre...@maprtech.com
> >:
>
> > If Drill was correctly installed in distributed mode the storage plugin
> > and workspaces will be used by the Drill cluster.
> >
> > Make sure the plugin and workspace was correctly configured and accepted.
> >
> > Are you using the WebUI or REST to configure the storage plugins?
> >
> > --Andries
> >
> > > On May 13, 2016, at 8:48 AM, Odin Guillermo Caudillo Gallegos <
> > odin.guille...@gmail.com> wrote:
> > >
> > > Is there a way to configure workspaces on a distributed installation?
> > > Because I only see the default plugin configuration but not the one
> > > that I created.
> > >
> > > Thanks
> >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>


Now Available - Free Hadoop On-Demand Training
<http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>


Re: workspaces

2016-05-13 Thread Andries Engelbrecht
You should start Drill in distributed mode first and then configure the storage 
plugins.
If you configure the storage plugins in embedded mode, the information is stored 
in the tmp space instead of being registered with ZK for the cluster to use.

--Andries

> On May 13, 2016, at 9:08 AM, Odin Guillermo Caudillo Gallegos 
> <odin.guille...@gmail.com> wrote:
> 
> The plugins are working fine in embedded mode, but when I start the
> drillbit on each server and connect via drill-conf I don't see them.
> Do I need to configure another parameter apart from the ZooKeeper servers
> in the drill-override.conf file?
> 
> 2016-05-13 11:01 GMT-05:00 Andries Engelbrecht <aengelbre...@maprtech.com>:
> 
>> If Drill was correctly installed in distributed mode the storage plugin
>> and workspaces will be used by the Drill cluster.
>> 
>> Make sure the plugin and workspace was correctly configured and accepted.
>> 
>> Are you using the WebUI or REST to configure the storage plugins?
>> 
>> --Andries
>> 
>>> On May 13, 2016, at 8:48 AM, Odin Guillermo Caudillo Gallegos <
>> odin.guille...@gmail.com> wrote:
>>> 
>>> Is there a way to configure workspaces on a distributed installation?
>>> Because I only see the default plugin configuration but not the one that I
>>> created.
>>> 
>>> Thanks
>> 
>> 



Re: workspaces

2016-05-13 Thread Odin Guillermo Caudillo Gallegos
The plugins are working fine in embedded mode, but when I start the
drillbit on each server and connect via drill-conf I don't see them.
Do I need to configure another parameter apart from the ZooKeeper servers
in the drill-override.conf file?

2016-05-13 11:01 GMT-05:00 Andries Engelbrecht <aengelbre...@maprtech.com>:

> If Drill was correctly installed in distributed mode the storage plugin
> and workspaces will be used by the Drill cluster.
>
> Make sure the plugin and workspace was correctly configured and accepted.
>
> Are you using the WebUI or REST to configure the storage plugins?
>
> --Andries
>
> > On May 13, 2016, at 8:48 AM, Odin Guillermo Caudillo Gallegos <
> odin.guille...@gmail.com> wrote:
> >
> > Is there a way to configure workspaces on a distributed installation?
> > Because I only see the default plugin configuration but not the one that I
> > created.
> >
> > Thanks
>
>


Re: workspaces

2016-05-13 Thread Andries Engelbrecht
If Drill was correctly installed in distributed mode the storage plugin and 
workspaces will be used by the Drill cluster.

Make sure the plugin and workspace was correctly configured and accepted.

Are you using the WebUI or REST to configure the storage plugins?

--Andries

> On May 13, 2016, at 8:48 AM, Odin Guillermo Caudillo Gallegos 
> <odin.guille...@gmail.com> wrote:
> 
> Is there a way to configure workspaces on a distributed installation?
> Because I only see the default plugin configuration but not the one that I
> created.
> 
> Thanks



workspaces

2016-05-13 Thread Odin Guillermo Caudillo Gallegos
Is there a way to configure workspaces on a distributed installation?
Because I only see the default plugin configuration but not the one that I
created.

Thanks


Re: Create Workspaces via Script

2015-10-09 Thread Hanifi Gunes
You should resubmit the whole plugin configuration in the request body.
However, please note that the REST API currently has no notion of user
privileges or roles.

On Fri, Oct 9, 2015 at 10:58 AM, John Omernik <j...@omernik.com> wrote:

> Thank you Abdel -
>
> Question, for "updating a storage plugin" do you have to resubmit the whole
> plugin as part of the API, or just the parts you want added or changed?
>
> Thanks!
>
> John
>
> On Fri, Oct 9, 2015 at 12:42 PM, Abdel Hakim Deneche <
> adene...@maprtech.com>
> wrote:
>
> > I think you can use the REST API to do so. Here is a link to a Google
> > document that explains the API:
> >
> >
> >
> https://docs.google.com/document/d/1mRsuWk4Dpt6ts-jQ6ke3bB30PIwanRiCPfGxRwZEQME/edit
> >
> >
> > On Fri, Oct 9, 2015 at 10:36 AM, John Omernik <j...@omernik.com> wrote:
> >
> > > Is there an easy way for a user with the proper privs to add workspaces
> > in
> > > Drill? I'd love to have a scenario where I add users to my cluster,
> and I
> > > create a home directory via MapR Volumes, set quotas, etc, and then
> auto
> > > create a workspace for the user to connect and play with data.
> > >
> > > I looked at the docs and didn't find anything, this would be really
> > handy
> > > from an enterprise perspective.
> > >
> > > John
> > >
> >
> >
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   <http://www.mapr.com/>
> >
> >
> > Now Available - Free Hadoop On-Demand Training
> > <
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > >
> >
>


Re: Create Workspaces via Script

2015-10-09 Thread John Omernik
I thought Drill 1.2 added some auth; I guess I was mistaken... thanks for
the tip.

John

On Fri, Oct 9, 2015 at 1:06 PM, Hanifi Gunes <hgu...@maprtech.com> wrote:

> You should resubmit the whole plugin configuration in the request body.
> However, please note that the REST API currently has no notion of user
> privileges or roles.
>
> On Fri, Oct 9, 2015 at 10:58 AM, John Omernik <j...@omernik.com> wrote:
>
> > Thank you Abdel -
> >
> > Question, for "updating a storage plugin" do you have to resubmit the
> whole
> > plugin as part of the API, or just the parts you want added or changed?
> >
> > Thanks!
> >
> > John
> >
> > On Fri, Oct 9, 2015 at 12:42 PM, Abdel Hakim Deneche <
> > adene...@maprtech.com>
> > wrote:
> >
> > > I think you can use the REST API to do so. Here is a link to a Google
> > > document that explains the API:
> > >
> > >
> > >
> >
> https://docs.google.com/document/d/1mRsuWk4Dpt6ts-jQ6ke3bB30PIwanRiCPfGxRwZEQME/edit
> > >
> > >
> > > On Fri, Oct 9, 2015 at 10:36 AM, John Omernik <j...@omernik.com>
> wrote:
> > >
> > > > Is there an easy way for a user with the proper privs to add
> workspaces
> > > in
> > > > Drill? I'd love to have a scenario where I add users to my cluster,
> > and I
> > > > create a home directory via MapR Volumes, set quotas, etc, and then
> > auto
> > > > create a workspace for the user to connect and play with data.
> > > >
> > > > I looked at the docs and didn't find anything, this would be really
> > > handy
> > > > from an enterprise perspective.
> > > >
> > > > John
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Abdelhakim Deneche
> > >
> > > Software Engineer
> > >
> > >   <http://www.mapr.com/>
> > >
> > >
> > > Now Available - Free Hadoop On-Demand Training
> > > <
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > >
> > >
> >
>


Re: Create Workspaces via Script

2015-10-09 Thread John Omernik
Thank you Abdel -

Question, for "updating a storage plugin" do you have to resubmit the whole
plugin as part of the API, or just the parts you want added or changed?

Thanks!

John

On Fri, Oct 9, 2015 at 12:42 PM, Abdel Hakim Deneche <adene...@maprtech.com>
wrote:

> I think you can use the REST API to do so. Here is a link to a Google
> document that explains the API:
>
>
> https://docs.google.com/document/d/1mRsuWk4Dpt6ts-jQ6ke3bB30PIwanRiCPfGxRwZEQME/edit
>
>
> On Fri, Oct 9, 2015 at 10:36 AM, John Omernik <j...@omernik.com> wrote:
>
> > Is there an easy way for a user with the proper privs to add workspaces
> in
> > Drill? I'd love to have a scenario where I add users to my cluster, and I
> > create a home directory via MapR Volumes, set quotas, etc, and then auto
> > create a workspace for the user to connect and play with data.
> >
> > I looked at the docs and didn't find anything, this would be really
> handy
> > from an enterprise perspective.
> >
> > John
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   <http://www.mapr.com/>
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> >
>


Re: Create Workspaces via Script

2015-10-09 Thread Hanifi Gunes
[1] does not seem to be in 1.2 yet, but it seems close.


1: https://issues.apache.org/jira/browse/DRILL-3201

On Fri, Oct 9, 2015 at 11:09 AM, John Omernik <j...@omernik.com> wrote:

> I thought Drill 1.2 added some auth; I guess I was mistaken... thanks for
> the tip.
>
> John
>
> On Fri, Oct 9, 2015 at 1:06 PM, Hanifi Gunes <hgu...@maprtech.com> wrote:
>
> > You should resubmit the whole plugin configuration in the request body.
> > However, please note that the REST API currently has no notion of user
> > privileges or roles.
> >
> > On Fri, Oct 9, 2015 at 10:58 AM, John Omernik <j...@omernik.com> wrote:
> >
> > > Thank you Abdel -
> > >
> > > Question, for "updating a storage plugin" do you have to resubmit the
> > whole
> > > plugin as part of the API, or just the parts you want added or changed?
> > >
> > > Thanks!
> > >
> > > John
> > >
> > > On Fri, Oct 9, 2015 at 12:42 PM, Abdel Hakim Deneche <
> > > adene...@maprtech.com>
> > > wrote:
> > >
> > > > I think you can use the REST API to do so. Here is a link to a Google
> > > > document that explains the API:
> > > >
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1mRsuWk4Dpt6ts-jQ6ke3bB30PIwanRiCPfGxRwZEQME/edit
> > > >
> > > >
> > > > On Fri, Oct 9, 2015 at 10:36 AM, John Omernik <j...@omernik.com>
> > wrote:
> > > >
> > > > > Is there an easy way for a user with the proper privs to add
> > workspaces
> > > > in
> > > > > Drill? I'd love to have a scenario where I add users to my cluster,
> > > and I
> > > > > create a home directory via MapR Volumes, set quotas, etc, and then
> > > auto
> > > > > create a workspace for the user to connect and play with data.
> > > > >
> > > > > I looked at the docs and didn't find anything, this would be
> really
> > > > handy
> > > > > from an enterprise perspective.
> > > > >
> > > > > John
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Abdelhakim Deneche
> > > >
> > > > Software Engineer
> > > >
> > > >   <http://www.mapr.com/>
> > > >
> > > >
> > > > Now Available - Free Hadoop On-Demand Training
> > > > <
> > > >
> > >
> >
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
> > > > >
> > > >
> > >
> >
>