Re: [DISCUSS]: Masking Creds in Query Plans

2020-04-18 Thread Arina Ielchiieva
Agreed, we should not display sensitive data such as passwords. I would say 
the best option is to mask it during output.

Kind regards,
Arina




Re: [DISCUSS]: Masking Creds in Query Plans

2020-04-17 Thread Paul Rogers
Hi Charles,

Excellent point. The problem is deeper. Drill serializes plugin configs in the 
query plan, which it sends to each worker (Drillbit). Why? To avoid race 
conditions: if you start a query and then change the plugin config, different 
nodes could otherwise see different versions of the config.

Masking can't happen in the execution plan or the plan won't work. (I hope your 
password is not actually "***".) So, masking would have to happen in the logs 
and in the EXPLAIN PLAN FOR output. This would, in turn, require code that 
understands each config well enough to make a copy of the config with the 
credentials masked, so we can then serialize the copied plan to JSON. (Or, we'd 
have to edit the JSON after it is generated.) Both are pretty ugly and not very 
secure.
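
As a rough illustration of the "edit the JSON after it is generated" option, 
here is a minimal Python sketch. It is not Drill code: the function name 
mask_credentials, the SENSITIVE_KEYS set, and the sample values are all 
assumptions made for the example. It walks a parsed plan and replaces the 
values of matching keys in a copy, leaving the plan used for execution 
untouched.

```python
import json

# Assumption: these exact key names mark credential fields. This is an
# illustrative list, not one that Drill defines. Matching is case-sensitive,
# so the operator-level "userName" (the query user) is left alone.
SENSITIVE_KEYS = frozenset({"username", "password"})
MASK = "***"


def mask_credentials(node, sensitive_keys=SENSITIVE_KEYS, mask=MASK):
    """Return a masked copy of a parsed plan; the input is not modified."""
    if isinstance(node, dict):
        return {
            key: mask
            if key in sensitive_keys and isinstance(value, str)
            else mask_credentials(value, sensitive_keys, mask)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [mask_credentials(item, sensitive_keys, mask) for item in node]
    return node


# Hypothetical plan fragment shaped like the jdbc-scan operator in this thread.
plan = {
    "graph": [
        {
            "pop": "jdbc-scan",
            "config": {"type": "jdbc", "username": "alice", "password": "s3cret"},
            "userName": "bob",
        }
    ]
}

masked = mask_credentials(plan)
print(json.dumps(masked, indent=2))
```

A blanket key match like this knows nothing about individual plugins, so it 
would miss credentials stored under other names or embedded in a connection 
URL.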

What we need is some kind of "vault" interface: the config holds only a key 
into a vault, Drill itself has been given access to that vault, and the vault 
returns the actual credential value. As a security guy yourself, what would you 
recommend as our target? Should we create a generic API? Is there some system 
common enough on Hadoop systems that we should target that as our reference 
implementation? Also, can you perhaps file a JIRA ticket for this issue?
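
To make the idea concrete, here is a minimal Python sketch of such a vault 
interface. Everything in it is hypothetical: CredentialVault, InMemoryVault, 
connection_properties, and the "passwordKey" field are names invented for the 
example, not an existing Drill or Hadoop API. The point is only that the 
serialized config carries a reference, and the real value is looked up at the 
moment a connection is opened.

```python
from typing import Protocol


class CredentialVault(Protocol):
    """Hypothetical generic API: map a vault key to the real credential."""

    def resolve(self, key: str) -> str:
        ...


class InMemoryVault:
    """Stand-in backend for the sketch; a real one would call a secret store."""

    def __init__(self, secrets: dict) -> None:
        self._secrets = dict(secrets)

    def resolve(self, key: str) -> str:
        return self._secrets[key]


def connection_properties(config: dict, vault: CredentialVault) -> dict:
    """Swap the vault reference for the real password just before connecting."""
    props = {k: v for k, v in config.items() if k != "passwordKey"}
    props["password"] = vault.resolve(config["passwordKey"])
    return props


# The plugin config, and therefore any plan or log that serializes it,
# holds only the reference "jdbc/mysql/alice" (an assumed key format).
plugin_config = {
    "type": "jdbc",
    "username": "alice",
    "passwordKey": "jdbc/mysql/alice",
}
vault = InMemoryVault({"jdbc/mysql/alice": "s3cret"})

props = connection_properties(plugin_config, vault)
```

With this shape the plan can be serialized to every worker and shown in 
EXPLAIN PLAN FOR output without any masking step, because the secret never 
appears in the config.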

Thanks,
- Paul

 


  

[DISCUSS]: Masking Creds in Query Plans

2020-04-17 Thread Charles Givre
Hello all, 
I was thinking about this: if a user executes an EXPLAIN PLAN FOR query, they 
get a lot of information about the storage plugin, including, in some cases, 
creds.
The example below shows a query plan for the JDBC storage plugin. As you can 
see, the user creds are right there.

I'm wondering whether it would be advisable or possible to mask the creds in 
query plans so that users can't access this information. If masking isn't an 
option, is there some other way to prevent users from seeing this information? 
In a multi-tenant environment, it seems like a rather large security hole. 
Thanks,
-- C


{
  "head" : {
    "version" : 1,
    "generator" : {
      "type" : "ExplainHandler",
      "info" : ""
    },
    "type" : "APACHE_DRILL_PHYSICAL",
    "options" : [ ],
    "queue" : 0,
    "hasResourcePlan" : false,
    "resultMode" : "EXEC"
  },
  "graph" : [ {
    "pop" : "jdbc-scan",
    "@id" : 5,
    "sql" : "SELECT *\nFROM `stats`.`batting`",
    "columns" : [ "`playerID`", "`yearID`", "`stint`", "`teamID`", "`lgID`",
      "`G`", "`AB`", "`R`", "`H`", "`2B`", "`3B`", "`HR`", "`RBI`", "`SB`",
      "`CS`", "`BB`", "`SO`", "`IBB`", "`HBP`", "`SH`", "`SF`", "`GIDP`" ],
    "config" : {
      "type" : "jdbc",
      "driver" : "com.mysql.cj.jdbc.Driver",
      "url" : "jdbc:mysql://localhost:3306/?serverTimezone=EST5EDT",
      "username" : "",
      "password" : "",
      "caseInsensitiveTableNames" : false,
      "sourceParameters" : { },
      "enabled" : true
    },
    "userName" : "",
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 100.0
    }
  }, {
    "pop" : "limit",
    "@id" : 4,
    "child" : 5,
    "first" : 0,
    "last" : 10,
    "initialAllocation" : 100,
    "maxAllocation" : 100,
    "cost" : {
      "memoryCost" : 1.6777216E7,
      "outputRowCount" : 10.0
    }
  }, {
    "pop" : "limit",
    "@id" : 3,