jihoonson commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r610919733
##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +322,31 @@ As an alternative to using the basic metadata
authenticator, as shown in the pre
Congratulations, you have configured permissions for user-assigned roles in
Druid!
+
+
+## Druid security trust model
+Within Druid's trust model there users can have different authorization levels:
+- Users with resource write permissions can are allowed to anything that the
druid process can do.
+- Authenticated read only users can execute queries against resources to which
they have permissions.
+- An authenticated user without any permissions is allowed to execute queries
that don't require access to a resource.
+
+Additionally, Druid operates according to the following principles:
+
+From the inner most layer:
+1. Druid processes have the same access to the local files granted to the
specified system user running the process.
+2. The Druid ingestion system can create new processes to execute tasks. Those
tasks inherit the user of their parent process. This means that any user
authorized to submit an ingestion task can use the ingestion task permissions
to read or write any local files or external resources that the Druid process
has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users
because they can act as the Druid process.
Review comment:
Both this note and [this
figure](https://github.com/apache/druid/pull/11016/files#diff-3e8eb443238c8a04b52e7691033cfa2b8bd133611434b548c5cd9eaa3a8a72c3R189)
talk about the permission to submit ingestion tasks. That wording seems
ambiguous to me. It might be better to say the `DATASOURCE WRITE` permission
instead.
##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +24,112 @@ title: "Security overview"
-->
-## Overview
+
+This document provides an overview of Apache Druid security features,
configuration instructions, and some best practices to secure Druid.
By default, security features in Druid are disabled, which simplifies the
initial deployment experience. However, security features must be configured in
a production deployment. These features include TLS, authentication, and
authorization.
-To implement Druid security, you configure authenticators and authorizers.
Authenticators control the way user identities are verified, while authorizers
map the authenticated users (via user roles) to the datasources they are
permitted to access. Consequently, implementing Druid security also involves
considering your datasource scheme, since that scheme represents the
granularity at which data access permissions are allocated.
-The following graphic depicts the course of request through the authentication
process:
+## Best practices
+The following recommendations apply to the Druid cluster setup:
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+ > **WARNING!** \
+ Druid administrators have the same OS permissions as the Unix user account
running Druid. See [Authentication and authorization
model](security-user-auth.md#authentication-and-authorization-model). If the
Druid process is running under the OS root user account, then Druid
administrators can read or write all files that the root account has access to,
including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and
other environments that can be accessed by untrusted networks.
+* Enable authorization and do not expose the Druid Console without
authorization enabled. If authorization is not enabled, any user that has
access to the web console has the same privileges as the operating system user
that runs the Druid Console process.
+* Grant users the minimum permissions necessary to perform their functions.
For instance, do not allow users who only need to query data to write to data
sources or view state.
+* * Disable JavaScript, as noted in the [Security
section](https://druid.apache.org/docs/latest/development/javascript.html#security)
of the JavaScript guide.
+
+The following recommendations apply the network where Druid runs:
+* Enable TLS to encrypt communication within the cluster.
+* Use an API gateway to:
+ - Restrict access from untrusted networks
+ - Create an allow list of specific APIs that your users need to access
+ - Implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose
Druid services and ports specifically required for your use case. For example,
only expose Broker ports to downstream applications that execute queries. You
can limit access to a specific IP address or IP range to further tighten and
enhance security.
+
+The following recommendation applies to Druids authorization and
authentication model:
+* Only grant `WRITE` permissions to any `DATASOURCE` to trusted users. Druid's
trust model assumes those users have the same privileges as the operating
system user that runs the Druid Console process.
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions
to highly-trusted users. These permissions allows users to access resources on
behalf of the Druid server process regardless of the datasource.
Review comment:
I think `CONFIG WRITE` should be an administrator-level permission too, since
you can use it to update dynamic system configs such as lookups. I'm not sure
about `STATE READ`, though. What bad things can happen when a malicious user
has the `STATE READ` permission?
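To make the distinction concrete, here is a minimal sketch of how a role's permission list could be checked for the administrator-level permissions discussed in this thread. It assumes the `resource`/`action` JSON shape used by the basic security extension's role permissions. The `ADMIN_LEVEL` set and the `admin_level_permissions` helper are hypothetical names for illustration, not part of Druid, and `STATE READ` is left out because its risk is still an open question here.

```python
# Resource-type/action pairs treated as administrator-level in this sketch.
# Assumption: this set comes from the review discussion, not from an official Druid list.
ADMIN_LEVEL = {
    ("DATASOURCE", "WRITE"),
    ("STATE", "WRITE"),
    ("CONFIG", "WRITE"),
}


def admin_level_permissions(permissions):
    """Return the permissions whose (resource type, action) pair is in ADMIN_LEVEL."""
    return [
        p for p in permissions
        if (p["resource"]["type"], p["action"]) in ADMIN_LEVEL
    ]


# Hypothetical role: read access to some datasources plus CONFIG WRITE.
example_role = [
    {"resource": {"name": "wiki.*", "type": "DATASOURCE"}, "action": "READ"},
    {"resource": {"name": ".*", "type": "CONFIG"}, "action": "WRITE"},
]

for permission in admin_level_permissions(example_role):
    print("grant only to trusted users:", permission)
```

A role that looks read-only at the datasource level would still be flagged by this check because of its `CONFIG WRITE` entry, which is the case I have in mind.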
##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +24,112 @@ title: "Security overview"
-->
-## Overview
+
+This document provides an overview of Apache Druid security features,
configuration instructions, and some best practices to secure Druid.
By default, security features in Druid are disabled, which simplifies the
initial deployment experience. However, security features must be configured in
a production deployment. These features include TLS, authentication, and
authorization.
-To implement Druid security, you configure authenticators and authorizers.
Authenticators control the way user identities are verified, while authorizers
map the authenticated users (via user roles) to the datasources they are
permitted to access. Consequently, implementing Druid security also involves
considering your datasource scheme, since that scheme represents the
granularity at which data access permissions are allocated.
-The following graphic depicts the course of request through the authentication
process:
+## Best practices
Review comment:
Suggest linking these configs somewhere in this section.
- Ingestion security configs:
https://github.com/apache/druid/blob/master/docs/configuration/index.md#ingestion-security-configuration
- JDBC connections security configs:
https://github.com/apache/druid/blob/master/docs/configuration/index.md#jdbc-connections-to-external-databases
##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +322,31 @@ As an alternative to using the basic metadata
authenticator, as shown in the pre
Congratulations, you have configured permissions for user-assigned roles in
Druid!
+
+
+## Druid security trust model
+Within Druid's trust model there users can have different authorization levels:
+- Users with resource write permissions can are allowed to anything that the
druid process can do.
Review comment:
Should `can` also be eliminated? Like, `Users with resource write
permissions are allowed to do`