Sorabh Hamirwasia created DRILL-5501:
----------------------------------------

             Summary: Improve the negotiation of max_wrapped_size for 
encryption.
                 Key: DRILL-5501
                 URL: https://issues.apache.org/jira/browse/DRILL-5501
             Project: Apache Drill
          Issue Type: Improvement
            Reporter: Sorabh Hamirwasia
            Assignee: Sorabh Hamirwasia
             Fix For: Future


With 1.11, Drill will support encryption using the SASL framework. As part of 
encryption negotiation, SASL exposes a number of parameters such as QOP, 
strength, maxbuffer and rawsendsize. Details on these parameters can be 
found 
[here|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/javax/security/sasl/Sasl.java#Sasl].
 This JIRA specifically concerns the _maxbuffer_ and _rawsendsize_ 
parameters.

*rawsendsize* is the maximum plain text size that an application should pass to 
the wrap function of a mechanism so that the encoded buffer it produces does 
not exceed *maxbuffer*. The application retrieves it after negotiation of 
_maxbuffer_ is complete.
*maxbuffer* is the maximum size of an encoded buffer that the client/server 
side agrees to receive. It is configurable in Drill through the 
*encryption.sasl.max_wrapped_size* setting for client and bit-to-bit 
connections. This parameter is global for all the configured mechanisms. As an 
optimization, each connection's SaslDecryptionHandler also uses this setting to 
pre-allocate a buffer of that size. Since no encrypted chunk will exceed the 
configured value, the same buffer can be re-used to copy each encrypted chunk 
from the wire and decrypt it, instead of creating a new buffer every time a 
message is received. Since GSSAPI (Kerberos) is currently the only mechanism 
that Drill supports with encryption, having this global parameter is fine. But 
if more mechanisms are supported in future, it can become an issue when a 
mechanism doesn't support negotiation of this parameter and instead defines it 
internally as a fixed value.
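
The buffer re-use optimization described above can be sketched roughly as 
follows. This is a minimal illustration, not Drill's SaslDecryptionHandler: the 
class name ReusableChunkBuffer and its methods are hypothetical, and it uses a 
plain byte array and java.nio.ByteBuffer in place of whatever buffer types the 
real handler works with.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of a per-connection buffer sized once from max_wrapped_size. */
final class ReusableChunkBuffer {
  private final byte[] buffer;

  ReusableChunkBuffer(int maxWrappedSize) {
    // Allocated once per connection, then re-used for every encrypted chunk.
    this.buffer = new byte[maxWrappedSize];
  }

  /** Copies one encrypted chunk from the wire into the re-usable buffer and returns it. */
  byte[] fill(ByteBuffer wire, int chunkLength) {
    if (chunkLength < 0 || chunkLength > buffer.length) {
      // A peer that honors the negotiated maxbuffer never sends a larger chunk.
      throw new IllegalArgumentException(
          "chunk of " + chunkLength + " bytes exceeds buffer of " + buffer.length);
    }
    wire.get(buffer, 0, chunkLength);
    return buffer;
  }

  int capacity() {
    return buffer.length;
  }
}
```

The returned array and chunkLength could then be handed to 
saslClient/saslServer.unwrap(buffer, 0, chunkLength). The point of the sketch 
is only that the allocation size is fixed up front, which is why an oversized 
configured value costs memory on every connection.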

As per [SASL RFC|https://tools.ietf.org/html/rfc4422#section-3.7]:
_The maximum size that each side expects is fixed by the mechanism, either 
through negotiation or by its specification_

This means the parameter can either be negotiated or be fixed by the 
mechanism. Consider a case where the parameter is configured to 1MB and two 
mechanisms are configured: {kerberos, custom}. The custom mechanism defines a 
fixed value of 64K for this parameter, whereas kerberos can negotiate a 1MB 
size (since the max allowed by GSSAPI is 16MB). Each connection will then have 
a pre-defined buffer of 1MB allocated in its SaslDecryptionHandler. For a 
connection using the custom mechanism this wastes memory, since the maximum 
encoded buffer it will ever receive is 64K. To resolve this issue, the 
following solution is proposed:

1) Use the Drill configuration _max_wrapped_size_ as the global value of the 
_maxbuffer_ parameter for all mechanisms that support negotiation. For 
mechanisms that have their own pre-defined value of _maxbuffer_, the configured 
value will be ignored.
2) In Drill we implement a factory, like KerberosFactory / PlainFactory, for 
each supported mechanism. Each factory will be aware of the behavior of its 
underlying mechanism and use the configured value accordingly, i.e. either 
apply it with all the bounds checking or ignore it entirely. For example: 
* Kerberos factory will know that it supports negotiation of _maxbuffer_ up to 
a max value of 16MB. So it can use the Drill configured value and perform the 
bounds check before setting it in the SASL layer (i.e. when 
saslClient/saslServer are created for negotiation).
* Custom factory will ignore this configuration value, since its underlying 
mechanism has a fixed value of _maxbuffer_, and will use that instead.

3) Once the SASL layer is created, negotiation for the connection will happen 
based on the chosen mechanism. After negotiation completes, Drill can retrieve 
the value of *maxbuffer* and the corresponding *rawsendsize* using 
saslClient/saslServer.getNegotiatedProperty() and set them in the 
EncryptionContext instance of that connection.
4) I didn't find that the value of the *maxbuffer* parameter is updated 
internally by the mechanism implementation based on negotiation (I looked at 
GSSAPI). So it looks to me that the mechanism expects the application to pass 
a correct value within bounds. Hence the configured value needs to be 
bounds-checked in the corresponding factory (as mentioned in step 2), so that 
when the parameter value is retrieved after negotiation the connection gets 
the correct value in its EncryptionContext.
5) Later, when security handlers are added to each connection, its 
SaslDecryptionHandler will use the buffer size in EncryptionContext (which was 
updated after negotiation) to allocate the buffer.
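
Steps 2 and 3 could look roughly like the sketch below. None of the types here 
are Drill's actual classes: MaxBufferSketch, MaxBufferPolicy, NegotiatingPolicy 
and FixedPolicy are hypothetical names, the 16MB bound is the figure quoted in 
this issue, and rejecting an out-of-bounds value (rather than clamping it) is 
an assumption. Only the javax.security.sasl constants and 
getNegotiatedProperty() are real JDK API.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;

/** Illustrative only: hypothetical names, not Drill's factories. */
final class MaxBufferSketch {
  /** Upper bound for GSSAPI as quoted in this issue (16MB). */
  static final int KERBEROS_MAX_BUFFER = 16 * 1024 * 1024;

  /** Step 2: each mechanism decides how to treat the configured value. */
  interface MaxBufferPolicy {
    int effectiveMaxBuffer(int configuredMaxWrappedSize);
  }

  /** For mechanisms that negotiate maxbuffer: honor the config, within bounds. */
  static final class NegotiatingPolicy implements MaxBufferPolicy {
    private final int upperBound;

    NegotiatingPolicy(int upperBound) {
      this.upperBound = upperBound;
    }

    @Override
    public int effectiveMaxBuffer(int configured) {
      if (configured <= 0 || configured > upperBound) {
        // Assumption: reject rather than clamp an out-of-bounds value.
        throw new IllegalArgumentException(
            "max_wrapped_size " + configured + " is outside (0, " + upperBound + "]");
      }
      return configured;
    }
  }

  /** For mechanisms with a fixed maxbuffer: ignore the configured value. */
  static final class FixedPolicy implements MaxBufferPolicy {
    private final int fixedSize;

    FixedPolicy(int fixedSize) {
      this.fixedSize = fixedSize;
    }

    @Override
    public int effectiveMaxBuffer(int configured) {
      return fixedSize;
    }
  }

  /** Properties to pass when the saslClient/saslServer is created. */
  static Map<String, String> saslProperties(MaxBufferPolicy policy, int configured) {
    Map<String, String> props = new HashMap<>();
    props.put(Sasl.QOP, "auth-conf"); // authentication plus confidentiality
    props.put(Sasl.MAX_BUFFER, Integer.toString(policy.effectiveMaxBuffer(configured)));
    return props;
  }

  /** getNegotiatedProperty() returns an Object; parse its string form. */
  static int parseNegotiated(Object value) {
    if (value == null) {
      throw new IllegalStateException("property was not negotiated");
    }
    return Integer.parseInt(value.toString());
  }

  /** Step 3: read maxbuffer back once negotiation has completed. */
  static int negotiatedMaxBuffer(SaslClient client) {
    return parseNegotiated(client.getNegotiatedProperty(Sasl.MAX_BUFFER));
  }
}
```

With a configured value of 1MB, a NegotiatingPolicy bounded at 16MB passes 1MB 
through to the SASL layer, while a FixedPolicy of 64K yields 64K regardless of 
the configuration. The same read-back would apply to Sasl.RAW_SEND_SIZE and to 
a SaslServer.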

The above solution resolves the issue in the example discussed before. The 
kerberos mechanism will now negotiate a 1MB max encoded buffer size, since 
that is within its max bound of 16MB, whereas the custom mechanism will ignore 
the configured value and use the fixed size of 64K defined by the mechanism. 
Once SASL negotiation completes, the connection using Kerberos will set its 
EncryptionContext.maxWrappedSize to 1MB and the connection using the custom 
mechanism will set its EncryptionContext.maxWrappedSize to 64K. The 
SaslDecryptionHandler of each connection will then allocate a buffer of the 
corresponding size, so no memory is wasted.

With the above approach we achieve the following:
* A global config value for mechanisms that have negotiating capability.
* Custom mechanisms that don't support negotiating the maxbuffer value no 
longer waste memory in SaslDecryptionHandler but still use the optimization.
* Different connections using different mechanisms will have different memory 
footprints for the re-usable buffer.
** For this we can have a counter for the total memory in use by 
SaslDecryptionHandler re-usable buffers for each connection type 
(user/control/data) across all such connections.
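
One possible shape for such a counter is sketched below. The class 
DecryptBufferAccounting, its ConnectionType enum and its method names are 
hypothetical and only illustrate the idea of tracking bytes per connection 
type; how it would hook into Drill's handlers or metrics is left open.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-connection-type accounting of re-usable decrypt buffers. */
final class DecryptBufferAccounting {
  enum ConnectionType { USER, CONTROL, DATA }

  // Populated once in the constructor and never modified structurally afterwards,
  // so concurrent updates only touch the AtomicLong values.
  private final Map<ConnectionType, AtomicLong> bytesInUse =
      new EnumMap<>(ConnectionType.class);

  DecryptBufferAccounting() {
    for (ConnectionType type : ConnectionType.values()) {
      bytesInUse.put(type, new AtomicLong());
    }
  }

  /** Call when a connection's handler allocates its re-usable buffer. */
  void onBufferAllocated(ConnectionType type, int size) {
    bytesInUse.get(type).addAndGet(size);
  }

  /** Call when the connection closes and the buffer is released. */
  void onBufferReleased(ConnectionType type, int size) {
    bytesInUse.get(type).addAndGet(-size);
  }

  /** Total bytes currently held across all connections of the given type. */
  long totalBytes(ConnectionType type) {
    return bytesInUse.get(type).get();
  }
}
```

Each handler would report the size it took from its EncryptionContext, so a 
Kerberos connection contributes its negotiated size and a fixed-size mechanism 
contributes only its own.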



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
