[jira] [Updated] (DRILL-4335) Apache Drill should support network encryption

2017-05-20 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-4335:
-
Labels: ready-to-commit security  (was: security)

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: ready-to-commit, security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-5508) Flow control in Drill RPC layer

2017-05-12 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5508:


 Summary: Flow control in Drill RPC layer
 Key: DRILL-5508
 URL: https://issues.apache.org/jira/browse/DRILL-5508
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - RPC
Reporter: Sorabh Hamirwasia


Drill uses Netty to implement its RPC layer. Netty internally has a 
_ChannelOutboundBuffer_ where it stores all the data sent by the application 
when the TCP send buffer is full. Netty also has the configurable 
_WRITE_BUFFER_HIGH_WATER_MARK_ and _WRITE_BUFFER_LOW_WATER_MARK_ thresholds, 
which indicate when the send buffer is full and when it can accept more data. 
The channel's writability is toggled based on these parameters, which the 
application can use to make smart decisions. More information can be found 
[here|https://netty.io/4.1/api/io/netty/channel/WriteBufferWaterMark.html]. 
Together these can be used to implement flow control in Drill. Today the only 
flow control Drill has is based on the number of batches sent without an ack 
(which is 3), which doesn't consider how much data is transferred as part of 
those batches. Without proper watermark-based flow control, Drill simply 
overwhelms the pipeline.

With SASL encryption support in Drill 1.11, a new SaslEncryptionHandler is 
inserted into the Drill channel pipeline. This handler takes the Drill 
ByteBuf, encrypts it, and stores the encrypted buffer (>= the original buffer) 
in another ByteBuf. Memory consumption is therefore doubled until the next 
handler in the pipeline runs and the original buffer is released. There is a 
risk that if multiple connections (say N) happen to encrypt large data buffers 
(say of size D) at the same time, each will double its memory consumption at 
that instant, for a total of Mc = N*2D. The same can happen even without 
encryption when the connection count is doubled (i.e. 2N connections each 
transferring D bytes of data). In a memory-constrained environment this can be 
an issue if Mc is too large.

To resolve both scenarios, flow control is needed in the Drill RPC layer. 
Basically, we can configure high/low watermarks (as a percentage of the 
ChannelOutboundBuffer) and the ChannelOutboundBuffer size (a multiple of the 
chunk size) for Drill channels. Then the application thread, which today 
writes the entire message in one go, needs to split the message into smaller 
chunks (possibly of configurable size). Based on the channel's write state, 
one or more chunks are written to the socket. If the channel's writable state 
is false, the application thread blocks until it is notified of a state 
change, at which point it can send more chunks downstream. This way we achieve 
the following:
1) When encryption is disabled, Netty's ChannelOutboundBuffer will not be 
overwhelmed; there will always be a steady stream of data to send over the 
network.
2) When encryption is enabled, we always send smaller chunks into the pipeline 
for encryption rather than the entire data buffer. Memory is then doubled in 
smaller units, causing less memory pressure.
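The chunking step described above can be sketched with a minimal, hypothetical helper (plain arrays here; Drill's actual RPC layer would operate on Netty ByteBufs and gate each chunk write on channel.isWritable()):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split an outbound message into bounded chunks so the
// pipeline (and any encryption handler) only ever sees chunkSize bytes at once.
public class Chunker {
    static List<byte[]> chunk(byte[] message, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < message.length; off += chunkSize) {
            int len = Math.min(chunkSize, message.length - off);
            byte[] c = new byte[len];
            System.arraycopy(message, off, c, 0, len);
            chunks.add(c);
        }
        return chunks;
    }
}
```

A writer thread would then dequeue one chunk at a time, writing only while the channel reports itself writable and parking otherwise.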

Note: This is just a high-level description of the problem and a potential 
solution. More research/prototyping is needed to arrive at a proper solution.





[jira] [Created] (DRILL-5501) Improve the negotiation of max_wrapped_size for encryption.

2017-05-10 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5501:


 Summary: Improve the negotiation of max_wrapped_size for 
encryption.
 Key: DRILL-5501
 URL: https://issues.apache.org/jira/browse/DRILL-5501
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia
 Fix For: Future


With 1.11, Drill will support encryption using the SASL framework. As part of 
encryption negotiation, SASL exposes several parameters such as QOP, strength, 
maxbuffer, and rawsendsize. Details on these parameters can be found 
[here|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/javax/security/sasl/Sasl.java#Sasl].
 This JIRA is specifically about the _maxbuffer_ and _rawsendsize_ 
parameters.

*rawsendsize* is the maximum plain-text size that the application should pass 
to a mechanism's wrap function so that the encoded buffer it produces does not 
exceed the *maxbuffer* size. The application retrieves it after _maxbuffer_ 
has been negotiated.
*maxbuffer* is the maximum received (encoded) buffer size that the 
client/server side agrees to receive. It is configurable in Drill via the 
*encryption.sasl.max_wrapped_size* setting for client and bit-to-bit 
connections. This parameter is global across all configured mechanisms. As an 
optimization, each connection's SaslDecryptionHandler also uses this setting 
to pre-allocate a buffer of that size. Since no encrypted chunk will exceed 
the configured value, the same buffer can be reused to copy each encrypted 
chunk off the wire and decrypt it, instead of allocating a buffer every time a 
message is received. Since GSSAPI (i.e. Kerberos) is currently the only 
mechanism Drill supports with encryption, having this global parameter is 
fine. But if more mechanisms are supported in the future, it can become an 
issue when a mechanism doesn't negotiate this parameter and instead defines it 
internally as a fixed value.
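The retrieval path can be illustrated with the JDK SASL API. This is only a sketch of the API shape: PLAIN is used so the snippet runs without a KDC, but PLAIN negotiates no security layer, so the negotiated size properties may come back null; with GSSAPI they would be populated after authentication completes. The protocol/server names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

public class NegotiatedSizes {
    public static SaslClient newPlainClient() throws SaslException {
        // Request a maxbuffer of 1 MB before negotiation starts.
        Map<String, Object> props =
                Collections.singletonMap(Sasl.MAX_BUFFER, "1048576");
        CallbackHandler handler = (Callback[] callbacks) -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName("user");
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword("secret".toCharArray());
                }
            }
        };
        return Sasl.createSaslClient(new String[]{"PLAIN"}, null,
                "drill", "localhost", props, handler);
    }

    public static void main(String[] args) throws Exception {
        SaslClient client = newPlainClient();
        client.evaluateChallenge(new byte[0]); // PLAIN completes in one step
        // After a real (e.g. GSSAPI) negotiation these come back as numeric
        // strings; for PLAIN they may be null since no layer was negotiated.
        System.out.println(client.getNegotiatedProperty(Sasl.MAX_BUFFER));
        System.out.println(client.getNegotiatedProperty(Sasl.RAW_SEND_SIZE));
    }
}
```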

As per [SASL RFC|https://tools.ietf.org/html/rfc4422#section-3.7]:
_The maximum size that each side expects is fixed by the mechanism, either 
through negotiation or by its specification_

This means the parameter can either be negotiated or fixed by the mechanism. 
So suppose the parameter is configured to a value of 1MB and two mechanisms 
are configured: {kerberos, custom}. The custom mechanism fixes this parameter 
at 64K, whereas Kerberos can negotiate a 1MB size (since the maximum allowed 
by GSSAPI is 16MB). Each connection will then have a 1MB buffer pre-allocated 
in its SaslDecryptionHandler. For a connection using the custom mechanism this 
wastes memory, since the largest encoded buffer it will ever receive is 64K. 
To resolve this issue, the following solution is proposed:

1) Use the Drill configuration _max_wrapped_size_ as the global value of the 
_maxbuffer_ parameter for all mechanisms that support negotiation. For 
mechanisms that have their own pre-defined _maxbuffer_ value, the configured 
value will be ignored.
2) In Drill we implement a factory (like KerberosFactory / PlainFactory) for 
each supported mechanism. Each factory will be aware of the behavior of its 
underlying mechanism and use the configured value accordingly, i.e. with 
bounds checking or by ignoring it entirely. For example:
* The Kerberos factory knows that it supports negotiating _maxbuffer_ up to a 
maximum of 16MB, so it can take the Drill-configured value and perform the 
bound check before setting it in the SASL layer (i.e. when 
saslClient/saslServer are created for negotiation).
* A custom factory would ignore the configured value, since its underlying 
mechanism fixes _maxbuffer_ and will use that instead.
3) Once the SASL layer is created, negotiation for the connection happens 
based on the chosen mechanism. After negotiation completes, Drill can retrieve 
the values of *maxbuffer* and the corresponding *rawsendsize* using 
saslClient/saslServer.getNegotiatedProperty() and set them in the connection's 
EncryptionContext instance.
4) I did not find that the mechanism implementations (I looked at GSSAPI) 
internally update the *maxbuffer* value based on negotiation, so it appears 
the mechanism expects the application to pass a correct value within bounds. 
Hence the bound check on the configured value in the corresponding factory 
(step 2) is needed, so that when the parameter is retrieved after negotiation 
the connection gets the correct value in its EncryptionContext.
5) Later, when security handlers are added to each connection, the 
corresponding SaslDecryptionHandler will use the buffer size in its 
EncryptionContext (which was updated after 

[jira] [Updated] (DRILL-4335) Apache Drill should support network encryption

2017-05-10 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-4335:
-
Labels: security  (was: ready-to-commit security)

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Created] (DRILL-5485) Remove WebServer dependency on DrillClient

2017-05-07 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5485:


 Summary: Remove WebServer dependency on DrillClient
 Key: DRILL-5485
 URL: https://issues.apache.org/jira/browse/DRILL-5485
 Project: Apache Drill
  Issue Type: Improvement
  Components: Web Server
Reporter: Sorabh Hamirwasia
 Fix For: 1.11.0


With encryption support using SASL, clients won't be able to authenticate 
using the PLAIN mechanism when encryption is enabled on the cluster. Today the 
WebServer embedded inside the Drillbit creates a DrillClient instance for each 
WebClient session, and the WebUser is authenticated via the PLAIN mechanism as 
part of the authentication between that DrillClient instance and the Drillbit. 
With encryption enabled this will fail, since encryption doesn't support 
authentication using the PLAIN mechanism, so no WebClient can connect to a 
Drillbit. This approach also has the following issues:
1) Since a DrillClient is used per WebUser session, it is expensive: each 
instance carries the heavyweight RPC layer of DrillClient and all its 
dependencies.
2) If a node other than the local one is selected as the Foreman for a 
WebUser, there is an extra hop to transfer data back to the WebClient.
To resolve the above issues it would be better to authenticate the WebUser 
locally on the Drillbit running the WebServer, without creating a DrillClient 
instance. We can use the local PAM authenticator to authenticate the user. 
After successful authentication, the local Drillbit can also serve as the 
Foreman for all queries submitted by the WebUser, by submitting the query to 
the local Drillbit's Foreman work queue. This also removes the need to encrypt 
the channel between the WebServer (DrillClient) and a selected Drillbit, since 
with this approach no physical channel is opened between them.





[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-25 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983672#comment-15983672
 ] 

Sorabh Hamirwasia commented on DRILL-4335:
--

Thanks! I will add you and the other reviewers to the pull request for the C++ 
changes.

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-25 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983598#comment-15983598
 ] 

Sorabh Hamirwasia commented on DRILL-4335:
--

[~laurentgo] - Hi Laurent, did you get a chance to look at the changes? It 
would be great if you could help review the pull request.

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-11 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964998#comment-15964998
 ] 

Sorabh Hamirwasia commented on DRILL-4335:
--

[~laurentgo]
I have posted the Drill C++ client changes as well. Can you please help review 
them?

Pull Request:
https://github.com/apache/drill/pull/809


> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Updated] (DRILL-5415) Improve Fixture Builder to configure client properties and keep collection type properties for server

2017-04-10 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5415:
-
Labels: ready-to-commit  (was: )

> Improve Fixture Builder to configure client properties and keep collection 
> type properties for server
> -
>
> Key: DRILL-5415
> URL: https://issues.apache.org/jira/browse/DRILL-5415
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.11.0
>
>
> There are two improvements made as part of this pull request.
> 1) The Fixture Builder framework converts all the Drillbit config properties 
> into string type, but certain authentication configurations (like 
> auth.mechanism) are expected to be of list type, so they fail the type 
> check. Change it to keep collection-typed config values as-is and insert 
> them after the string-typed values are inserted.
> 2) When the Fixture Builder framework builds, it tries to apply any system / 
> session options (if set), for which it creates a default client. With a 
> cluster enabled for authentication, this default client does not provide any 
> connection parameters for authentication and fails to connect. Allow the 
> Fixture Builder to also accept client-related properties so they can be used 
> when creating the default client that connects to the cluster.





[jira] [Created] (DRILL-5415) Improve Fixture Builder to configure client properties and keep collection type properties for server

2017-04-05 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5415:


 Summary: Improve Fixture Builder to configure client properties 
and keep collection type properties for server
 Key: DRILL-5415
 URL: https://issues.apache.org/jira/browse/DRILL-5415
 Project: Apache Drill
  Issue Type: Improvement
  Components: Tools, Build & Test
Affects Versions: 1.11.0
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia
Priority: Minor
 Fix For: 1.11.0


There are two improvements made as part of this pull request.
1) The Fixture Builder framework converts all the Drillbit config properties 
into string type, but certain authentication configurations (like 
auth.mechanism) are expected to be of list type, so they fail the type check. 
Change it to keep collection-typed config values as-is and insert them after 
the string-typed values are inserted.
2) When the Fixture Builder framework builds, it tries to apply any system / 
session options (if set), for which it creates a default client. With a 
cluster enabled for authentication, this default client does not provide any 
connection parameters for authentication and fails to connect. Allow the 
Fixture Builder to also accept client-related properties so they can be used 
when creating the default client that connects to the cluster.
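Item 1 could be sketched roughly as follows (a hypothetical helper, not the actual Fixture Builder code): scalar values are stringified, while collection-typed values pass through untouched so later type checks see a list where a list is expected.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of item 1: stringify scalar config values but pass
// collection-typed values (e.g. a list of auth mechanisms) through unchanged.
public class ConfigMerge {
    static Map<String, Object> normalize(Map<String, Object> raw) {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, Object> e : raw.entrySet()) {
            Object v = e.getValue();
            // Collections keep their type; everything else becomes a String.
            out.put(e.getKey(), v instanceof Collection ? v : String.valueOf(v));
        }
        return out;
    }
}
```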





[jira] [Updated] (DRILL-4335) Apache Drill should support network encryption

2017-04-03 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-4335:
-
Attachment: (was: ApacheDrillEncryptionUsingSASLDesign.pdf)

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Updated] (DRILL-4335) Apache Drill should support network encryption

2017-04-03 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-4335:
-
Attachment: ApacheDrillEncryptionUsingSASLDesign.pdf

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf, 
> ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-03 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15953091#comment-15953091
 ] 

Sorabh Hamirwasia commented on DRILL-4335:
--

Updated the pull request and design document based on recent findings 
regarding the protocol mismatch between the Cyrus SASL and Java SASL 
implementations (Note 8 in the document discusses it).

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf, 
> ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Updated] (DRILL-5316) C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children completed with ZOK

2017-03-09 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5316:
-
Labels: ready-to-commit  (was: )

> C++ Client Crashes When drillbitsVector.count is 0 after zoo_get_children 
> completed with ZOK
> 
>
> Key: DRILL-5316
> URL: https://issues.apache.org/jira/browse/DRILL-5316
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - C++
>Reporter: Rob Wu
>Priority: Critical
>  Labels: ready-to-commit
>
> When connecting to a drillbit through ZooKeeper, the C++ client would 
> occasionally crash without any apparent reason.
> A further look into the code revealed that during the call 
> rc = zoo_get_children(p_zh.get(), m_path.c_str(), 0, &drillbitsVector); 
> zoo_get_children returns ZOK (0) but drillbitsVector.count is 0.
> This leaves drillbits empty, and thus 
> err = zook.getEndPoint(drillbits[drillbits.size() - 1], endpoint); 
> crashes.
> A size check should be done to prevent this from happening.





[jira] [Updated] (DRILL-4335) Apache Drill should support network encryption

2017-03-09 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-4335:
-
Attachment: ApacheDrillEncryptionUsingSASLDesign.pdf

Design document for encryption support in Drill using the SASL framework.

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-03-09 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15904070#comment-15904070
 ] 

Sorabh Hamirwasia commented on DRILL-4335:
--

[~laurentgo],
Yes, there are multiple copies (~3) involved here; a summary is below. I am 
not sure there is any way to avoid them unless we use a heap array for the 
Drill ByteBuf as well.
1) Converting the payload to encrypt from the Drill ByteBuf, which is in 
direct memory, to a byte array on the heap.
2) A copy inside the wrap/unwrap method, which internally allocates a new byte 
array to copy the provided input.
3) Copying the encrypted output byte array back into a Drill ByteBuf for 
transfer over the network.
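The three copies can be illustrated with plain NIO buffers. This is only a sketch: the wrap step is replaced by an identity copy, since a real saslClient.wrap() needs a completed negotiation, and the names are illustrative.

```java
import java.nio.ByteBuffer;

// Sketch of the three copies described above. The "wrap" is a stand-in
// (identity copy) for saslClient.wrap(), which also copies internally;
// no real encryption happens here.
public class CopyFlow {
    static ByteBuffer sendPath(byte[] payload) {
        ByteBuffer direct = ByteBuffer.allocateDirect(payload.length);
        direct.put(payload).flip();

        byte[] heapIn = new byte[direct.remaining()];
        direct.get(heapIn);              // copy 1: direct memory -> heap array

        byte[] wrapped = heapIn.clone(); // copy 2: inside wrap() (stand-in)

        ByteBuffer out = ByteBuffer.allocateDirect(wrapped.length);
        out.put(wrapped).flip();         // copy 3: heap array -> direct, for the wire
        return out;
    }
}
```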

We will share an estimate/benchmark quantifying the impact on throughput once 
available. Netty's SSL/TLS would have the same impact, since its internal 
implementation also uses the JDK's wrap/unwrap methods, which involve the same 
amount of copying. We are planning to provide SSL support in the future too. 
SASL is mainly focused on the case where a Kerberos setup exists: if the user 
wants privacy on the channel along with Kerberos authentication, then 
encryption using SASL helps there. Sorry for the delay, but I have finally 
updated the design document to reflect the current implementation and am 
attaching it for review.

Note: This pull request doesn't include the C++ client-side changes, which I 
am planning to post as a separate pull request.

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Assigned] (DRILL-4335) Apache Drill should support network encryption

2017-03-06 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia reassigned DRILL-4335:


Assignee: Sorabh Hamirwasia

> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Created] (DRILL-5313) C++ client build failure on linux

2017-03-02 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5313:


 Summary: C++ client build failure on linux
 Key: DRILL-5313
 URL: https://issues.apache.org/jira/browse/DRILL-5313
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - C++
Affects Versions: 1.10
Reporter: Sorabh Hamirwasia
Assignee: Laurent Goujon


We are seeing the errors below while building the Drill C++ client on the 
Linux platform:

[root@qa-node161 build]# make
[  6%] Built target y2038
[ 38%] Built target protomsgs
[ 41%] Building CXX object 
src/clientlib/CMakeFiles/drillClient.dir/drillClientImpl.cpp.o
/root/drill/drill/contrib/native/client/src/clientlib/drillClientImpl.cpp: In 
function ‘void Drill::updateLikeFilter(exec::user::LikeFilter&, const 
std::string&)’:
/root/drill/drill/contrib/native/client/src/clientlib/drillClientImpl.cpp:782: 
error: ‘s_searchEscapeString’ is not a member of ‘Drill::meta::DrillMetadata’
make[2]: *** [src/clientlib/CMakeFiles/drillClient.dir/drillClientImpl.cpp.o] 
Error 1
make[1]: *** [src/clientlib/CMakeFiles/drillClient.dir/all] Error 2
make: *** [all] Error 2

It looks to be related to one of Laurent's pull requests:
https://github.com/apache/drill/pull/712





[jira] [Updated] (DRILL-5258) Allow "extended" mock tables access from SQL queries

2017-02-24 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5258:
-
Labels: ready-to-commit  (was: )

> Allow "extended" mock tables access from SQL queries
> 
>
> Key: DRILL-5258
> URL: https://issues.apache.org/jira/browse/DRILL-5258
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.10
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.10
>
>
> DRILL-5152 provided a simple way to generate sample data in SQL using a new, 
> simplified version of the mock data generator. This approach is very 
> convenient, but is inherently limited. For example, the limited syntax 
> available in SQL does not encoding much information about columns such as 
> repeat count, data generator or so on. The simple SQL approach does not allow 
> generating multiple groups of data.
> However, all these features are present in the original mock data source via 
> a special JSON configuration file. Previously, only physical plans could 
> access that extended syntax.
> This ticket requests a SQL interface to the extended mock data source:
> {code}
> SELECT * FROM `mock`.`example/mock-options.json`
> {code}
> Mock data source options are always stored as a JSON file. Since the existing 
> mock data generator for SQL never uses JSON files, a simple rule is that if 
> the table name ends in ".json" then it is a specification, else the 
> information is encoded in table and column names.
> The format of the data generation syntax is documented in the mock data 
> source classes.





[jira] [Updated] (DRILL-5260) Refinements to new "Cluster Fixture" test framework

2017-02-24 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5260:
-
Labels: ready-to-commit  (was: )

> Refinements to new "Cluster Fixture" test framework
> ---
>
> Key: DRILL-5260
> URL: https://issues.apache.org/jira/browse/DRILL-5260
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.10
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.10
>
>
> Roll-up of a number of enhancements to the cluster fixture framework.
> * Config option to suppress printing of CSV and other output. (Allows 
> printing for single tests, not printing when running from Maven.)
> * Parsing of query profiles to extract plan and run time information.
> * Fix bug in log fixture when enabling logging for a package.
> * Improved ZK support.
> * Set up the new CTTAS default temporary workspace for tests.
> * Revise TestDrillbitResiliance to use the new framework.
> * Revise TestWindowFrame to use the new framework.
> * Revise TestMergeJoinWithSchemaChanges to use the new framework.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (DRILL-5098) Improving fault tolerance for connection between client and foreman node.

2017-01-20 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5098:
-
Description: 
With DRILL-5015 we added support for specifying multiple Drillbits in the 
connection string and randomly choosing one of them. Over time some of the 
Drillbits specified in the connection string may die, and the client can fail to 
connect to the Foreman node if the randomly selected Drillbit is dead.
Even if ZooKeeper is used to select a random Drillbit from the registered ones, 
there is a small window between the client selecting a Drillbit and that 
Drillbit going down. The client will fail to connect to this Drillbit and error 
out. 

Instead, if we try multiple Drillbits (with a configurable tries count in the 
connection string), the probability of hitting this error window is reduced in 
both cases, improving fault tolerance. During further investigation it was also 
found that an authentication failure is thrown as a generic RpcException. We 
need to improve that as well, capturing this case explicitly, since on an 
authentication failure we don't want to try multiple Drillbits.

Connection string example with new parameter:
jdbc:drill:drillbit=<host1>[:<port1>][,<host2>[:<port2>]]...;tries=5

  was:
With DRILL-5015 we allowed support for specifying multiple Drillbits in 
connection string and randomly choosing one out of it. Over time some of the 
Drillbits specified in the connection string may die and the client can fail to 
connect to Foreman node if random selection happens to be of dead Drillbit.
Even if ZooKeeper is used for selecting a random Drillbit from the registered 
one there is a small window when client selects one Drillbit and then that 
Drillbit went down. The client will fail to connect to this Drillbit and error 
out. 

Instead if we try multiple Drillbits (configurable tries count through 
connection string) then the probability of hitting this error window will 
reduce in both the cases improving fault tolerance. During further 
investigation it was also found that if there is Authentication failure then we 
throw that error as generic RpcException. We need to improve that as well to 
capture this case explicitly since in case of Auth failure we don't want to try 
multiple Drillbits.


> Improving fault tolerance for connection between client and foreman node.
> -
>
> Key: DRILL-5098
> URL: https://issues.apache.org/jira/browse/DRILL-5098
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - JDBC
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.10
>
>
> With DRILL-5015 we allowed support for specifying multiple Drillbits in 
> connection string and randomly choosing one out of it. Over time some of the 
> Drillbits specified in the connection string may die and the client can fail 
> to connect to Foreman node if random selection happens to be of dead Drillbit.
> Even if ZooKeeper is used for selecting a random Drillbit from the registered 
> one there is a small window when client selects one Drillbit and then that 
> Drillbit went down. The client will fail to connect to this Drillbit and 
> error out. 
> Instead if we try multiple Drillbits (configurable tries count through 
> connection string) then the probability of hitting this error window will 
> reduce in both the cases improving fault tolerance. During further 
> investigation it was also found that if there is Authentication failure then 
> we throw that error as generic RpcException. We need to improve that as well 
> to capture this case explicitly since in case of Auth failure we don't want 
> to try multiple Drillbits.
> Connection string example with new parameter:
> jdbc:drill:drillbit=<host1>[:<port1>][,<host2>[:<port2>]]...;tries=5
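The tries semantics described above can be illustrated with a small sketch. This is not Drill's actual client code; the names (`connect_with_tries`, `AuthError`) are hypothetical, and the sketch only models the behavior: try up to `tries` randomly chosen Drillbits, but abort immediately on an authentication failure, since other nodes would reject the same credentials:

```python
import random


class AuthError(Exception):
    """Authentication rejected; retrying other Drillbits will not help."""


def connect_with_tries(endpoints, tries, connect):
    """Attempt up to `tries` distinct, randomly chosen endpoints."""
    # Random sampling spreads load across the listed Drillbits.
    candidates = random.sample(endpoints, k=min(tries, len(endpoints)))
    last_error = None
    for endpoint in candidates:
        try:
            return connect(endpoint)
        except AuthError:
            raise             # bad credentials: fail fast, no retry
        except OSError as err:
            last_error = err  # dead node: fall through to the next one
    raise last_error
```

With `tries=5` and one dead node in the list, the client now survives the window between node death and ZooKeeper noticing it, at the cost of at most four extra connection attempts.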



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-5126) Provide simplified, unified "cluster fixture" for tests

2017-01-19 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5126:
-
Labels: ready-to-commit  (was: )

> Provide simplified, unified "cluster fixture" for tests
> ---
>
> Key: DRILL-5126
> URL: https://issues.apache.org/jira/browse/DRILL-5126
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.9.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>  Labels: ready-to-commit
>
> Drill provides a robust selection of test frameworks that have evolved to 
> satisfy the needs of a variety of test cases. For newbies, however, the 
> result is a bewildering array of ways to do basically the same thing: set up 
> an embedded Drill cluster, run queries and check results.
> Further, some key test settings are distributed: some are in the pom.xml 
> file, some in config files stored as resources, some in hard-coded settings 
> in base test classes.
> Also, some test base classes helpfully set up a test cluster, but then 
> individual tests need a different config, so they immediately tear down the 
> default cluster and create a new one.
> This ticket proposes a new test framework, available for new tests, that 
> combines the best of the existing test frameworks into a single, easy-to-use 
> package.
> * Builder for the cluster
> * Accept config-time options
> * Accept run-time session and system options
> * Specify number of Drillbits
> * Simplified API for the most common options
> * AutoCloseable for use in try-with-resources statements
> * Integration with existing test builder classes
> And so on.
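The builder-style API proposed above can be sketched as follows. This is a Python model of the pattern only; the real framework is Java, and the class and method names here are illustrative assumptions, with a context manager standing in for Java's try-with-resources:

```python
class ClusterFixture:
    """Stand-in for an embedded test cluster; close() tears it down."""

    def __init__(self, options):
        self.options = options
        self.running = True

    def close(self):
        self.running = False

    # Context-manager protocol mirrors Java's AutoCloseable usage.
    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()


class ClusterFixtureBuilder:
    """Fluent builder collecting config-time and run-time options."""

    def __init__(self):
        self._options = {"drillbits": 1, "config": {}, "session": {}}

    def drillbits(self, count):
        self._options["drillbits"] = count
        return self

    def config_option(self, key, value):
        self._options["config"][key] = value
        return self

    def session_option(self, key, value):
        self._options["session"][key] = value
        return self

    def build(self):
        return ClusterFixture(self._options)


# Usage: options are gathered up front; teardown is automatic.
with ClusterFixtureBuilder().drillbits(2).config_option("spill", True).build() as cluster:
    assert cluster.options["drillbits"] == 2
```

The point of the pattern is that each test declares its full cluster configuration once, instead of tearing down a default cluster and rebuilding another.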



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-5152) Enhance the mock data source: better data, SQL access

2017-01-10 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5152:
-
Labels: ready-to-commit  (was: )

> Enhance the mock data source: better data, SQL access
> -
>
> Key: DRILL-5152
> URL: https://issues.apache.org/jira/browse/DRILL-5152
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.9.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>  Labels: ready-to-commit
>
> Drill provides a mock data storage engine that generates random data. The 
> mock engine is used in some older unit tests that need a volume of data, but 
> that are not too particular about the details of the data.
> The mock data source continues to have use even for modern tests. For 
> example, the work on the external sort batch requires tests with varying 
> amounts of data, but the exact form of the data is not important, just the 
> quantity. For example, if we want to ensure that spilling happens at various 
> trigger points, we need to read the right amount of data for that trigger.
> The existing mock data source has two limitations:
> 1. It generates only "black/white" (alternating) values, which is awkward for 
> use in sorting.
> 2. The mock generator is accessible only from a physical plan, but not from 
> SQL queries.
> This enhancement proposes to fix both limitations:
> 1. Generate a uniform, randomly distributed set of values.
> 2. Provide an encoding that lets a SQL query specify the data to be generated.
> Example SQL query:
> {code}
> SELECT id_i, name_s50 FROM `mock`.employee_10K;
> {code}
> The above says to generate two fields: INTEGER (the "_i" suffix) and 
> VARCHAR(50) (the "_s50") suffix; and to generate 10,000 rows (the "_10K" 
> suffix on the table.)
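The name encoding above (an "_i" suffix for INTEGER, "_s50" for VARCHAR(50), and a "_10K" row-count suffix on the table name) can be sketched with a small parser. This is an illustrative Python model, not the Java implementation; only the suffix grammar shown in the example query is assumed:

```python
import re


def parse_column(name):
    """Split an encoded column name into (field, SQL type)."""
    m = re.fullmatch(r"(.+)_i", name)
    if m:
        return m.group(1), "INTEGER"
    m = re.fullmatch(r"(.+)_s(\d+)", name)
    if m:
        return m.group(1), "VARCHAR({})".format(m.group(2))
    raise ValueError("unrecognized column encoding: " + name)


def parse_row_count(table):
    """Decode a row-count suffix such as employee_10K -> 10,000 rows."""
    m = re.fullmatch(r".+_(\d+)([kKmM]?)", table)
    if not m:
        raise ValueError("unrecognized table encoding: " + table)
    scale = {"": 1, "k": 1000, "m": 1000000}[m.group(2).lower()]
    return int(m.group(1)) * scale
```

Under this grammar, `SELECT id_i, name_s50 FROM mock.employee_10K` decodes to an INTEGER column `id`, a VARCHAR(50) column `name`, and 10,000 generated rows.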



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-5108) Reduce output from Maven git-commit-id-plugin

2016-12-07 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5108:
-
Labels: ready-to-commit  (was: )

> Reduce output from Maven git-commit-id-plugin
> -
>
> Key: DRILL-5108
> URL: https://issues.apache.org/jira/browse/DRILL-5108
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>  Labels: ready-to-commit
>
> The git-commit-id-plugin grabs information from Git to display during a 
> build. It prints many e-mail addresses and other generic project information. 
> As part of the effort to trim down unit test output, we propose to turn off 
> the verbose output from this plugin.
> Specific change:
> {code}
> <plugin>
>   <groupId>pl.project13.maven</groupId>
>   <artifactId>git-commit-id-plugin</artifactId>
>   ...
>   <configuration>
>     <verbose>false</verbose>
>   </configuration>
> </plugin>
> {code}
> That is, change the verbose setting from true to false.
> In the unlikely event that some build process depends on the verbose output, 
> we can make the setting a configurable parameter, defaulting to false.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-5098) Improving fault tolerance for connection between client and foreman node.

2016-12-06 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5098:
-
Reviewer: Paul Rogers

> Improving fault tolerance for connection between client and foreman node.
> -
>
> Key: DRILL-5098
> URL: https://issues.apache.org/jira/browse/DRILL-5098
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - JDBC
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>  Labels: doc-impacting
> Fix For: 1.10
>
>
> With DRILL-5015 we allowed support for specifying multiple Drillbits in 
> connection string and randomly choosing one out of it. Over time some of the 
> Drillbits specified in the connection string may die and the client can fail 
> to connect to Foreman node if random selection happens to be of dead Drillbit.
> Even if ZooKeeper is used for selecting a random Drillbit from the registered 
> one there is a small window when client selects one Drillbit and then that 
> Drillbit went down. The client will fail to connect to this Drillbit and 
> error out. 
> Instead if we try multiple Drillbits (configurable tries count through 
> connection string) then the probability of hitting this error window will 
> reduce in both the cases improving fault tolerance. During further 
> investigation it was also found that if there is Authentication failure then 
> we throw that error as generic RpcException. We need to improve that as well 
> to capture this case explicitly since in case of Auth failure we don't want 
> to try multiple Drillbits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-5098) Improving fault tolerance for connection between client and foreman node.

2016-12-06 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5098:
-
Labels: doc-impacting  (was: )

> Improving fault tolerance for connection between client and foreman node.
> -
>
> Key: DRILL-5098
> URL: https://issues.apache.org/jira/browse/DRILL-5098
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - JDBC
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>  Labels: doc-impacting
> Fix For: 1.10
>
>
> With DRILL-5015 we allowed support for specifying multiple Drillbits in 
> connection string and randomly choosing one out of it. Over time some of the 
> Drillbits specified in the connection string may die and the client can fail 
> to connect to Foreman node if random selection happens to be of dead Drillbit.
> Even if ZooKeeper is used for selecting a random Drillbit from the registered 
> one there is a small window when client selects one Drillbit and then that 
> Drillbit went down. The client will fail to connect to this Drillbit and 
> error out. 
> Instead if we try multiple Drillbits (configurable tries count through 
> connection string) then the probability of hitting this error window will 
> reduce in both the cases improving fault tolerance. During further 
> investigation it was also found that if there is Authentication failure then 
> we throw that error as generic RpcException. We need to improve that as well 
> to capture this case explicitly since in case of Auth failure we don't want 
> to try multiple Drillbits.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-5015) As per documentation, when issuing a list of drillbits in the connection string, we always attempt to connect only to the first one

2016-11-21 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5015:
-
Labels: ready-to-commit  (was: )

> As per documentation, when issuing a list of drillbits in the connection 
> string, we always attempt to connect only to the first one
> ---
>
> Key: DRILL-5015
> URL: https://issues.apache.org/jira/browse/DRILL-5015
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.8.0, 1.9.0
>Reporter: Sorabh Hamirwasia
>Assignee: Sudheesh Katkam
>  Labels: ready-to-commit
>
> When trying to connect to a Drill cluster by specifying more than 1 drillbits 
> to connect to, we always attempt to connect to only the first drillbit.
> As an example, we tested against a pair of drillbits, but we always connect 
> to the first entry in the CSV list by querying for the 'current' drillbit. 
> The remaining entries are never attempted.
> [root@pssc-60 agileSqlPerfTests]# /opt/mapr/drill/drill-1.8.0/bin/sqlline  -u 
>  "jdbc:drill:schema=dfs.tmp;drillbit=pssc-61:31010,pssc-62:31010" -f 
> whereAmI.q  | grep -v logback
> 1/1  select * from sys.drillbits where `current`;
> +-----------------+------------+---------------+------------+----------+
> | hostname        | user_port  | control_port  | data_port  | current  |
> +-----------------+------------+---------------+------------+----------+
> | pssc-61.qa.lab  | 31010      | 31011         | 31012      | true     |
> +-----------------+------------+---------------+------------+----------+
> 1 row selected (0.265 seconds)
> Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
> apache drill 1.8.0 
> "a little sql for your nosql"
> This property is meant for use by clients when not wanting to overload the ZK 
> for fetching a list of existing Drillbits, but the behaviour doesn't match 
> the documentation. 
> [Making a Direct Drillbit Connection | 
> https://drill.apache.org/docs/using-the-jdbc-driver/#using-the-jdbc-url-format-for-a-direct-drillbit-connection
>  ]
> We need to randomly shuffle this list, and if an entry in the shuffled list is 
> unreachable, we need to try the next entry in the list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4990) Use new HDFS API access instead of listStatus to check if users have permissions to access workspace.

2016-11-18 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-4990:
-
Assignee: Padma Penumarthy  (was: Sorabh Hamirwasia)

> Use new HDFS API access instead of listStatus to check if users have 
> permissions to access workspace.
> -
>
> Key: DRILL-4990
> URL: https://issues.apache.org/jira/browse/DRILL-4990
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.8.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
> Fix For: 1.9.0
>
>
> For every query, we build the schema tree 
> (runSQL->getPlan->getNewDefaultSchema->getRootSchema). All workspaces in all 
> storage plugins are checked and are added to the schema tree if they are 
> accessible by the user who initiated the query.  For file system plugin, 
> listStatus API is used to check if  the workspace is accessible or not 
> (WorkspaceSchemaFactory.accessible) by the user. The idea seems to be that if 
> the user does not have access to file(s) in the workspace, listStatus will 
> generate an exception and we return false. But listStatus (which lists all 
> the entries of a directory) is an expensive operation when there is a large 
> number of files in the directory. A new API is added in Hadoop 2.6 called 
> access (HDFS-6570) which provides the ability to check if the user has 
> permissions on a file/directory.  Use this new API instead of listStatus. 
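The cost difference described above can be sketched with local filesystem calls as a stand-in: `os.access` plays the role of Hadoop's single-file permission check `FileSystem.access()` (HDFS-6570), while `os.listdir` plays the role of `listStatus`, whose cost grows with the number of directory entries. Illustrative Python only, not Drill's Java code:

```python
import os


def workspace_accessible_cheap(path):
    # One metadata check, independent of directory size
    # (analogue of FileSystem.access()).
    return os.access(path, os.R_OK | os.X_OK)


def workspace_accessible_expensive(path):
    # Enumerates every child just to learn whether access is allowed
    # (analogue of listStatus); cost is O(number of entries).
    try:
        os.listdir(path)
        return True
    except (PermissionError, FileNotFoundError):
        return False
```

Since the schema tree is rebuilt for every query, replacing the listing with the direct check removes a per-query cost proportional to workspace size.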



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-5015) As per documentation, when issuing a list of drillbits in the connection string, we always attempt to connect only to the first one

2016-11-08 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-5015:
-
Description: 
When trying to connect to a Drill cluster by specifying more than one drillbit 
to connect to, we always attempt to connect to only the first drillbit.
As an example, we tested against a pair of drillbits, but we always connect to 
the first entry in the CSV list by querying for the 'current' drillbit. The 
remaining entries are never attempted.
[root@pssc-60 agileSqlPerfTests]# /opt/mapr/drill/drill-1.8.0/bin/sqlline  -u  
"jdbc:drill:schema=dfs.tmp;drillbit=pssc-61:31010,pssc-62:31010" -f whereAmI.q  
| grep -v logback

1/1  select * from sys.drillbits where `current`;
+-----------------+------------+---------------+------------+----------+
| hostname        | user_port  | control_port  | data_port  | current  |
+-----------------+------------+---------------+------------+----------+
| pssc-61.qa.lab  | 31010      | 31011         | 31012      | true     |
+-----------------+------------+---------------+------------+----------+
1 row selected (0.265 seconds)
Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
apache drill 1.8.0 
"a little sql for your nosql"
This property is meant for use by clients when not wanting to overload the ZK 
for fetching a list of existing Drillbits, but the behaviour doesn't match the 
documentation. 
Making a Direct Drillbit Connection
We need to randomly shuffle this list, and if an entry in the shuffled list is 
unreachable, we need to try the next entry in the list.

  was:
When trying to connect to a Drill cluster by specifying more than 1 drillbits 
to connect to, we always attempt to connect to only the first drillbit.
As an example, we tested against a pair of drillbits, but we always connect to 
the first entry in the CSV list by querying for the 'current' drillbit. The 
remaining entries are never attempted.
[root@pssc-60 agileSqlPerfTests]# /opt/mapr/drill/drill-1.8.0/bin/sqlline  -u  
"jdbc:drill:schema=dfs.tmp;drillbit=pssc-61:31010,pssc-62:31010" -f whereAmI.q  
| grep -v logback

1/1  select * from sys.drillbits where `current`;
+-----------------+------------+---------------+------------+----------+
| hostname        | user_port  | control_port  | data_port  | current  |
+-----------------+------------+---------------+------------+----------+
| pssc-61.qa.lab  | 31010      | 31011         | 31012      | true     |
+-----------------+------------+---------------+------------+----------+
1 row selected (0.265 seconds)
Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
apache drill 1.8.0 
"a little sql for your nosql"
This property is meant for use by clients when not wanting to overload the ZK 
for fetching a list of existing Drillbits, but the behaviour doesn't match the 
documentation. 
Making a Direct Drillbit Connection
We ned too raandomly shuffle between this list and If an entry in the shuffled 
list is unreachable, we need to try for the next entry in the list.


> As per documentation, when issuing a list of drillbits in the connection 
> string, we always attempt to connect only to the first one
> ---
>
> Key: DRILL-5015
> URL: https://issues.apache.org/jira/browse/DRILL-5015
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.8.0, 1.9.0
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>
> When trying to connect to a Drill cluster by specifying more than 1 drillbits 
> to connect to, we always attempt to connect to only the first drillbit.
> As an example, we tested against a pair of drillbits, but we always connect 
> to the first entry in the CSV list by querying for the 'current' drillbit. 
> The remaining entries are never attempted.
> [root@pssc-60 agileSqlPerfTests]# /opt/mapr/drill/drill-1.8.0/bin/sqlline  -u 
>  "jdbc:drill:schema=dfs.tmp;drillbit=pssc-61:31010,pssc-62:31010" -f 
> whereAmI.q  | grep -v logback
> 1/1  select * from sys.drillbits where `current`;
> +-----------------+------------+---------------+------------+----------+
> | hostname        | user_port  | control_port  | data_port  | current  |
> +-----------------+------------+---------------+------------+----------+
> | pssc-61.qa.lab  | 31010      | 31011         | 31012      | true     |
> +-----------------+------------+---------------+------------+----------+
> 1 row selected (0.265 seconds)
> Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
> apache drill 1.8.0 
> "a little sql for your nosql"
> This property is meant for use by clients when not wanting to overload the ZK 
> for fetching a list of existing Drillbits, but the behaviour doesn't 

[jira] [Created] (DRILL-5015) As per documentation, when issuing a list of drillbits in the connection string, we always attempt to connect only to the first one

2016-11-08 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5015:


 Summary: As per documentation, when issuing a list of drillbits in 
the connection string, we always attempt to connect only to the first one
 Key: DRILL-5015
 URL: https://issues.apache.org/jira/browse/DRILL-5015
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC
Affects Versions: 1.8.0, 1.9.0
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia


When trying to connect to a Drill cluster by specifying more than 1 drillbits 
to connect to, we always attempt to connect to only the first drillbit.
As an example, we tested against a pair of drillbits, but we always connect to 
the first entry in the CSV list by querying for the 'current' drillbit. The 
remaining entries are never attempted.
[root@pssc-60 agileSqlPerfTests]# /opt/mapr/drill/drill-1.8.0/bin/sqlline  -u  
"jdbc:drill:schema=dfs.tmp;drillbit=pssc-61:31010,pssc-62:31010" -f whereAmI.q  
| grep -v logback

1/1  select * from sys.drillbits where `current`;
+-----------------+------------+---------------+------------+----------+
| hostname        | user_port  | control_port  | data_port  | current  |
+-----------------+------------+---------------+------------+----------+
| pssc-61.qa.lab  | 31010      | 31011         | 31012      | true     |
+-----------------+------------+---------------+------------+----------+
1 row selected (0.265 seconds)
Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl
apache drill 1.8.0 
"a little sql for your nosql"
This property is meant for use by clients when not wanting to overload the ZK 
for fetching a list of existing Drillbits, but the behaviour doesn't match the 
documentation. 
Making a Direct Drillbit Connection
We need to randomly shuffle this list, and if an entry in the shuffled list is 
unreachable, we need to try the next entry in the list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4706) Fragment planning causes Drillbits to read remote chunks when local copies are available

2016-11-08 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-4706:
-
Assignee: Padma Penumarthy  (was: Sorabh Hamirwasia)

> Fragment planning causes Drillbits to read remote chunks when local copies 
> are available
> 
>
> Key: DRILL-4706
> URL: https://issues.apache.org/jira/browse/DRILL-4706
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.6.0
> Environment: CentOS, RHEL
>Reporter: Kunal Khatua
>Assignee: Padma Penumarthy
>  Labels: performance, planning
>
> When a table (datasize=70GB) of 160 parquet files (each having a single 
> rowgroup and fitting within one chunk) is available on a 10-node setup with 
> replication=3 ; a pure data scan query causes about 2% of the data to be read 
> remotely. 
> Even with the creation of metadata cache, the planner is selecting a 
> sub-optimal plan of executing the SCAN fragments such that some of the data 
> is served from a remote server. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4964) Drill fails to connect to hive metastore after hive metastore is restarted unless drillbits are restarted also

2016-10-25 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-4964:


 Summary: Drill fails to connect to hive metastore after hive 
metastore is restarted unless drillbits are restarted also
 Key: DRILL-4964
 URL: https://issues.apache.org/jira/browse/DRILL-4964
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.8.0, 1.9.0
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia


It is found that if the Hive Metastore is restarted, then the Drillbit also 
needs to be restarted before it can connect to and query the Hive Metastore 
again.

Repro Steps:
===
1) Start the Hive Metastore and a drillbit.
2) Start the Drill client with the schema set to hive and run a simple query 
like "show databases".
   Command to start the client: sqlline -u "jdbc:drill:schema=hive;drillbit=<drillbit-host>"
3) Restart the Hive Metastore.
4) Execute the same query "show databases" on the existing Drill client or a new 
one. You will see that the hive default database is not listed. If you query any 
hive data, it will fail.

Log snippet from drillbit.log:
==

2016-10-25 18:32:00,561 [27eff86e-e8fb-3d91-eb88-4af75ff6d174:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
27eff86e-e8fb-3d91-eb88-4af75ff6d174: show databases
2016-10-25 18:32:00,563 [27eff86e-e8fb-3d91-eb88-4af75ff6d174:foreman] DEBUG 
o.a.d.e.s.h.HBaseStoragePluginConfig - Initializing HBase StoragePlugin 
configuration with zookeeper quorum 'localhost', port '2181'.
2016-10-25 18:32:00,595 [27eff86e-e8fb-3d91-eb88-4af75ff6d174:foreman] WARN  
o.a.d.e.s.h.schema.HiveSchemaFactory - Failure while attempting to access 
HiveDatabase 'default'.
java.util.concurrent.ExecutionException: MetaException(message:Got exception: 
org.apache.thrift.transport.TTransportException null)
at 
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
 ~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
 ~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) 
~[guava-18.0.jar:na]
at 
com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:137)
 ~[guava-18.0.jar:na]
at 
com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2348)
 ~[guava-18.0.jar:na]
at 
com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2320) 
~[guava-18.0.jar:na]
at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
 ~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) 
~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.get(LocalCache.java:3937) 
~[guava-18.0.jar:na]
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) 
~[guava-18.0.jar:na]
at 
com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) 
~[guava-18.0.jar:na]
at 
org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient$HiveClientWithCaching.getDatabases(DrillHiveMetaStoreClient.java:415)
 ~[drill-storage-hive-core-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.getSubSchema(HiveSchemaFactory.java:139)
 [drill-storage-hive-core-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.hive.schema.HiveSchemaFactory$HiveSchema.<init>(HiveSchemaFactory.java:133)
 [drill-storage-hive-core-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.hive.schema.HiveSchemaFactory.registerSchemas(HiveSchemaFactory.java:118)
 [drill-storage-hive-core-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.hive.HiveStoragePlugin.registerSchemas(HiveStoragePlugin.java:100)
 [drill-storage-hive-core-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.StoragePluginRegistryImpl$DrillSchemaFactory.registerSchemas(StoragePluginRegistryImpl.java:365)
 [drill-java-exec-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.SchemaTreeProvider.createRootSchema(SchemaTreeProvider.java:72)
 [drill-java-exec-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.store.SchemaTreeProvider.createRootSchema(SchemaTreeProvider.java:61)
 [drill-java-exec-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.QueryContext.getRootSchema(QueryContext.java:147) 
[drill-java-exec-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.QueryContext.getRootSchema(QueryContext.java:137) 
[drill-java-exec-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.QueryContext.getNewDefaultSchema(QueryContext.java:123)
 [drill-java-exec-1.9.0-SNAPSHOT.jar:1.9.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:62)
 

[jira] [Commented] (DRILL-4876) Remain disconnected connection

2016-10-16 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580360#comment-15580360
 ] 

Sorabh Hamirwasia commented on DRILL-4876:
--

Hi Takuya,
You can close this JIRA ticket and please open a new one for the enhancement. 
For now let's mark it for Future. It would be great to put a link to this issue 
in the new JIRA as well, just to provide some more background information in the 
future.

Thanks,
Sorabh

> Remain disconnected connection
> --
>
> Key: DRILL-4876
> URL: https://issues.apache.org/jira/browse/DRILL-4876
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.6.0, 1.7.0, 1.8.0
> Environment: CentOS 7
>Reporter: Takuya Kojima
>Assignee: Sorabh Hamirwasia
>Priority: Minor
> Attachments: 1_normal.png, 2_after_restart.png, 
> 3_try_to_connect_after_restart.png, 
> 4_disconnected_after_minEvictableIdleTimeMillis.png, 
> 5_after_disconnected.png, DrillClientConnectionResetTest.java, 
> drill-connection-pool.txt
>
>
> I'm using Drill via a Java application on Tomcat with the JDBC driver.
> I found that a disconnected connection is not released when a drillbit is 
> restarted.
> The drillbit is restarted, but the JDBC connection keeps trying to use the 
> connection established before the restart.
> The expected behavior is that the connection is released and reconnected when 
> the drillbit is restarted, but in fact the connection is only released after 
> the "minEvictableIdleTimeMillis" interval elapses.
> As a result, the application can't connect in the meantime despite the 
> drillbit being active.
> I thought this was not a major issue, but Postgres's and Vertica's JDBC 
> drivers work well in the same situation. I spent much time identifying the 
> cause, so I created a new issue for this.
> The attached is a log and a JMX monitor graph with the 1.6.0 JDBC driver, but 
> I also get it with 1.7.0 and 1.8.0.
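A pool that validates connections when they are borrowed avoids exactly this window: a stale connection is discarded immediately rather than lingering until the idle-eviction timer (minEvictableIdleTimeMillis) reaps it. A minimal sketch of that pattern, with hypothetical names and illustrative Python in place of the Java pool:

```python
class ValidatingPool:
    """Connection pool that probes idle connections on borrow."""

    def __init__(self, factory, is_valid):
        self._factory = factory    # creates a fresh connection
        self._is_valid = is_valid  # cheap liveness probe
        self._idle = []

    def release(self, conn):
        self._idle.append(conn)

    def borrow(self):
        # Discard dead idle connections instead of handing them out,
        # so a drillbit restart does not strand the application.
        while self._idle:
            conn = self._idle.pop()
            if self._is_valid(conn):
                return conn
        return self._factory()
```

This is the same idea as commons-pool's test-on-borrow setting: pay a small probe cost per borrow to avoid handing out connections that died with the server.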



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-4841) Use user server event loop group for web clients

2016-10-07 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia reassigned DRILL-4841:


Assignee: Sudheesh Katkam  (was: Sorabh Hamirwasia)

Assigning to Sudheesh for resolving review comments.

> Use user server event loop group for web clients
> 
>
> Key: DRILL-4841
> URL: https://issues.apache.org/jira/browse/DRILL-4841
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
>Priority: Minor
>
> Currently we spawn an event loop group for handling requests from clients. 
> This group should also be used to handle responses (from the server) for web 
> clients.





[jira] [Updated] (DRILL-4876) Remain disconnected connection

2016-10-07 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-4876:
-
Attachment: DrillClientConnectionResetTest.java

Test to show usage with a JDBC connection and the effect of losing the 
connection to the foreman Drillbit.

> Remain disconnected connection
> --
>
> Key: DRILL-4876
> URL: https://issues.apache.org/jira/browse/DRILL-4876
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.6.0, 1.7.0, 1.8.0
> Environment: CentOS 7
>Reporter: Takuya Kojima
>Assignee: Sorabh Hamirwasia
>Priority: Minor
> Attachments: 1_normal.png, 2_after_restart.png, 
> 3_try_to_connect_after_restart.png, 
> 4_disconnected_after_minEvictableIdleTimeMillis.png, 
> 5_after_disconnected.png, DrillClientConnectionResetTest.java, 
> drill-connection-pool.txt
>
>
> I'm using Drill via a Java application on Tomcat with the JDBC driver.
> I found that a disconnected connection is not released when a drillbit is 
> restarted.
> The drillbit is restarted, but the JDBC connection keeps trying to use the 
> connection that was established before the restart.
> The expected behavior is that the connection is released and reconnected 
> when the drillbit is restarted, but in fact the connection is only released 
> after the time set by "minEvictableIdleTimeMillis" has elapsed.
> As a result, the application can't connect in the meantime even though the 
> drillbit is active.
> I thought this was not a major issue, but the Postgres and Vertica JDBC 
> drivers work well in the same situation. I spent much time identifying the 
> cause, so I created a new issue for this.
> The attachments are a log and JMX monitor graphs with the 1.6.0 JDBC 
> driver, but I also see this with 1.7.0 and 1.8.0.





[jira] [Commented] (DRILL-4876) Remain disconnected connection

2016-10-07 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556735#comment-15556735
 ] 

Sorabh Hamirwasia commented on DRILL-4876:
--

Hi Takuya,
Thanks for raising this issue. Today the way the JDBC driver works is that 
once it loses the connection to the drillbit, it only learns about it when the 
next query is executed and fails. The application then has to reconnect using 
a new JDBC connection to be able to execute new queries. Part of the reason is 
that we don't store any session info for each connection, which is needed in 
cases like "an application sets a schema before executing any query". So just 
reconnecting will not help here, as we would lose all the session-specific 
information. That would definitely be a great enhancement, which we can target 
in the future, but unfortunately it's not supported today.

PFA a small test which I wrote to demonstrate the flow and usage with a JDBC 
connection. Let me know if you have any other questions.

Thanks,
Sorabh
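The workaround described above (open a new JDBC connection and re-apply any
session settings yourself) can be sketched in a few lines. This is an
illustrative, self-contained sketch only: the `Conn` interface, the factory,
and all names here are hypothetical stand-ins, not Drill's actual JDBC API.

```java
import java.util.List;
import java.util.function.Supplier;

// Illustrative sketch: on a lost connection, an application must open a fresh
// connection AND replay its session statements (e.g. "USE dfs.tmp") before
// retrying, because the server does not restore session state.
public class ReconnectSketch {

    /** Stand-in for a JDBC connection that can execute a statement. */
    public interface Conn {
        void execute(String sql) throws Exception;
    }

    /**
     * Runs a query; if the connection was lost, obtains a new connection,
     * replays the recorded session statements, and retries the query once.
     */
    public static void runWithReconnect(Supplier<Conn> factory,
                                        List<String> sessionStatements,
                                        String query) throws Exception {
        Conn conn = factory.get();
        try {
            conn.execute(query);
        } catch (Exception connectionLost) {
            conn = factory.get();                 // new connection
            for (String stmt : sessionStatements) {
                conn.execute(stmt);               // restore session state
            }
            conn.execute(query);                  // retry the failed query
        }
    }
}
```

The key point from the comment above is the replay step: without it, a plain
reconnect silently drops session settings such as the current schema.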

> Remain disconnected connection
> --
>
> Key: DRILL-4876
> URL: https://issues.apache.org/jira/browse/DRILL-4876
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.6.0, 1.7.0, 1.8.0
> Environment: CentOS 7
>Reporter: Takuya Kojima
>Assignee: Sorabh Hamirwasia
>Priority: Minor
> Attachments: 1_normal.png, 2_after_restart.png, 
> 3_try_to_connect_after_restart.png, 
> 4_disconnected_after_minEvictableIdleTimeMillis.png, 
> 5_after_disconnected.png, DrillClientConnectionResetTest.java, 
> drill-connection-pool.txt
>
>
> I'm using Drill via a Java application on Tomcat with the JDBC driver.
> I found that a disconnected connection is not released when a drillbit is 
> restarted.
> The drillbit is restarted, but the JDBC connection keeps trying to use the 
> connection that was established before the restart.
> The expected behavior is that the connection is released and reconnected 
> when the drillbit is restarted, but in fact the connection is only released 
> after the time set by "minEvictableIdleTimeMillis" has elapsed.
> As a result, the application can't connect in the meantime even though the 
> drillbit is active.
> I thought this was not a major issue, but the Postgres and Vertica JDBC 
> drivers work well in the same situation. I spent much time identifying the 
> cause, so I created a new issue for this.
> The attachments are a log and JMX monitor graphs with the 1.6.0 JDBC 
> driver, but I also see this with 1.7.0 and 1.8.0.





[jira] [Assigned] (DRILL-4504) Create an event loop for each of [user, control, data] RPC components

2016-10-06 Thread Sorabh Hamirwasia (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia reassigned DRILL-4504:


Assignee: Sudheesh Katkam  (was: Sorabh Hamirwasia)

Assigning back to Sudheesh as discussed offline yesterday.

> Create an event loop for each of [user, control, data] RPC components
> -
>
> Key: DRILL-4504
> URL: https://issues.apache.org/jira/browse/DRILL-4504
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - RPC
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
>
> + Create an event loop group for each client-server pair (data, control and 
> user)
> Miscellaneous:
> + Move WorkEventBus from exec/rpc/control to exec/work





[jira] [Commented] (DRILL-3751) Query hang when zookeeper is stopped

2016-09-29 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533389#comment-15533389
 ] 

Sorabh Hamirwasia commented on DRILL-3751:
--

Yes, I did try with one ZK running on a 4-node cluster. I ran the same query 
but on different data, which is a very big file (both json and parquet).

Also, the Apache Drill docs: 
http://drill.apache.org/docs/install-drill-introduction/ 
mention that a clustered or multi-server installation of ZK is one of the 
prerequisites. Pasting the context from the above doc:

"Choose distributed mode to use Drill in a clustered Hadoop environment. A 
clustered (multi-server) installation of ZooKeeper is one of the 
prerequisites." 

> Query hang when zookeeper is stopped
> 
>
> Key: DRILL-3751
> URL: https://issues.apache.org/jira/browse/DRILL-3751
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>
> I see an indefinite hang at the sqlline prompt: issue a long-running query 
> and then stop the zookeeper process while the query is still being executed. 
> The sqlline prompt is never returned, and it hangs showing the stack trace 
> below. I am on master.
> Steps to reproduce the problem
> clush -g khurram service mapr-warden stop
> clush -g khurram service mapr-warden start
> Issue long running query from sqlline
> While query is running, stop zookeeper using script.
> To stop zookeeper 
> {code}
> [root@centos-01 bin]# ./zkServer.sh stop
> JMX enabled by default
> Using config: /opt/mapr/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
> Stopping zookeeper ... STOPPED
> {code}
> Issue below long running query from sqlline
> {code}
> ./sqlline -u "jdbc:drill:schema=dfs.tmp"
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 800;
> ...
> | 7.40907649723E8  | g|
> | 1.12378007695E9  | d|
> 03:03:28.482 [CuratorFramework-0] ERROR org.apache.curator.ConnectionState - 
> Connection timed out for connection string (10.10.100.201:5181) and timeout 
> (5000) / elapsed (5013)
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = 
> ConnectionLoss
>   at 
> org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198) 
> [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88) 
> [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
>  [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:807)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:793)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:57)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:275)
>  [curator-framework-2.5.0.jar:na]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> {code}
> Here is the stack for sqlline process
> {code}
> [root@centos-01 bin]# /usr/java/jdk1.7.0_45/bin/jstack 32136
> 2015-09-05 03:21:52
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x7f8328003800 nid=0x27f1 waiting on 
> condition [0x]
>java.lang.Thread.State: RUNNABLE
> "CuratorFramework-0-EventThread" daemon prio=10 tid=0x012fd800 
> nid=0x26e1 waiting on condition [0x7f8317c2e000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007e2117798> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)
> "CuratorFramework-0-SendThread(centos-01.qa.lab:5181)" daemon prio=10 
> 

[jira] [Commented] (DRILL-3751) Query hang when zookeeper is stopped

2016-09-28 Thread Sorabh Hamirwasia (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531240#comment-15531240
 ] 

Sorabh Hamirwasia commented on DRILL-3751:
--

Hi Khurram,
I was trying to reproduce the scenario (following the steps listed above) to 
see why the sqlline client hangs, using both json and parquet data, but was 
not able to. The query runs to completion for me, and the state on the WebUI 
is shown as Completed. Can you please try to reproduce it with the latest 
Drill version (I am using a locally built 1.9)?

The exceptions seen at the sqlline prompt are expected: they come from the 
CuratorFramework threads, which have retry logic that keeps trying to connect 
to Zookeeper.
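
The Curator retry behavior mentioned above (its ExponentialBackoffRetry
policy, which produces the repeated ConnectionLoss messages while Zookeeper is
down) amounts to sleeping a randomized, exponentially growing interval between
attempts. The sketch below is illustrative only, assuming a formula of the
shape baseSleepMs * random(1 .. 2^retryCount); it mimics the spirit of
Curator's policy and is not Curator's actual code.

```java
import java.util.Random;

// Illustrative sketch of exponential-backoff retry delays: each retry waits
// roughly baseSleepMs * random(1 .. 2^retryCount), so delays grow with each
// failed attempt but stay randomized to avoid synchronized reconnect storms.
public class BackoffSketch {

    /** Returns the sleep interval in ms before the given (0-based) retry. */
    public static long sleepMs(int baseSleepMs, int retryCount, Random rnd) {
        int exp = Math.min(retryCount, 29);        // cap so the shift can't overflow
        return (long) baseSleepMs * (1 + rnd.nextInt(1 << exp));
    }
}
```

With a 1000 ms base, retry 0 always waits 1000 ms and retry 3 waits between
1000 and 8000 ms, which is why a stopped Zookeeper yields periodic timeout
messages at the prompt rather than a single failure.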

> Query hang when zookeeper is stopped
> 
>
> Key: DRILL-3751
> URL: https://issues.apache.org/jira/browse/DRILL-3751
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Priority: Critical
> Fix For: Future
>
>
> I see an indefinite hang at the sqlline prompt: issue a long-running query 
> and then stop the zookeeper process while the query is still being executed. 
> The sqlline prompt is never returned, and it hangs showing the stack trace 
> below. I am on master.
> Steps to reproduce the problem
> clush -g khurram service mapr-warden stop
> clush -g khurram service mapr-warden start
> Issue long running query from sqlline
> While query is running, stop zookeeper using script.
> To stop zookeeper 
> {code}
> [root@centos-01 bin]# ./zkServer.sh stop
> JMX enabled by default
> Using config: /opt/mapr/zookeeper/zookeeper-3.4.5/bin/../conf/zoo.cfg
> Stopping zookeeper ... STOPPED
> {code}
> Issue below long running query from sqlline
> {code}
> ./sqlline -u "jdbc:drill:schema=dfs.tmp"
> 0: jdbc:drill:schema=dfs.tmp> select * from `twoKeyJsn.json` limit 800;
> ...
> | 7.40907649723E8  | g|
> | 1.12378007695E9  | d|
> 03:03:28.482 [CuratorFramework-0] ERROR org.apache.curator.ConnectionState - 
> Connection timed out for connection string (10.10.100.201:5181) and timeout 
> (5000) / elapsed (5013)
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = 
> ConnectionLoss
>   at 
> org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198) 
> [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88) 
> [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
>  [curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:807)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:793)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:57)
>  [curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:275)
>  [curator-framework-2.5.0.jar:na]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
> [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> {code}
> Here is the stack for sqlline process
> {code}
> [root@centos-01 bin]# /usr/java/jdk1.7.0_45/bin/jstack 32136
> 2015-09-05 03:21:52
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x7f8328003800 nid=0x27f1 waiting on 
> condition [0x]
>java.lang.Thread.State: RUNNABLE
> "CuratorFramework-0-EventThread" daemon prio=10 tid=0x012fd800 
> nid=0x26e1 waiting on condition [0x7f8317c2e000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007e2117798> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>   at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:491)
> "CuratorFramework-0-SendThread(centos-01.qa.lab:5181)" daemon prio=10 
> 
