[jira] [Updated] (IMPALA-8945) Impala Doc: Incorrect Claim of Equivalence in Impala Docs

2019-09-13 Thread Alex Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8945:

Description: 
Reported by [~icook]

The Impala docs entry for the IS DISTINCT FROM operator states:

The <=> operator, used like an equality operator in a join query, is more 
efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
<=> operator can use a hash join, while the OR expression cannot.

But this expression is not equivalent to A <=> B. See the attached screenshot 
demonstrating their non-equivalence. An expression that is equivalent to A <=> 
B is this:

(A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))

 This expression should replace the existing incorrect expression.

Another expression that is equivalent to A <=> B is:

if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)

This one is a bit easier to follow. If you use this one in the docs, just 
replace the following line with:

The <=> operator can use a hash join, while the if expression cannot.

  was:
The Impala docs entry for the IS DISTINCT FROM operator states:

The <=> operator, used like an equality operator in a join query, is more 
efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
<=> operator can use a hash join, while the OR expression cannot.

But this expression is not equivalent to A <=> B. See the attached screenshot 
demonstrating their non-equivalence. An expression that is equivalent to A <=> 
B is this:

(A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))

 This expression should replace the existing incorrect expression.

Another expression that is equivalent to A <=> B is:

if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)

This one is a bit easier to follow. If you use this one in the docs, just 
replace the following line with:

The <=> operator can use a hash join, while the if expression cannot.


> Impala Doc: Incorrect Claim of Equivalence in Impala Docs
> -
>
> Key: IMPALA-8945
> URL: https://issues.apache.org/jira/browse/IMPALA-8945
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>
> Reported by [~icook]
> The Impala docs entry for the IS DISTINCT FROM operator states:
> The <=> operator, used like an equality operator in a join query, is more 
> efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
> <=> operator can use a hash join, while the OR expression cannot.
> But this expression is not equivalent to A <=> B. See the attached screenshot 
> demonstrating their non-equivalence. An expression that is equivalent to A 
> <=> B is this:
> (A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))
>  This expression should replace the existing incorrect expression.
> Another expression that is equivalent to A <=> B is:
> if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)
> This one is a bit easier to follow. If you use this one in the docs, just 
> replace the following line with:
> The <=> operator can use a hash join, while the if expression cannot.
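
The non-equivalence described above can be checked mechanically. The following is a minimal sketch, not taken from the Impala docs or code: it models SQL three-valued logic in Python with None standing in for NULL, and every helper name (sql_eq, sql_and, sql_or, null_safe_eq, old_doc_expr, proposed_expr, if_expr) is an assumption made for illustration.

```python
from itertools import product

# SQL three-valued logic, with None standing in for NULL.
def sql_eq(a, b):
    return None if a is None or b is None else a == b

def sql_and(x, y):
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def sql_or(x, y):
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

def null_safe_eq(a, b):
    # A <=> B: never evaluates to NULL.
    if a is None or b is None:
        return a is None and b is None
    return a == b

def old_doc_expr(a, b):
    # A = B OR (A IS NULL AND B IS NULL)
    return sql_or(sql_eq(a, b), sql_and(a is None, b is None))

def proposed_expr(a, b):
    # (A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))
    return sql_or(
        sql_and(a is None, b is None),
        sql_and(sql_and(a is not None, b is not None), sql_eq(a, b)),
    )

def if_expr(a, b):
    # if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)
    cond = sql_or(a is None, b is None)
    return sql_and(a is None, b is None) if cond else sql_eq(a, b)

values = [None, 1, 2]
mismatches = [
    (a, b)
    for a, b in product(values, repeat=2)
    if old_doc_expr(a, b) != null_safe_eq(a, b)
]
for a, b in mismatches:
    print(a, b, "old:", old_doc_expr(a, b), "<=>:", null_safe_eq(a, b))
```

Under this model, the old expression differs from A <=> B only when exactly one operand is NULL: it evaluates to NULL where <=> evaluates to FALSE. A WHERE or join predicate treats both results as "no match", so the difference shows up only when the value itself is selected or negated.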



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8945) Impala Doc: Incorrect Claim of Equivalence in Impala Docs

2019-09-13 Thread Alex Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8945:

Description: 
The Impala docs entry for the IS DISTINCT FROM operator states:

The <=> operator, used like an equality operator in a join query, is more 
efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
<=> operator can use a hash join, while the OR expression cannot.

But this expression is not equivalent to A <=> B. See the attached screenshot 
demonstrating their non-equivalence. An expression that is equivalent to A <=> 
B is this:

(A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))

 This expression should replace the existing incorrect expression.

Another expression that is equivalent to A <=> B is:

if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)

This one is a bit easier to follow. If you use this one in the docs, just 
replace the following line with:

The <=> operator can use a hash join, while the if expression cannot.

  was:
The Impala docs entry for the IS DISTINCT FROM operator states:

The <=> operator, used like an equality operator in a join query, is more 
efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
<=> operator can use a hash join, while the OR expression cannot.

But this expression is not equivalent to A <=> B. See the attached screenshot 
demonstrating their non-equivalence. An expression that is equivalent to A <=> 
B is this:

(A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))

 This expression should replace the existing incorrect expression.


> Impala Doc: Incorrect Claim of Equivalence in Impala Docs
> -
>
> Key: IMPALA-8945
> URL: https://issues.apache.org/jira/browse/IMPALA-8945
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>
> The Impala docs entry for the IS DISTINCT FROM operator states:
> The <=> operator, used like an equality operator in a join query, is more 
> efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
> <=> operator can use a hash join, while the OR expression cannot.
> But this expression is not equivalent to A <=> B. See the attached screenshot 
> demonstrating their non-equivalence. An expression that is equivalent to A 
> <=> B is this:
> (A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))
>  This expression should replace the existing incorrect expression.
> Another expression that is equivalent to A <=> B is:
> if(A IS NULL OR B IS NULL, A IS NULL AND B IS NULL, A = B)
> This one is a bit easier to follow. If you use this one in the docs, just 
> replace the following line with:
> The <=> operator can use a hash join, while the if expression cannot.






[jira] [Updated] (IMPALA-8945) Impala Doc: Incorrect Claim of Equivalence in Impala Docs

2019-09-13 Thread Alex Rodoni (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni updated IMPALA-8945:

Description: 
The Impala docs entry for the IS DISTINCT FROM operator states:

The <=> operator, used like an equality operator in a join query, is more 
efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
<=> operator can use a hash join, while the OR expression cannot.

But this expression is not equivalent to A <=> B. See the attached screenshot 
demonstrating their non-equivalence. An expression that is equivalent to A <=> 
B is this:

(A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))

 This expression should replace the existing incorrect expression.

> Impala Doc: Incorrect Claim of Equivalence in Impala Docs
> -
>
> Key: IMPALA-8945
> URL: https://issues.apache.org/jira/browse/IMPALA-8945
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
>
> The Impala docs entry for the IS DISTINCT FROM operator states:
> The <=> operator, used like an equality operator in a join query, is more 
> efficient than the equivalent clause: A = B OR (A IS NULL AND B IS NULL). The 
> <=> operator can use a hash join, while the OR expression cannot.
> But this expression is not equivalent to A <=> B. See the attached screenshot 
> demonstrating their non-equivalence. An expression that is equivalent to A 
> <=> B is this:
> (A IS NULL AND B IS NULL) OR ((A IS NOT NULL AND B IS NOT NULL) AND (A = B))
>  This expression should replace the existing incorrect expression.






[jira] [Created] (IMPALA-8945) Impala Doc: Incorrect Claim of Equivalence in Impala Docs

2019-09-13 Thread Alex Rodoni (Jira)
Alex Rodoni created IMPALA-8945:
---

 Summary: Impala Doc: Incorrect Claim of Equivalence in Impala Docs
 Key: IMPALA-8945
 URL: https://issues.apache.org/jira/browse/IMPALA-8945
 Project: IMPALA
  Issue Type: Bug
  Components: Docs
Reporter: Alex Rodoni
Assignee: Alex Rodoni








[jira] [Assigned] (IMPALA-8442) Clean up concurrency around query_status_

2019-09-13 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-8442:
-

Assignee: (was: Tim Armstrong)

> Clean up concurrency around query_status_
> -
>
> Key: IMPALA-8442
> URL: https://issues.apache.org/jira/browse/IMPALA-8442
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>
> The handling of concurrent access to ClientRequestState::query_status_ is 
> messy - it's exposed directly via query_status() and a lot of callers don't 
> hold ClientRequestState::lock_.
> This appears to be safe in many places for subtle reasons, e.g. because the 
> value won't be modified after a certain point in the query lifecycle, so it's 
> not dangerous to access it when logging an audit record. However, this is 
> brittle and likely to lead to bugs if we change the code around it 
> significantly.
> We could approach this in various ways, e.g. updating callers to consistently 
> acquire the lock and/or documenting invariants around when it's safe to read it 
> without holding the lock. Or we could make Status thread-safe, or have a 
> thread-safe Status wrapper.
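
The last option, a thread-safe Status wrapper, could look roughly like the sketch below. This is only an illustration in Python (the Impala backend is C++), not the actual ClientRequestState code: the class name ThreadSafeStatus, its methods, and the "first error wins" policy are all assumptions. The point is that callers never touch the status directly, so nobody can forget to take the lock.

```python
import threading

class ThreadSafeStatus:
    """Hypothetical wrapper: every read and write of the status takes the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._error = None  # None means OK

    def ok(self):
        with self._lock:
            return self._error is None

    def get(self):
        with self._lock:
            return self._error

    def set_if_ok(self, error):
        # Assumed policy: keep the first error, ignore later ones.
        # Returns the status that is in effect after the call.
        with self._lock:
            if self._error is None:
                self._error = error
            return self._error

status = ThreadSafeStatus()
seen = []

def worker(i):
    seen.append(status.set_if_ok(f"error from worker {i}"))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(status.get())
```

Whichever worker gets the lock first sets the status; every other worker observes that same value instead of overwriting it.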






[jira] [Resolved] (IMPALA-8932) impala shell shouldn't retry with kerberos when connecting over http

2019-09-13 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8932.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> impala shell shouldn't retry with kerberos when connecting over http
> 
>
> Key: IMPALA-8932
> URL: https://issues.apache.org/jira/browse/IMPALA-8932
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
> Fix For: Impala 3.4.0
>
>
> {noformat}
> Error connecting: EOFError, 
> Kerberos ticket found in the credentials cache, retrying the connection with 
> a secure transport.
> Warning: --connect_timeout_ms is currently ignored with HTTP transport.
> Kerberos not supported with HTTP endpoints.
> Error connecting: NotImplementedError, 
> {noformat}
> The NotImplementedError is confusing.






[jira] [Commented] (IMPALA-8932) impala shell shouldn't retry with kerberos when connecting over http

2019-09-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929506#comment-16929506
 ] 

ASF subversion and git services commented on IMPALA-8932:
-

Commit e070dbb02c264e26df576b40b8b792d02619fc60 in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e070dbb ]

IMPALA-8932: addendum - protocol var not defined

Change-Id: I75c41a02bc7f1314e48bb5a39b945119264ce478
Reviewed-on: http://gerrit.cloudera.org:8080/14225
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> impala shell shouldn't retry with kerberos when connecting over http
> 
>
> Key: IMPALA-8932
> URL: https://issues.apache.org/jira/browse/IMPALA-8932
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Critical
>
> {noformat}
> Error connecting: EOFError, 
> Kerberos ticket found in the credentials cache, retrying the connection with 
> a secure transport.
> Warning: --connect_timeout_ms is currently ignored with HTTP transport.
> Kerberos not supported with HTTP endpoints.
> Error connecting: NotImplementedError, 
> {noformat}
> The NotImplementedError is confusing.






[jira] [Comment Edited] (IMPALA-8944) Update and re-enable S3PlannerTest

2019-09-13 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929482#comment-16929482
 ] 

Sahil Takiar edited comment on IMPALA-8944 at 9/13/19 7:42 PM:
---

Looks like there was an attempt to only run fe tests beginning with {{S3*}} if 
the target filesystem is S3 - 
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L198 - 
however it looks like the logic is broken by 
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L205 - the 
second filter cancels out the first, which is why this doesn't work. 


was (Author: stakiar):
Looks like there was an attempt to only run fe tests beginning with {{S3*}} the 
target filesystem is S3 - 
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L198 - 
however it looks like the logic is broken by 
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L205 - the 
second filter cancels the first out out, which is why this doesn't work. 

> Update and re-enable S3PlannerTest
> --
>
> Key: IMPALA-8944
> URL: https://issues.apache.org/jira/browse/IMPALA-8944
> Project: IMPALA
>  Issue Type: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> It looks like we don't run {{S3PlannerTest}} in our regular Jenkins jobs. 
> When run against a HDFS mini-cluster, they are skipped because the 
> {{TARGET_FILESYSTEM}} is not S3. On our S3 jobs, they don't run either 
> because we skip all fe/ tests (most of them don't work against S3 / assume 
> they are running on HDFS).
> A few things need to be fixed to get this working:
> * The test cases in {{S3PlannerTest}} need to be fixed
> * The Jenkins job that runs the S3 tests needs the ability to run specific 
> fe/ tests (e.g. just {{S3PlannerTest}}) and to skip the rest






[jira] [Commented] (IMPALA-8944) Update and re-enable S3PlannerTest

2019-09-13 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929482#comment-16929482
 ] 

Sahil Takiar commented on IMPALA-8944:
--

Looks like there was an attempt to only run fe tests beginning with {{S3*}} if 
the target filesystem is S3 - 
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L198 - 
however it looks like the logic is broken by 
https://github.com/apache/impala/blob/master/bin/run-all-tests.sh#L205 - the 
second filter cancels out the first, which is why this doesn't work. 

> Update and re-enable S3PlannerTest
> --
>
> Key: IMPALA-8944
> URL: https://issues.apache.org/jira/browse/IMPALA-8944
> Project: IMPALA
>  Issue Type: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> It looks like we don't run {{S3PlannerTest}} in our regular Jenkins jobs. 
> When run against a HDFS mini-cluster, they are skipped because the 
> {{TARGET_FILESYSTEM}} is not S3. On our S3 jobs, they don't run either 
> because we skip all fe/ tests (most of them don't work against S3 / assume 
> they are running on HDFS).
> A few things need to be fixed to get this working:
> * The test cases in {{S3PlannerTest}} need to be fixed
> * The Jenkins job that runs the S3 tests needs the ability to run specific 
> fe/ tests (e.g. just {{S3PlannerTest}}) and to skip the rest






[jira] [Updated] (IMPALA-8944) Update and re-enable S3PlannerTest

2019-09-13 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated IMPALA-8944:
-
Description: 
It looks like we don't run {{S3PlannerTest}} in our regular Jenkins jobs. When 
run against a HDFS mini-cluster, they are skipped because the 
{{TARGET_FILESYSTEM}} is not S3. On our S3 jobs, they don't run either because 
we skip all fe/ tests (most of them don't work against S3 / assume they are 
running on HDFS).

A few things need to be fixed to get this working:
* The test cases in {{S3PlannerTest}} need to be fixed
* The Jenkins job that runs the S3 tests needs the ability to run specific fe/ 
tests (e.g. just {{S3PlannerTest}}) and to skip the rest

  was:
It looks like we don't run {{S3PlannerTest}} in our regular Jenkins jobs. When 
run against a HDFS mini-cluster, they are skipped by the {{TARGET_FILESYSTEM}} 
is not S3. On our S3 jobs, they don't run either because we skip all fe/ tests 
(most of them don't work against S3 / assume they are running on HDFS).

A few things need to be fixed to get this working:
* The test cases in {{S3PlannerTest}} need to be fixed
* The Jenkins jobs that runs the S3 tests needs the ability to run specific fe/ 
tests (e.g. just the {{S3PlannerTest}} and to skip the rest)


> Update and re-enable S3PlannerTest
> --
>
> Key: IMPALA-8944
> URL: https://issues.apache.org/jira/browse/IMPALA-8944
> Project: IMPALA
>  Issue Type: Test
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> It looks like we don't run {{S3PlannerTest}} in our regular Jenkins jobs. 
> When run against a HDFS mini-cluster, they are skipped because the 
> {{TARGET_FILESYSTEM}} is not S3. On our S3 jobs, they don't run either 
> because we skip all fe/ tests (most of them don't work against S3 / assume 
> they are running on HDFS).
> A few things need to be fixed to get this working:
> * The test cases in {{S3PlannerTest}} need to be fixed
> * The Jenkins job that runs the S3 tests needs the ability to run specific 
> fe/ tests (e.g. just {{S3PlannerTest}}) and to skip the rest






[jira] [Created] (IMPALA-8944) Update and re-enable S3PlannerTest

2019-09-13 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-8944:


 Summary: Update and re-enable S3PlannerTest
 Key: IMPALA-8944
 URL: https://issues.apache.org/jira/browse/IMPALA-8944
 Project: IMPALA
  Issue Type: Test
Reporter: Sahil Takiar
Assignee: Sahil Takiar


It looks like we don't run {{S3PlannerTest}} in our regular Jenkins jobs. When 
run against a HDFS mini-cluster, they are skipped by the {{TARGET_FILESYSTEM}} 
is not S3. On our S3 jobs, they don't run either because we skip all fe/ tests 
(most of them don't work against S3 / assume they are running on HDFS).

A few things need to be fixed to get this working:
* The test cases in {{S3PlannerTest}} need to be fixed
* The Jenkins jobs that runs the S3 tests needs the ability to run specific fe/ 
tests (e.g. just the {{S3PlannerTest}} and to skip the rest)







[jira] [Commented] (IMPALA-8942) Set file format specific values for split sizes on non-block stores

2019-09-13 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929451#comment-16929451
 ] 

Sahil Takiar commented on IMPALA-8942:
--

The fix for this is relatively simple, but testing with the current infra seems 
a bit tricky. Ideally, {{S3PlannerTest}} would work here, but it seems that 
class has rotted since we don't run it on a regular basis.

The alternative is to write one-off unit tests using mocks, but it would be 
better if we used {{S3PlannerTest}}.

> Set file format specific values for split sizes on non-block stores
> ---
>
> Key: IMPALA-8942
> URL: https://issues.apache.org/jira/browse/IMPALA-8942
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Parquet scans on non-block based storage systems (e.g. S3, ADLS, etc.) can 
> suffer from uneven scan range assignment due to the behavior described in 
> IMPALA-3453. The frontend should set different split sizes depending on the 
> file type and file system.






[jira] [Commented] (IMPALA-8634) Catalog client should be resilient to temporary Catalog outage

2019-09-13 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929371#comment-16929371
 ] 

Sahil Takiar commented on IMPALA-8634:
--

Actually, I think if we just do the exact same thing as IMPALA-8904, it should 
all work.

> Catalog client should be resilient to temporary Catalog outage
> --
>
> Key: IMPALA-8634
> URL: https://issues.apache.org/jira/browse/IMPALA-8634
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Sahil Takiar
>Priority: Critical
>
> Currently, when the catalog server is down, catalog clients will fail all 
> RPCs sent to it. In essence, DDL queries will fail and the Impala service 
> becomes a lot less functional. Catalog clients should consider retrying 
> failed RPCs with some exponential backoff in between while catalog server is 
> being restarted after crashing. We probably need to add [a test 
> |https://github.com/apache/impala/blob/master/tests/custom_cluster/test_restart_services.py]
>  to exercise the paths of catalog restart to verify coordinators are 
> resilient to it.
> cc'ing [~stakiar], [~joemcdonnell], [~twm378]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8634) Catalog client should be resilient to temporary Catalog outage

2019-09-13 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929353#comment-16929353
 ] 

Sahil Takiar commented on IMPALA-8634:
--

The existing code actually already does this. The flags 
{{catalog_client_connection_num_retries}} and 
{{catalog_client_rpc_retry_interval_ms}} control the number of times the client 
tries to re-connect to the catalog.

The issue is that connection establishment is retried, but individual RPCs are 
not retried (unless the RPC hits a connection reset). So a fix would be to use 
{{DoRpcWithRetry}} instead of {{DoRpc}} (similar to what was done in 
IMPALA-8904).

There is some odd behavior with the retry logic though. If there is a cached 
client connection, the catalogd crashes, and then a query runs, the impalad 
will retry the connection {{2 * catalog_client_connection_num_retries}} times 
because both the RPC and the connection establishment are retried. One way 
to fix this would be to remove the connection establishment retry and let the 
RPC retry handle all retries. The issue is that the way the code is written, 
that means any attempt to establish a new connection won't be retried (if it 
uses a cached connection it will be retried).

Ideally, the following scenarios are handled correctly (e.g. each is retried 
exactly {{catalog_client_connection_num_retries}} times):
* New connection establishment
* Cached connection resets
* RPC failures

Would be nice to rename {{catalog_client_connection_num_retries}} to 
{{catalog_client_rpc_num_retries}} as well.
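
One way to picture a single retry budget that covers all three scenarios is sketched below. This is a Python illustration only, not the implementation of {{DoRpcWithRetry}}: the function do_rpc_with_retry, its parameters, the CatalogUnavailable exception and the exponential backoff schedule are assumptions, and max_attempts only loosely stands in for {{catalog_client_connection_num_retries}}. Connecting and issuing the RPC happen inside the same loop, so a failure in either step consumes one attempt from the same budget instead of multiplying the retries.

```python
import time

class CatalogUnavailable(Exception):
    """Hypothetical error for a failed connect, a connection reset, or a failed RPC."""

def do_rpc_with_retry(connect, rpc, max_attempts, base_interval_s=0.1, sleep=time.sleep):
    # connect() returns a (new or cached) connection; rpc(conn) issues the call.
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            conn = connect()
            return rpc(conn)
        except CatalogUnavailable:
            if attempt == max_attempts - 1:
                raise  # budget exhausted, surface the last error
            # Assumed backoff: double the wait after every failed attempt.
            sleep(base_interval_s * 2 ** attempt)

# Usage: the catalog is unreachable for the first two attempts.
calls = {"connect": 0, "rpc": 0}
delays = []

def connect():
    calls["connect"] += 1
    if calls["connect"] <= 2:
        raise CatalogUnavailable("catalogd is restarting")
    return "connection"

def get_catalog_version(conn):
    calls["rpc"] += 1
    return 42

result = do_rpc_with_retry(
    connect, get_catalog_version, max_attempts=3,
    base_interval_s=1.0, sleep=delays.append,
)
print(result, calls, delays)
```

Passing sleep as a parameter keeps the example fast and makes the backoff schedule easy to inspect in a test.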

> Catalog client should be resilient to temporary Catalog outage
> --
>
> Key: IMPALA-8634
> URL: https://issues.apache.org/jira/browse/IMPALA-8634
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Michael Ho
>Assignee: Sahil Takiar
>Priority: Critical
>
> Currently, when the catalog server is down, catalog clients will fail all 
> RPCs sent to it. In essence, DDL queries will fail and the Impala service 
> becomes a lot less functional. Catalog clients should consider retrying 
> failed RPCs with some exponential backoff in between while catalog server is 
> being restarted after crashing. We probably need to add [a test 
> |https://github.com/apache/impala/blob/master/tests/custom_cluster/test_restart_services.py]
>  to exercise the paths of catalog restart to verify coordinators are 
> resilient to it.
> cc'ing [~stakiar], [~joemcdonnell], [~twm378]


