[jira] [Resolved] (IMPALA-691) Process mem limit does not account for the JVM's memory usage
[ https://issues.apache.org/jira/browse/IMPALA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-691.
----------------------------------
    Resolution: Fixed

https://gerrit.cloudera.org/#/c/12262/ actually fixed this for containers already by changing the default to include the JVM memory.

> Process mem limit does not account for the JVM's memory usage
> -------------------------------------------------------------
>
>                 Key: IMPALA-691
>                 URL: https://issues.apache.org/jira/browse/IMPALA-691
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 1.2.1, Impala 2.0, Impala 2.1, Impala 2.2, Impala 2.3.0
>            Reporter: Skye Wanderman-Milne
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: incompatibility, resource-management
>
> The JVM doesn't appear to use malloc, so its memory usage is not reported by
> tcmalloc and we do not count it in the process mem limit. I verified this by
> adding a large allocation in the FE, and noting that the total memory usage
> (virtual or resident) reported in /memz is not affected, but the virtual and
> resident memory usage reported by top is.
> This is especially problematic because Impala caches table metadata in the FE
> (JVM), which can become quite big (a few GBs) in extreme cases.
> *Workaround*
> As a workaround, we recommend reducing the process memory limit by 1-2GB to
> "reserve" memory for the JVM. How much memory you should reserve typically
> depends on the size of your catalog (number of
> tables/partitions/columns/blocks etc.)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Reopened] (IMPALA-691) Process mem limit does not account for the JVM's memory usage
[ https://issues.apache.org/jira/browse/IMPALA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reopened IMPALA-691:
----------------------------------
[jira] [Issue Comment Deleted] (IMPALA-691) Process mem limit does not account for the JVM's memory usage
[ https://issues.apache.org/jira/browse/IMPALA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-691:
---------------------------------
    Comment: was deleted

(was: https://gerrit.cloudera.org/#/c/12262/ actually fixed this for containers already by changing the default to include the JVM memory.)
[jira] [Resolved] (IMPALA-7940) Automatically set process memory limit to a good value when running in a container
[ https://issues.apache.org/jira/browse/IMPALA-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-7940.
-----------------------------------
    Resolution: Duplicate

IMPALA-7941 changed the default for containers.

> Automatically set process memory limit to a good value when running in a
> container
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-7940
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7940
>             Project: IMPALA
>          Issue Type: Epic
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: admission-control, resource-management
>
> It would be convenient if, when starting up an Impala daemon in a container,
> it could automatically configure its own memory limit to the right value so
> that queries won't be overadmitted. It should sniff out the memory limit for
> the container (if there is one set).
> E.g. see what Java did https://www.opsian.com/blog/java-on-docker/
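The "sniff out the memory limit for the container" idea above amounts to reading the cgroup limit file and ignoring the "no limit" sentinel. A minimal Python sketch, assuming the cgroup v1 path and the sentinel threshold (this is an illustration, not Impala's actual implementation, which is in C++):

```python
def container_memory_limit(path="/sys/fs/cgroup/memory/memory.limit_in_bytes"):
    """Return the container's memory limit in bytes, or None if unset.

    The default path is the cgroup v1 location (an assumption; cgroup v2
    uses memory.max). When no limit is configured, cgroup v1 reports a
    huge sentinel value, which we treat as "unlimited".
    """
    try:
        with open(path) as f:
            limit = int(f.read().strip())
    except (OSError, ValueError):
        return None  # file missing/unreadable: not in a limited container
    if limit >= 1 << 60:  # sentinel threshold (assumption)
        return None
    return limit
```

A daemon could then derive its process mem limit as a fraction of this value, falling back to total system memory when it returns None.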
[jira] [Assigned] (IMPALA-8135) Bump maven-surefire-plugin version to at least 2.19 to support running a single parameterized test
[ https://issues.apache.org/jira/browse/IMPALA-8135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang reassigned IMPALA-8135:
--------------------------------------
    Assignee: Quanlong Huang

> Bump maven-surefire-plugin version to at least 2.19 to support running a
> single parameterized test
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-8135
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8135
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> Our current version of maven-surefire-plugin is 2.18, which does not support
> running a single parameterized test. For example, AuthorizationTest is a
> parameterized test class to support tests on file-based auth and Sentry-based
> auth together. Currently, we can't run a single test method like:
> {code:java}
> (pushd fe && mvn test -Dtest=AuthorizationTest#TestDescribeTableResults[*]){code}
> I'm in branch-2.x. My head is e288128ba and I'm cherry-picking 9282fa3ba.
> AuthorizationTest#TestDescribeTableResults is the only test failure that
> needs fixing.
> We need to upgrade maven-surefire-plugin to at least 2.19 to support this:
> http://maven.apache.org/surefire/maven-surefire-plugin/examples/single-test.html#Multiple_Formats_in_One
[jira] [Commented] (IMPALA-2998) impala-shell -B and --output_delimiter does not work if string contains delimiter or TABs
[ https://issues.apache.org/jira/browse/IMPALA-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755008#comment-16755008 ]

David Palmer commented on IMPALA-2998:
--------------------------------------

Sorry, I'm confused. I've experienced this behaviour with impala-shell using impalad version 2.11.0-cdh5.14.2, and I've read through the linked issues. Is it fixed (i.e. impala pointing to hiveserver2) in a later version, or is this something that will not be fixed in impala-shell? It seems to be OK in Impala in Hue.

> impala-shell -B and --output_delimiter does not work if string contains
> delimiter or TABs
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-2998
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2998
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Clients
>    Affects Versions: Impala 2.3.0
>            Reporter: Eric Lin
>            Priority: Major
>              Labels: impala-shell
>
> See the test case below:
> {code}
> impala-shell -q "DROP TABLE IF EXISTS tabtest";
> impala-shell -q "CREATE TABLE tabtest(col1 string, col2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','";
> impala-shell -q 'INSERT OVERWRITE TABLE tabtest VALUES ("test", "\t\t\tTest"), ("test2", "Test\t\t\tTest"), ("test3", "test\tTest"), ("test4", "test,Test");';
> impala-shell -o out.csv -q "SELECT * FROM tabtest" --output_delimiter="," -B
> cat out.csv
> {code}
> The output looks like below:
> {code}
> testTest
> test2,Test,,,Test
> test3,test,Test
> test4,test
> {code}
> So two issues I can see here:
> 1. When strings contain TABs, all tabs are replaced by the delimiter.
> 2. If a string contains the delimiter, the data after the delimiter is lost (see "test4"). According to the doc
> http://www.cloudera.com/documentation/enterprise/latest/topics/impala_shell_options.html:
> {quote}
> If an output value contains the delimiter character, that field is quoted and/or escaped
> {quote}
> By looking at the underlying data:
> {code}
> hadoop fs -cat /user/hive/warehouse/tabtest/ba44046cba4c7c80-d6c4c08afd8c0cb0_1055158928_data.0.
> test, Test
> test2,TestTest
> test3,testTest
> test4,test,Test
> {code}
> Data is not stored properly, as those strings that contain delimiter characters should be in quotes.
> This is both a data write and a read/parse issue.
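The behaviour the quoted documentation promises ("that field is quoted and/or escaped") is what a standards-following CSV writer does. A hypothetical illustration with Python's csv module, showing how the "test4" row should round-trip without data loss (this is not impala-shell's actual code path):

```python
import csv
import io

# Sample rows from the bug report; the second column of "test4"
# contains the output delimiter itself.
rows = [("test4", "test,Test"), ("test3", "test\tTest")]

buf = io.StringIO()
# QUOTE_MINIMAL quotes only fields that contain the delimiter, the
# quote character, or a line terminator -- tabs are left alone.
writer = csv.writer(buf, delimiter=",", quoting=csv.QUOTE_MINIMAL)
writer.writerows(rows)
out = buf.getvalue()
# "test4" is written as: test4,"test,Test"  -- nothing after the
# embedded comma is lost, and a reader recovers the original fields.
```

Parsing `out` back with `csv.reader` returns the original two rows intact, which is exactly what the truncated out.csv in the report fails to do.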
[jira] [Created] (IMPALA-8136) Investigate NoClassDefFoundError when running maven tests
Quanlong Huang created IMPALA-8136:
--------------------------------------

             Summary: Investigate NoClassDefFoundError when running maven tests
                 Key: IMPALA-8136
                 URL: https://issues.apache.org/jira/browse/IMPALA-8136
             Project: IMPALA
          Issue Type: Task
            Reporter: Quanlong Huang

I encountered a NoClassDefFoundError when running maven tests. It was resolved unexpectedly by restarting the mini-cluster. My operations were:
# ./buildall.sh in a clean clone
# kill the processes after backend tests and frontend tests finish
# run FE tests again like
{code:java}
(pushd fe && mvn test -Dtest=AuthorizationTest){code}
Then I encountered the following error:
{code:java}
Running org.apache.impala.analysis.AuthorizationTest
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.008 sec <<< FAILURE! - in org.apache.impala.analysis.AuthorizationTest
initializationError(org.apache.impala.analysis.AuthorizationTest)  Time elapsed: 0.006 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class org.apache.impala.analysis.AuthorizationTest
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
        at org.junit.runners.Parameterized.allParameters(Parameterized.java:280)
        at org.junit.runners.Parameterized.(Parameterized.java:248)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at
java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:104) at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:86) at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59) at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:26) at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59) at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:33) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) Results : Tests in error: AuthorizationTest.initializationError » NoClassDefFound Could not initialize c... Tests run: 1, Failures: 0, Errors: 1, Skipped: 0 [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 19.717 s [INFO] Finished at: 2019-01-29T03:38:16-08:00 [INFO] Final Memory: 116M/2090M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18:test (default-test) on project impala-frontend: There are test failures. [ERROR] [ERROR] Please refer to /tmp/jenkins/workspace/impala-hulu/logs/fe_tests for the individual test results. [ERROR] -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
[ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException {code}

I feel like something is wrong in pom.xml, since it compiles successfully but fails at runtime. After I restarted the mini-cluster with testdata/bin/run-all.sh, the error was resolved.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (IMPALA-8137) Order by docs incorrectly state that order by happens on one node
Tim Armstrong created IMPALA-8137:
-------------------------------------

             Summary: Order by docs incorrectly state that order by happens on one node
                 Key: IMPALA-8137
                 URL: https://issues.apache.org/jira/browse/IMPALA-8137
             Project: IMPALA
          Issue Type: Documentation
          Components: Docs
            Reporter: Tim Armstrong
            Assignee: Alex Rodoni

https://impala.apache.org/docs/build/html/topics/impala_order_by.html

"because the entire result set must be produced and transferred to one node before the sorting can happen." is incorrect. If there is an "ORDER BY" clause in a select block, the data is first sorted locally by each Impala daemon, then streamed to the coordinator, which merges the sorted result sets.
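The corrected behaviour, local sort per daemon followed by a streaming merge at the coordinator, can be shown with a toy sketch (the three-daemon setup and data are hypothetical, purely to illustrate the merge step):

```python
import heapq

# Each "daemon" has already sorted its own partition of the result set.
daemon_results = [
    ["apple", "pear"],
    ["banana", "zebra"],
    ["cherry", "mango"],
]

# The coordinator only merges the pre-sorted streams lazily; it never
# materializes and re-sorts the entire result set on one node.
merged = list(heapq.merge(*daemon_results))
```

`heapq.merge` consumes the input streams incrementally, which mirrors why the coordinator's memory footprint is bounded by the number of streams rather than the total result size.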
[jira] [Updated] (IMPALA-8137) Order by docs incorrectly state that order by happens on one node
[ https://issues.apache.org/jira/browse/IMPALA-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-8137:
----------------------------------
    Target Version: Impala 3.2.0
[jira] [Work started] (IMPALA-8097) Experimental flag for running all queries with mt_dop
[ https://issues.apache.org/jira/browse/IMPALA-8097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-8097 started by Tim Armstrong.
---------------------------------------------

> Experimental flag for running all queries with mt_dop
> -----------------------------------------------------
>
>                 Key: IMPALA-8097
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8097
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> It's possible to execute queries with joins and inserts with mt_dop with some
> small modifications to the Impala code (without the separate join build
> work). This isn't production-ready because of the other subtasks of
> IMPALA-3902, but it would be useful to have an experimental flag for people to
> play around with to test the functionality.
[jira] [Commented] (IMPALA-8136) Investigate NoClassDefFoundError when running maven tests
[ https://issues.apache.org/jira/browse/IMPALA-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755201#comment-16755201 ]

Fredy Wijaya commented on IMPALA-8136:
--------------------------------------

[~stiga-huang] is this an error in the 2.x branch or master? The reason for the NoClassDefFoundError is probably an exception thrown in the static block: https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java#L139-L160

> Investigate NoClassDefFoundError when running maven tests
> ---------------------------------------------------------
>
>                 Key: IMPALA-8136
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8136
>             Project: IMPALA
>          Issue Type: Task
>            Reporter: Quanlong Huang
>            Priority: Major
[jira] [Updated] (IMPALA-8112) test_cancel_select with debug action failed with unexpected error
[ https://issues.apache.org/jira/browse/IMPALA-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Sherman updated IMPALA-8112:
-----------------------------------
    Priority: Major  (was: Critical)

> test_cancel_select with debug action failed with unexpected error
> -----------------------------------------------------------------
>
>                 Key: IMPALA-8112
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8112
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.2.0
>            Reporter: Michael Brown
>            Assignee: Andrew Sherman
>            Priority: Major
>              Labels: flaky
>
> Stacktrace
> {noformat}
> query_test/test_cancellation.py:241: in test_cancel_select
>     self.execute_cancel_test(vector)
> query_test/test_cancellation.py:213: in execute_cancel_test
>     assert 'Cancelled' in str(thread.fetch_results_error)
> E   assert 'Cancelled' in "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected (error 107)\n"
> E   + where "ImpalaBeeswaxException:\n INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected (error 107)\n" = str(ImpalaBeeswaxException())
> E   + where ImpalaBeeswaxException() = 140481071658752)>.fetch_results_error
> {noformat}
> Standard Error
> {noformat}
> SET client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> -- executing against localhost:21000
> use tpch_kudu;
> -- 2019-01-18 17:50:03,100 INFO MainThread: Started query 4e4b3ab4cc7d:11efc3f5
> SET client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET cpu_limit_s=10;
> SET debug_action=0:GETNEXT:WAIT|COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL;
> SET exec_single_node_rows_threshold=0;
> SET buffer_pool_limit=0;
> -- executing async: localhost:21000
> select l_returnflag from lineitem;
> -- 2019-01-18 17:50:03,139 INFO MainThread: Started query fa4ddb9e62a01240:54c86ad
> SET client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action;
> -- connecting to: localhost:21000
> -- fetching results from: object at 0x6235e90>
> -- getting state for operation:
> -- canceling operation: object at 0x6235e90>
> -- 2019-01-18 17:50:08,196 INFO Thread-4: Starting new HTTP connection (1): localhost
> -- closing query for operation handle:
> {noformat}
> [~asherman] please take a look since it looks like you touched code around this area last.
[jira] [Created] (IMPALA-8138) Re-introduce rpc debugging options
Thomas Tauber-Marshall created IMPALA-8138:
----------------------------------------------

             Summary: Re-introduce rpc debugging options
                 Key: IMPALA-8138
                 URL: https://issues.apache.org/jira/browse/IMPALA-8138
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 3.2.0
            Reporter: Thomas Tauber-Marshall
            Assignee: Thomas Tauber-Marshall

In the past, we had fault injection options for backend rpcs implemented in ImpalaBackendClient. With the move to krpc, we lost some of those options. We should re-introduce an equivalent mechanism for our backend krpc calls to make it easy to simulate various rpc failure scenarios.
[jira] [Updated] (IMPALA-8137) Order by docs incorrectly state that order by happens on one node
[ https://issues.apache.org/jira/browse/IMPALA-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Rodoni updated IMPALA-8137:
--------------------------------
    Issue Type: Bug  (was: Documentation)
[jira] [Commented] (IMPALA-8138) Re-introduce rpc debugging options
[ https://issues.apache.org/jira/browse/IMPALA-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755333#comment-16755333 ]

Michael Ho commented on IMPALA-8138:
------------------------------------

We should consider switching to using DebugAction for the fault injection as it's more generic, instead of having two mechanisms.
[jira] [Updated] (IMPALA-8138) Re-introduce rpc debugging options
[ https://issues.apache.org/jira/browse/IMPALA-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Ho updated IMPALA-8138:
-------------------------------
    Component/s: Distributed Exec
[jira] [Updated] (IMPALA-8134) Update docs to reflect CGroups memory limit changes
[ https://issues.apache.org/jira/browse/IMPALA-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Rodoni updated IMPALA-8134:
--------------------------------
    Labels: future_release_doc in_32  (was: )

> Update docs to reflect CGroups memory limit changes
> ---------------------------------------------------
>
>                 Key: IMPALA-8134
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8134
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Docs
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: future_release_doc, in_32
[jira] [Created] (IMPALA-8139) Report DML stats incrementally
Thomas Tauber-Marshall created IMPALA-8139:
----------------------------------------------

             Summary: Report DML stats incrementally
                 Key: IMPALA-8139
                 URL: https://issues.apache.org/jira/browse/IMPALA-8139
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend, Distributed Exec
    Affects Versions: Impala 3.2.0
            Reporter: Thomas Tauber-Marshall

Impala collects some stats related to DML execution. Currently, these are reported back to the coordinator (in a DmlExecStatusPB) only with the final status report, as it's tricky to report them in an idempotent way.

With IMPALA-4555, we're introducing functionality for portions of the status report to be non-idempotent. We can use this mechanism to report the DML stats incrementally during query execution, instead of once at the end, which is useful for user visibility into the status of running queries.
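One common way to make non-idempotent incremental reports safe is to tag each report with a per-instance sequence number and drop anything already applied. A hypothetical coordinator-side sketch (class, field, and method names are invented for illustration, not Impala's actual protocol):

```python
class DmlStatsAggregator:
    """Coordinator-side sketch: apply per-instance stat deltas at most once
    by remembering the last sequence number seen per fragment instance."""

    def __init__(self):
        self.rows_inserted = 0
        self._last_seq = {}  # instance id -> last applied report sequence number

    def apply_report(self, instance_id, seq_no, rows_delta):
        # A retransmitted or reordered report carries a stale sequence
        # number and is dropped, so applying the delta twice is impossible.
        if seq_no <= self._last_seq.get(instance_id, -1):
            return False
        self._last_seq[instance_id] = seq_no
        self.rows_inserted += rows_delta
        return True
```

The sequence-number guard is what turns a non-idempotent delta ("add N rows") into something safe to retry over an unreliable status-report RPC.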
[jira] [Created] (IMPALA-8140) Grouping aggregation with limit breaks asan build
Lars Volker created IMPALA-8140: --- Summary: Grouping aggregation with limit breaks asan build Key: IMPALA-8140 URL: https://issues.apache.org/jira/browse/IMPALA-8140 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0, Impala 3.2.0 Reporter: Lars Volker Assignee: Lars Volker Commit 4af3a7853e9 for IMPALA-7333 breaks the following query on ASAN: {code:sql} select count(*) from tpch_parquet.orders o group by o.o_clerk limit 10; {code} {noformat} ==30219==ERROR: AddressSanitizer: use-after-poison on address 0x631000c4569c at pc 0x020163cc bp 0x7f73a12a5700 sp 0x7f73a12a56f8 READ of size 1 at 0x631000c4569c thread T276 #0 0x20163cb in impala::Tuple::IsNull(impala::NullIndicatorOffset const&) const /tmp/be/src/runtime/tuple.h:241:13 #1 0x280c3d1 in impala::AggFnEvaluator::SerializeOrFinalize(impala::Tuple*, impala::SlotDescriptor const&, impala::Tuple*, void*) /tmp/be/src/exprs/agg-fn-evaluator.cc:393:29 #2 0x2777bc8 in impala::AggFnEvaluator::Finalize(std::vector > const&, impala::Tuple*, impala::Tuple*) /tmp/be/src/exprs/agg-fn-evaluator.h:307:15 #3 0x27add96 in impala::GroupingAggregator::CleanupHashTbl(std::vector > const&, impala::HashTable::Iterator) /tmp/be/src/exec/grouping-aggregator.cc:351:7 #4 0x27ae2b2 in impala::GroupingAggregator::ClosePartitions() /tmp/be/src/exec/grouping-aggregator.cc:930:5 #5 0x27ae5f4 in impala::GroupingAggregator::Close(impala::RuntimeState*) /tmp/be/src/exec/grouping-aggregator.cc:383:3 #6 0x27637f7 in impala::AggregationNode::Close(impala::RuntimeState*) /tmp/be/src/exec/aggregation-node.cc:139:32 #7 0x206b7e9 in impala::FragmentInstanceState::Close() /tmp/be/src/runtime/fragment-instance-state.cc:368:42 #8 0x2066b1a in impala::FragmentInstanceState::Exec() /tmp/be/src/runtime/fragment-instance-state.cc:99:3 #9 0x2080e12 in impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) /tmp/be/src/runtime/query-state.cc:584:24 #10 0x1d79036 in boost::function0::operator()() const 
/opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:766:14 #11 0x24bbe06 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function, impala::ThreadDebugInfo const*, impala::Promise*) /tmp/be/src/util/thread.cc:359:3 #12 0x24c72f8 in void boost::_bi::list5, boost::_bi::value, boost::_bi::value >, boost::_bi::value, boost::_bi::value*> >::operator(), impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, std::string const&, boost::function, impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, int) /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525:9 #13 0x24c714b in boost::_bi::bind_t, impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list5, boost::_bi::value, boost::_bi::value >, boost::_bi::value, boost::_bi::value*> > >::operator()() /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20:16 #14 0x3c83949 in thread_proxy (/home/lv/i4/be/build/debug/service/impalad+0x3c83949) #15 0x7f768ce73183 in start_thread /build/eglibc-ripdx6/eglibc-2.19/nptl/pthread_create.c:312 #16 0x7f768c98a03c in clone /build/eglibc-ripdx6/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:111 {noformat} The problem seems to be that we call {{output_partition_->aggregated_row_stream->Close()}} in be/src/exec/grouping-aggregator.cc:284 when hitting the limit, and then later the tuple creation in {{CleanupHashTbl()}} in be/src/exec/grouping-aggregator.cc:341 reads from poisoned memory. A similar query does not show the crash: {code:sql} select count(*) from functional_parquet.alltypes a group by a.string_col limit 2; {code} [~tarmstrong] - Do you have an idea why the query on a much smaller dataset wouldn't crash? 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
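[Editor's note] The ordering bug described in IMPALA-8140 above (the row stream is closed early when the limit is hit, then `CleanupHashTbl()` reads tuples backed by the released pages) can be illustrated with a small Python analogue. This is purely illustrative; Impala's `BufferedTupleStream` and ASAN poisoning work differently, and all names here are invented:

```python
# Python analogue of the use-after-poison ordering bug: Close() releases
# ("poisons") the memory backing the rows, but a later cleanup pass still
# tries to read a tuple out of the stream.

class RowStream:
    def __init__(self, rows):
        self._rows = rows
        self._closed = False

    def close(self):
        # Analogous to closing the aggregated row stream: the backing
        # pages are released and must no longer be touched.
        self._closed = True
        self._rows = None

    def read(self, i):
        if self._closed:
            raise RuntimeError("use-after-poison: stream already closed")
        return self._rows[i]

stream = RowStream(["tuple0", "tuple1"])
stream.close()          # limit reached: stream closed early
try:
    stream.read(0)      # cleanup pass touching the released rows
    crashed = False
except RuntimeError:
    crashed = True
print(crashed)
```

The fix direction implied by the report is an ordering one: run the cleanup over the hash table's tuples before the stream backing them is closed.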
[jira] [Updated] (IMPALA-8140) Grouping aggregation with limit breaks asan build
[ https://issues.apache.org/jira/browse/IMPALA-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker updated IMPALA-8140: Priority: Blocker (was: Major) > Grouping aggregation with limit breaks asan build > - > > Key: IMPALA-8140 > URL: https://issues.apache.org/jira/browse/IMPALA-8140 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.1.0, Impala 3.2.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Blocker > Labels: asan, crash > > Commit 4af3a7853e9 for IMPALA-7333 breaks the following query on ASAN: > {code:sql} > select count(*) from tpch_parquet.orders o group by o.o_clerk limit 10; > {code} > {noformat} > ==30219==ERROR: AddressSanitizer: use-after-poison on address 0x631000c4569c > at pc 0x020163cc bp 0x7f73a12a5700 sp 0x7f73a12a56f8 > READ of size 1 at 0x631000c4569c thread T276 > #0 0x20163cb in impala::Tuple::IsNull(impala::NullIndicatorOffset const&) > const /tmp/be/src/runtime/tuple.h:241:13 > #1 0x280c3d1 in > impala::AggFnEvaluator::SerializeOrFinalize(impala::Tuple*, > impala::SlotDescriptor const&, impala::Tuple*, void*) > /tmp/be/src/exprs/agg-fn-evaluator.cc:393:29 > #2 0x2777bc8 in > impala::AggFnEvaluator::Finalize(std::vector std::allocator > const&, impala::Tuple*, > impala::Tuple*) /tmp/be/src/exprs/agg-fn-evaluator.h:307:15 > #3 0x27add96 in > impala::GroupingAggregator::CleanupHashTbl(std::vector std::allocator > const&, > impala::HashTable::Iterator) /tmp/be/src/exec/grouping-aggregator.cc:351:7 > #4 0x27ae2b2 in impala::GroupingAggregator::ClosePartitions() > /tmp/be/src/exec/grouping-aggregator.cc:930:5 > #5 0x27ae5f4 in impala::GroupingAggregator::Close(impala::RuntimeState*) > /tmp/be/src/exec/grouping-aggregator.cc:383:3 > #6 0x27637f7 in impala::AggregationNode::Close(impala::RuntimeState*) > /tmp/be/src/exec/aggregation-node.cc:139:32 > #7 0x206b7e9 in impala::FragmentInstanceState::Close() > /tmp/be/src/runtime/fragment-instance-state.cc:368:42 > #8 
0x2066b1a in impala::FragmentInstanceState::Exec() > /tmp/be/src/runtime/fragment-instance-state.cc:99:3 > #9 0x2080e12 in > impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) > /tmp/be/src/runtime/query-state.cc:584:24 > #10 0x1d79036 in boost::function0::operator()() const > /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:766:14 > #11 0x24bbe06 in impala::Thread::SuperviseThread(std::string const&, > std::string const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) > /tmp/be/src/util/thread.cc:359:3 > #12 0x24c72f8 in void boost::_bi::list5, > boost::_bi::value, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> > >::operator() boost::function, impala::ThreadDebugInfo const*, > impala::Promise*), > boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, > std::string const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*), boost::_bi::list0&, int) > /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525:9 > #13 0x24c714b in boost::_bi::bind_t std::string const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*), > boost::_bi::list5, > boost::_bi::value, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> > > >::operator()() > /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20:16 > #14 0x3c83949 in thread_proxy > (/home/lv/i4/be/build/debug/service/impalad+0x3c83949) > #15 0x7f768ce73183 in start_thread > /build/eglibc-ripdx6/eglibc-2.19/nptl/pthread_create.c:312 > #16 0x7f768c98a03c in clone > /build/eglibc-ripdx6/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:111 > {noformat} > The problem seems to be that we call > {{output_partition_->aggregated_row_stream->Close()}} in > be/src/exec/grouping-aggregator.cc:284 when hitting the limit, and then later > the tuple creation in {{CleanupHashTbl()}} in > be/src/exec/grouping-aggregator.cc:341 reads from poisoned 
memory. > A similar query does not show the crash: > {code:sql} > select count(*) from functional_parquet.alltypes a group by a.string_col > limit 2; > {code} > [~tarmstrong] - Do you have an idea why the query on a much smaller dataset > wouldn't crash? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8140) Grouping aggregation with limit breaks asan build
[ https://issues.apache.org/jira/browse/IMPALA-8140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755398#comment-16755398 ] Lars Volker commented on IMPALA-8140: - Making this a P1 since IMPALA-7731 added a query that exposes this problem and our ASAN tests are currently broken. > Grouping aggregation with limit breaks asan build > - > > Key: IMPALA-8140 > URL: https://issues.apache.org/jira/browse/IMPALA-8140 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.1.0, Impala 3.2.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Blocker > Labels: asan, crash > > Commit 4af3a7853e9 for IMPALA-7333 breaks the following query on ASAN: > {code:sql} > select count(*) from tpch_parquet.orders o group by o.o_clerk limit 10; > {code} > {noformat} > ==30219==ERROR: AddressSanitizer: use-after-poison on address 0x631000c4569c > at pc 0x020163cc bp 0x7f73a12a5700 sp 0x7f73a12a56f8 > READ of size 1 at 0x631000c4569c thread T276 > #0 0x20163cb in impala::Tuple::IsNull(impala::NullIndicatorOffset const&) > const /tmp/be/src/runtime/tuple.h:241:13 > #1 0x280c3d1 in > impala::AggFnEvaluator::SerializeOrFinalize(impala::Tuple*, > impala::SlotDescriptor const&, impala::Tuple*, void*) > /tmp/be/src/exprs/agg-fn-evaluator.cc:393:29 > #2 0x2777bc8 in > impala::AggFnEvaluator::Finalize(std::vector std::allocator > const&, impala::Tuple*, > impala::Tuple*) /tmp/be/src/exprs/agg-fn-evaluator.h:307:15 > #3 0x27add96 in > impala::GroupingAggregator::CleanupHashTbl(std::vector std::allocator > const&, > impala::HashTable::Iterator) /tmp/be/src/exec/grouping-aggregator.cc:351:7 > #4 0x27ae2b2 in impala::GroupingAggregator::ClosePartitions() > /tmp/be/src/exec/grouping-aggregator.cc:930:5 > #5 0x27ae5f4 in impala::GroupingAggregator::Close(impala::RuntimeState*) > /tmp/be/src/exec/grouping-aggregator.cc:383:3 > #6 0x27637f7 in impala::AggregationNode::Close(impala::RuntimeState*) > 
/tmp/be/src/exec/aggregation-node.cc:139:32 > #7 0x206b7e9 in impala::FragmentInstanceState::Close() > /tmp/be/src/runtime/fragment-instance-state.cc:368:42 > #8 0x2066b1a in impala::FragmentInstanceState::Exec() > /tmp/be/src/runtime/fragment-instance-state.cc:99:3 > #9 0x2080e12 in > impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) > /tmp/be/src/runtime/query-state.cc:584:24 > #10 0x1d79036 in boost::function0::operator()() const > /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:766:14 > #11 0x24bbe06 in impala::Thread::SuperviseThread(std::string const&, > std::string const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) > /tmp/be/src/util/thread.cc:359:3 > #12 0x24c72f8 in void boost::_bi::list5, > boost::_bi::value, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> > >::operator() boost::function, impala::ThreadDebugInfo const*, > impala::Promise*), > boost::_bi::list0>(boost::_bi::type, void (*&)(std::string const&, > std::string const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*), boost::_bi::list0&, int) > /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525:9 > #13 0x24c714b in boost::_bi::bind_t std::string const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*), > boost::_bi::list5, > boost::_bi::value, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> > > >::operator()() > /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20:16 > #14 0x3c83949 in thread_proxy > (/home/lv/i4/be/build/debug/service/impalad+0x3c83949) > #15 0x7f768ce73183 in start_thread > /build/eglibc-ripdx6/eglibc-2.19/nptl/pthread_create.c:312 > #16 0x7f768c98a03c in clone > /build/eglibc-ripdx6/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:111 > {noformat} > The problem seems to be that we call > {{output_partition_->aggregated_row_stream->Close()}} in > 
be/src/exec/grouping-aggregator.cc:284 when hitting the limit, and then later > the tuple creation in {{CleanupHashTbl()}} in > be/src/exec/grouping-aggregator.cc:341 reads from poisoned memory. > A similar query does not show the crash: > {code:sql} > select count(*) from functional_parquet.alltypes a group by a.string_col > limit 2; > {code} > [~tarmstrong] - Do you have an idea why the query on a much smaller dataset > wouldn't crash? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8134) Update docs to reflect CGroups memory limit changes
[ https://issues.apache.org/jira/browse/IMPALA-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni updated IMPALA-8134: Description: https://gerrit.cloudera.org/#/c/12293/ > Update docs to reflect CGroups memory limit changes > --- > > Key: IMPALA-8134 > URL: https://issues.apache.org/jira/browse/IMPALA-8134 > Project: IMPALA > Issue Type: Sub-task > Components: Docs >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Labels: future_release_doc, in_32 > > https://gerrit.cloudera.org/#/c/12293/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8141) ASAN build failure in query_test/test_mem_usage_scaling.py
Paul Rogers created IMPALA-8141: --- Summary: ASAN build failure in query_test/test_mem_usage_scaling.py Key: IMPALA-8141 URL: https://issues.apache.org/jira/browse/IMPALA-8141 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: Paul Rogers Assignee: Lenisha Gandhi Fix For: Impala 3.1.0 From the build log: {noformat} 23:18:34 query_test/test_mem_usage_scaling.py:202: in test_low_mem_limit_q16 23:18:34 self.low_memory_limit_test(vector, 'tpch-q16', self.MIN_MEM_FOR_TPCH['Q16']) 23:18:34 query_test/test_mem_usage_scaling.py:116: in low_memory_limit_test 23:18:34 self.run_test_case(tpch_query, new_vector) 23:18:34 common/impala_test_suite.py:472: in run_test_case 23:18:34 result = self.__execute_query(target_impalad_client, query, user=user) 23:18:34 common/impala_test_suite.py:699: in __execute_query 23:18:34 return impalad_client.execute(query, user=user) 23:18:34 common/impala_connection.py:174: in execute 23:18:34 return self.__beeswax_client.execute(sql_stmt, user=user) 23:18:34 beeswax/impala_beeswax.py:182: in execute 23:18:34 handle = self.__execute_query(query_string.strip(), user=user) 23:18:34 beeswax/impala_beeswax.py:359: in __execute_query 23:18:34 self.wait_for_finished(handle) 23:18:34 beeswax/impala_beeswax.py:372: in wait_for_finished 23:18:34 query_state = self.get_state(query_handle) 23:18:34 beeswax/impala_beeswax.py:427: in get_state 23:18:34 return self.__do_rpc(lambda: self.imp_service.get_state(query_handle)) 23:18:34 beeswax/impala_beeswax.py:516: in __do_rpc 23:18:34 raise ImpalaBeeswaxException(self.__build_error_message(u), u) 23:18:34 E ImpalaBeeswaxException: ImpalaBeeswaxException: 23:18:34 EINNER EXCEPTION: 23:18:34 EMESSAGE: [Errno 104] Connection reset by peer {noformat} From {{impalad.ERROR}}: {noformat} ==119152==ERROR: AddressSanitizer: use-after-poison on address 0x63100b771168 at pc 0x02013e2c bp 0x7f1f7865b640 sp 0x7f1f7865b638 READ of size 1 at 0x63100b771168 thread T59694 #0 
0x2013e2b in impala::Tuple::IsNull(impala::NullIndicatorOffset const&) const /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/tuple.h:241:13 #1 0x2809df1 in impala::AggFnEvaluator::SerializeOrFinalize(impala::Tuple*, impala::SlotDescriptor const&, impala::Tuple*, void*) /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exprs/agg-fn-evaluator.cc:393:29 #2 0x27755e8 in impala::AggFnEvaluator::Finalize(std::vector > const&, impala::Tuple*, impala::Tuple*) /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exprs/agg-fn-evaluator.h:307:15 #3 0x27ab7b6 in impala::GroupingAggregator::CleanupHashTbl(std::vector > const&, impala::HashTable::Iterator) /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:351:7 #4 0x27abcd2 in impala::GroupingAggregator::ClosePartitions() /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:930:5 #5 0x27ac014 in impala::GroupingAggregator::Close(impala::RuntimeState*) /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:383:3 #6 0x2761217 in impala::AggregationNode::Close(impala::RuntimeState*) /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/aggregation-node.cc:139:32 #7 0x2069249 in impala::FragmentInstanceState::Close() /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/fragment-instance-state.cc:368:42 #8 0x206457a in impala::FragmentInstanceState::Exec() /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/fragment-instance-state.cc:99:3 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8141) ASAN build failure in query_test/test_mem_usage_scaling.py
[ https://issues.apache.org/jira/browse/IMPALA-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker resolved IMPALA-8141. - Resolution: Duplicate > ASAN build failure in query_test/test_mem_usage_scaling.py > -- > > Key: IMPALA-8141 > URL: https://issues.apache.org/jira/browse/IMPALA-8141 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Assignee: Lenisha Gandhi >Priority: Blocker > Labels: asan, build-failure > Fix For: Impala 3.1.0 > > > From the build log: > {noformat} > 23:18:34 query_test/test_mem_usage_scaling.py:202: in test_low_mem_limit_q16 > 23:18:34 self.low_memory_limit_test(vector, 'tpch-q16', > self.MIN_MEM_FOR_TPCH['Q16']) > 23:18:34 query_test/test_mem_usage_scaling.py:116: in low_memory_limit_test > 23:18:34 self.run_test_case(tpch_query, new_vector) > 23:18:34 common/impala_test_suite.py:472: in run_test_case > 23:18:34 result = self.__execute_query(target_impalad_client, query, > user=user) > 23:18:34 common/impala_test_suite.py:699: in __execute_query > 23:18:34 return impalad_client.execute(query, user=user) > 23:18:34 common/impala_connection.py:174: in execute > 23:18:34 return self.__beeswax_client.execute(sql_stmt, user=user) > 23:18:34 beeswax/impala_beeswax.py:182: in execute > 23:18:34 handle = self.__execute_query(query_string.strip(), user=user) > 23:18:34 beeswax/impala_beeswax.py:359: in __execute_query > 23:18:34 self.wait_for_finished(handle) > 23:18:34 beeswax/impala_beeswax.py:372: in wait_for_finished > 23:18:34 query_state = self.get_state(query_handle) > 23:18:34 beeswax/impala_beeswax.py:427: in get_state > 23:18:34 return self.__do_rpc(lambda: > self.imp_service.get_state(query_handle)) > 23:18:34 beeswax/impala_beeswax.py:516: in __do_rpc > 23:18:34 raise ImpalaBeeswaxException(self.__build_error_message(u), u) > 23:18:34 E ImpalaBeeswaxException: ImpalaBeeswaxException: > 23:18:34 EINNER EXCEPTION: > 23:18:34 EMESSAGE: [Errno 
104] Connection reset by peer > {noformat} > From {{impalad.ERROR}}: > {noformat} > ==119152==ERROR: AddressSanitizer: use-after-poison on address 0x63100b771168 > at pc 0x02013e2c bp 0x7f1f7865b640 sp 0x7f1f7865b638 > READ of size 1 at 0x63100b771168 thread T59694 > #0 0x2013e2b in impala::Tuple::IsNull(impala::NullIndicatorOffset const&) > const > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/tuple.h:241:13 > #1 0x2809df1 in > impala::AggFnEvaluator::SerializeOrFinalize(impala::Tuple*, > impala::SlotDescriptor const&, impala::Tuple*, void*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exprs/agg-fn-evaluator.cc:393:29 > #2 0x27755e8 in > impala::AggFnEvaluator::Finalize(std::vector std::allocator > const&, impala::Tuple*, > impala::Tuple*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exprs/agg-fn-evaluator.h:307:15 > #3 0x27ab7b6 in > impala::GroupingAggregator::CleanupHashTbl(std::vector std::allocator > const&, > impala::HashTable::Iterator) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:351:7 > #4 0x27abcd2 in impala::GroupingAggregator::ClosePartitions() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:930:5 > #5 0x27ac014 in impala::GroupingAggregator::Close(impala::RuntimeState*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:383:3 > #6 0x2761217 in impala::AggregationNode::Close(impala::RuntimeState*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/aggregation-node.cc:139:32 > #7 0x2069249 in impala::FragmentInstanceState::Close() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/fragment-instance-state.cc:368:42 > #8 0x206457a in impala::FragmentInstanceState::Exec() > 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/fragment-instance-state.cc:99:3 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8141) ASAN build failure in query_test/test_mem_usage_scaling.py
[ https://issues.apache.org/jira/browse/IMPALA-8141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755456#comment-16755456 ] Lars Volker commented on IMPALA-8141: - The stack trace looks like another duplicate of what we're now tracking in IMPALA-8140. Let's stop opening new Jiras for Asan failures with the same stack trace until that one is fixed. > ASAN build failure in query_test/test_mem_usage_scaling.py > -- > > Key: IMPALA-8141 > URL: https://issues.apache.org/jira/browse/IMPALA-8141 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Assignee: Lenisha Gandhi >Priority: Blocker > Labels: asan, build-failure > Fix For: Impala 3.1.0 > > > From the build log: > {noformat} > 23:18:34 query_test/test_mem_usage_scaling.py:202: in test_low_mem_limit_q16 > 23:18:34 self.low_memory_limit_test(vector, 'tpch-q16', > self.MIN_MEM_FOR_TPCH['Q16']) > 23:18:34 query_test/test_mem_usage_scaling.py:116: in low_memory_limit_test > 23:18:34 self.run_test_case(tpch_query, new_vector) > 23:18:34 common/impala_test_suite.py:472: in run_test_case > 23:18:34 result = self.__execute_query(target_impalad_client, query, > user=user) > 23:18:34 common/impala_test_suite.py:699: in __execute_query > 23:18:34 return impalad_client.execute(query, user=user) > 23:18:34 common/impala_connection.py:174: in execute > 23:18:34 return self.__beeswax_client.execute(sql_stmt, user=user) > 23:18:34 beeswax/impala_beeswax.py:182: in execute > 23:18:34 handle = self.__execute_query(query_string.strip(), user=user) > 23:18:34 beeswax/impala_beeswax.py:359: in __execute_query > 23:18:34 self.wait_for_finished(handle) > 23:18:34 beeswax/impala_beeswax.py:372: in wait_for_finished > 23:18:34 query_state = self.get_state(query_handle) > 23:18:34 beeswax/impala_beeswax.py:427: in get_state > 23:18:34 return self.__do_rpc(lambda: > self.imp_service.get_state(query_handle)) > 23:18:34 
beeswax/impala_beeswax.py:516: in __do_rpc > 23:18:34 raise ImpalaBeeswaxException(self.__build_error_message(u), u) > 23:18:34 E ImpalaBeeswaxException: ImpalaBeeswaxException: > 23:18:34 EINNER EXCEPTION: > 23:18:34 EMESSAGE: [Errno 104] Connection reset by peer > {noformat} > From {{impalad.ERROR}}: > {noformat} > ==119152==ERROR: AddressSanitizer: use-after-poison on address 0x63100b771168 > at pc 0x02013e2c bp 0x7f1f7865b640 sp 0x7f1f7865b638 > READ of size 1 at 0x63100b771168 thread T59694 > #0 0x2013e2b in impala::Tuple::IsNull(impala::NullIndicatorOffset const&) > const > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/tuple.h:241:13 > #1 0x2809df1 in > impala::AggFnEvaluator::SerializeOrFinalize(impala::Tuple*, > impala::SlotDescriptor const&, impala::Tuple*, void*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exprs/agg-fn-evaluator.cc:393:29 > #2 0x27755e8 in > impala::AggFnEvaluator::Finalize(std::vector std::allocator > const&, impala::Tuple*, > impala::Tuple*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exprs/agg-fn-evaluator.h:307:15 > #3 0x27ab7b6 in > impala::GroupingAggregator::CleanupHashTbl(std::vector std::allocator > const&, > impala::HashTable::Iterator) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:351:7 > #4 0x27abcd2 in impala::GroupingAggregator::ClosePartitions() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:930:5 > #5 0x27ac014 in impala::GroupingAggregator::Close(impala::RuntimeState*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/grouping-aggregator.cc:383:3 > #6 0x2761217 in impala::AggregationNode::Close(impala::RuntimeState*) > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/exec/aggregation-node.cc:139:32 > #7 0x2069249 in impala::FragmentInstanceState::Close() > 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/fragment-instance-state.cc:368:42 > #8 0x206457a in impala::FragmentInstanceState::Exec() > /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/fragment-instance-state.cc:99:3 > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8136) Investigate NoClassDefFoundError when running maven tests
[ https://issues.apache.org/jira/browse/IMPALA-8136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755471#comment-16755471 ] Quanlong Huang commented on IMPALA-8136: That's really possible! It may be thrown outside the try-catch clause. I encountered this in both the 2.x and master branches. > Investigate NoClassDefFoundError when running maven tests > - > > Key: IMPALA-8136 > URL: https://issues.apache.org/jira/browse/IMPALA-8136 > Project: IMPALA > Issue Type: Task >Reporter: Quanlong Huang >Priority: Major > > I encountered a NoClassDefFoundError when running maven tests. It's resolved > unexpectedly by restarting the mini-cluster. My operations are: > # ./buildall.sh in a clean clone > # kill the processes after backend tests and frontend tests finish > # run FE tests again like > {code:java} > (pushd fe && mvn test -Dtest=AuthorizationTest){code} > Then I encountered the following error: > {code:java} > Running org.apache.impala.analysis.AuthorizationTest > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.008 sec <<< > FAILURE! - in org.apache.impala.analysis.AuthorizationTest > initializationError(org.apache.impala.analysis.AuthorizationTest) Time > elapsed: 0.006 sec <<< ERROR! 
> java.lang.NoClassDefFoundError: Could not initialize class > org.apache.impala.analysis.AuthorizationTest > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.runners.Parameterized.allParameters(Parameterized.java:280) > at org.junit.runners.Parameterized.(Parameterized.java:248) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:104) > at > org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:86) > at > org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59) > at > org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:26) > at > org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:59) > at > org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:33) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) > Results : > Tests in error: > AuthorizationTest.initializationError » NoClassDefFound Could not > initialize c... > Tests run: 1, Failures: 0, Errors: 1, Skipped: 0 > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 19.717 s > [INFO] Finished at: 2019-01-29T03:38:16-08:00 > [INFO] Final Memory: 116M/2090M > [INFO] > > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.18:test (default-test) on > project impala-frontend: There are test failures. > [ERROR] > [ERROR] Please refer to /tmp/jenkins/workspace/impala-hulu/logs/fe_tests for > the individual test results. > [ERROR] -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run
[jira] [Work started] (IMPALA-7265) Cache remote file handles
[ https://issues.apache.org/jira/browse/IMPALA-7265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-7265 started by Joe McDonnell. - > Cache remote file handles > - > > Key: IMPALA-7265 > URL: https://issues.apache.org/jira/browse/IMPALA-7265 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 3.1.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Critical > > The file handle cache currently does not allow caching remote file handles. > This means that clusters that have a lot of remote reads can suffer from > overloading the NameNode. Impala should be able to cache remote file handles. > There are some open questions about remote file handles and whether they > behave differently from local file handles. In particular: > # Is there any resource constraint on the number of remote file handles > open? (e.g. do they maintain a network connection?) > # Are there any semantic differences in how remote file handles behave when > files are deleted, overwritten, or appended? > # Are there any extra failure cases for remote file handles? (i.e. if a > machine goes down or a remote file handle is left open for an extended period > of time) > The form of caching will depend on the answers, but at the very least, it > should be possible to cache a remote file handle at the level of a query so > that a Parquet file with multiple columns can share file handles. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
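[Editor's note] The per-query caching proposed at the end of IMPALA-7265 above can be sketched as below. This is a hypothetical illustration, not Impala's file handle cache: keying on (path, mtime) addresses the overwrite question raised in the issue (a rewritten file must not reuse a stale handle), and sharing one handle across the column readers of a Parquet file cuts NameNode round trips:

```python
import io

class QueryFileHandleCache:
    """Hypothetical per-query cache: multiple column readers of the same
    remote file share one open handle, keyed on (path, mtime) so that an
    overwritten file gets a fresh handle instead of a stale one."""

    def __init__(self, open_fn):
        self._open_fn = open_fn   # e.g. the HDFS open call
        self._handles = {}
        self.opens = 0            # proxy for NameNode round trips

    def get(self, path, mtime):
        key = (path, mtime)
        if key not in self._handles:
            self._handles[key] = self._open_fn(path)
            self.opens += 1
        return self._handles[key]

    def close_all(self):
        # Called when the query finishes, bounding how long remote
        # handles (and any resources they pin) stay open.
        for h in self._handles.values():
            h.close()
        self._handles.clear()

# Stand-in for a remote filesystem open; a real client would talk to HDFS.
cache = QueryFileHandleCache(lambda path: io.BytesIO(b"parquet"))
h1 = cache.get("/warehouse/t/part-0.parq", 1000)
h2 = cache.get("/warehouse/t/part-0.parq", 1000)  # second column reader
print(h1 is h2, cache.opens)
cache.close_all()
```

Scoping the cache to a query, rather than the process, sidesteps the open questions in the issue about long-lived remote handles surviving deletes, appends, or node failures.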
[jira] [Resolved] (IMPALA-4003) DDL statements taking too long
[ https://issues.apache.org/jira/browse/IMPALA-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bharath v resolved IMPALA-4003. --- Resolution: Cannot Reproduce Not much actionable information. > DDL statements taking too long > -- > > Key: IMPALA-4003 > URL: https://issues.apache.org/jira/browse/IMPALA-4003 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 2.6.0 >Reporter: Dimitris Tsirogiannis >Priority: Minor > Labels: catalog-server, performance > > In some cases, DDL statements are taking too long to complete. For example, > in the following sequence: > {code} > create table foo(a int); > drop table foo; > {code} > the DROP statement takes more than 2-3 seconds to complete even though the > table is empty and, consequently, there isn't much to load, in terms of > metadata.
[jira] [Resolved] (IMPALA-3681) Automatic invalidation of HMS metadata using Hive replication API (HIVE-7973)
[ https://issues.apache.org/jira/browse/IMPALA-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bharath v resolved IMPALA-3681. --- Resolution: Duplicate > Automatic invalidation of HMS metadata using Hive replication API (HIVE-7973) > - > > Key: IMPALA-3681 > URL: https://issues.apache.org/jira/browse/IMPALA-3681 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog >Affects Versions: Impala 2.6.0 >Reporter: Dimitris Tsirogiannis >Priority: Minor > Labels: catalog-server > > Hive is adding support for a replication API > (https://issues.apache.org/jira/browse/HIVE-7973) that allows an HMS client > to retrieve a global stream of HMS modifications made by all the HMS > processes (in case of HMS HA). The Catalog server could use that API to > automatically collect and update cached HMS metadata, thus eliminating the > need for the user to call invalidate and refresh. This can only handle HMS > metadata; file and block metadata will still need to be updated from the > Namenode (e.g. by using inotify).
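The consumption side the issue describes is essentially a poll loop that tracks the last applied event id. A minimal sketch, with `Event` and `fetchSince` as hypothetical stand-ins for the real HMS replication API and catalog cache:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative poller: pulls HMS modification events newer than the last
// one applied and reports which cached tables are now stale.
public class NotificationPoller {
    // Hypothetical stand-in for an HMS notification event.
    public static class Event {
        public final long id;
        public final String tableName;
        public Event(long id, String tableName) { this.id = id; this.tableName = tableName; }
    }

    private long lastSeenEventId;

    public NotificationPoller(long startId) { this.lastSeenEventId = startId; }

    // One polling round: fetch events after lastSeenEventId and return the
    // table names whose cached metadata should be refreshed automatically.
    public List<String> poll(Function<Long, List<Event>> fetchSince) {
        List<String> staleTables = new ArrayList<>();
        for (Event e : fetchSince.apply(lastSeenEventId)) {
            staleTables.add(e.tableName);
            lastSeenEventId = Math.max(lastSeenEventId, e.id);
        }
        return staleTables;
    }

    public long lastSeenEventId() { return lastSeenEventId; }
}
```

Persisting `lastSeenEventId` would let the catalog server resume the stream after a restart instead of re-invalidating everything.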
[jira] [Commented] (IMPALA-7034) Increase scalability of metadata handling
[ https://issues.apache.org/jira/browse/IMPALA-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755488#comment-16755488 ] bharath v commented on IMPALA-7034: --- [~thundergun] This should be substantially improved with IMPALA-7127 and IMPALA-2649. Can we resolve this? > Increase scalability of metadata handling > - > > Key: IMPALA-7034 > URL: https://issues.apache.org/jira/browse/IMPALA-7034 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Affects Versions: Impala 2.13.0 >Reporter: Vincent Tran >Priority: Major > > Currently the practical limit for catalog topic update is in the neighborhood > of 4GB - the fundamental limit of max thrift message size. This is an > architectural limitation and not a resource limitation. > Larger enterprise clusters with high file counts can easily surpass this with > normal usage. > The high level ask here is for a more scalable implementation for metadata > handling. The amount of metadata that a cluster can handle should be > proportional to the amount of hardware resource that a user is willing to > allocate to it.
[jira] [Resolved] (IMPALA-3605) Frontend OOM during catalog update may end up infinite catalog update loop
[ https://issues.apache.org/jira/browse/IMPALA-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bharath v resolved IMPALA-3605. --- Resolution: Fixed Not much actionable information. Some significant improvements have landed in this area since this jira was created. Resolving. > Frontend OOM during catalog update may end up infinite catalog update loop > -- > > Key: IMPALA-3605 > URL: https://issues.apache.org/jira/browse/IMPALA-3605 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 2.5.0 >Reporter: Huaisi Xu >Priority: Minor > Labels: catalog-server > > This ends up in an infinite loop. We also need to take care of a lot of other things.
[jira] [Resolved] (IMPALA-3547) Bound the size of Impalad catalog cache
[ https://issues.apache.org/jira/browse/IMPALA-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bharath v resolved IMPALA-3547. --- Resolution: Duplicate This is resolved via IMPALA-7127 (and its subtasks). > Bound the size of Impalad catalog cache > --- > > Key: IMPALA-3547 > URL: https://issues.apache.org/jira/browse/IMPALA-3547 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Affects Versions: Impala 2.5.0 >Reporter: Dimitris Tsirogiannis >Priority: Minor > Labels: catalog-server, usability > > Currently the catalog cache of every Impalad holds a replica of the entire > catalog. In certain scenarios (e.g. large metadata), the local catalog cache > is using memory that could be used more effectively elsewhere. We should > consider changing the impalad catalog cache into a proper cache with a > configurable size limit and an invalidation mechanism.
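A "proper cache with a configurable size limit" of the kind proposed above can be sketched in a few lines using Java's LinkedHashMap LRU hook. This is a minimal illustration, not the actual impalad cache (IMPALA-7127's implementation differs):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Size-bounded cache with least-recently-used eviction: once the entry
// count exceeds maxEntries, the entry touched longest ago is dropped.
public class BoundedCatalogCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCatalogCache(int maxEntries) {
        super(16, 0.75f, /*accessOrder=*/true);  // iterate in LRU order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;  // evict the least-recently-used entry
    }
}
```

Keys could be table names and values their cached metadata; a miss would then trigger a fetch from the catalog server, which doubles as the invalidation path.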
[jira] [Created] (IMPALA-8143) Add features to DoRpcWithRetry()
Andrew Sherman created IMPALA-8143: -- Summary: Add features to DoRpcWithRetry() Key: IMPALA-8143 URL: https://issues.apache.org/jira/browse/IMPALA-8143 Project: IMPALA Issue Type: Task Reporter: Andrew Sherman Assignee: Andrew Sherman DoRpcWithRetry() is a templated utility function that is currently in control-service.h that is used to retry synchronous Krpc calls. It makes a call to a Krpc function that is passed as a lambda function. It sets the krpc timeout to the ‘krpc_timeout‘ parameter and calls the Krpc function a number of times controlled by the ‘times_to_try’ parameter. Possible improvements: * Move code to rpc-mgr.inline.h * Add a configurable sleep if RpcMgr::IsServerTooBusy() says the remote server’s queue is full. * Make QueryState::ReportExecStatus() use DoRpcWithRetry() * Consider if asynchronous code like that in KrpcDataStreamSender::Channel can also use DoRpcWithRetry()
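The control flow DoRpcWithRetry() implements, plus the proposed sleep between attempts, can be sketched generically. The real function is a C++ template in control-service.h; this Java version with illustrative names only shows the retry logic:

```java
import java.util.concurrent.Callable;

public class RpcRetry {
    // Calls 'rpc' up to 'timesToTry' times, sleeping 'sleepMs' between
    // failed attempts (the configurable sleep proposed in the jira).
    public static <T> T callWithRetry(Callable<T> rpc, int timesToTry, long sleepMs) {
        Exception last = null;
        for (int attempt = 1; attempt <= timesToTry; ++attempt) {
            try {
                return rpc.call();  // each attempt would run under its own krpc timeout
            } catch (Exception e) {
                last = e;  // e.g. the remote server's queue was full
                if (attempt < timesToTry) {
                    try {
                        Thread.sleep(sleepMs);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
        throw new RuntimeException("RPC failed after " + timesToTry + " tries", last);
    }
}
```

A caller would pass the RPC as a lambda, mirroring how the C++ template accepts its lambda argument.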
[jira] [Created] (IMPALA-8142) ASAN build failure in query_test/test_nested_types.py
Paul Rogers created IMPALA-8142: --- Summary: ASAN build failure in query_test/test_nested_types.py Key: IMPALA-8142 URL: https://issues.apache.org/jira/browse/IMPALA-8142 Project: IMPALA Issue Type: Bug Affects Versions: Impala 3.1.0 Reporter: Paul Rogers Assignee: Lenisha Gandhi Fix For: Impala 3.1.0 From the build log: {noformat} 05:23:33 === FAILURES === 05:23:33 TestNestedTypes.test_subplan[protocol: beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] 05:23:33 [gw7] linux2 -- Python 2.7.5 /data/jenkins/workspace/impala-cdh6.x-core-asan/repos/Impala/bin/../infra/python/env/bin/python 05:23:33 query_test/test_nested_types.py:77: in test_subplan 05:23:33 self.run_test_case('QueryTest/nested-types-subplan', vector) 05:23:33 common/impala_test_suite.py:472: in run_test_case 05:23:33 result = self.__execute_query(target_impalad_client, query, user=user) 05:23:33 common/impala_test_suite.py:699: in __execute_query 05:23:33 return impalad_client.execute(query, user=user) 05:23:33 common/impala_connection.py:174: in execute 05:23:33 return self.__beeswax_client.execute(sql_stmt, user=user) 05:23:33 beeswax/impala_beeswax.py:200: in execute 05:23:33 result = self.fetch_results(query_string, handle) 05:23:33 beeswax/impala_beeswax.py:445: in fetch_results 05:23:33 exec_result = self.__fetch_results(query_handle, max_rows) 05:23:33 beeswax/impala_beeswax.py:456: in __fetch_results 05:23:33 results = self.__do_rpc(lambda: self.imp_service.fetch(handle, False, fetch_rows)) 05:23:33 beeswax/impala_beeswax.py:512: in __do_rpc 05:23:33 raise ImpalaBeeswaxException(self.__build_error_message(e), e) 05:23:33 E ImpalaBeeswaxException: ImpalaBeeswaxException: 05:23:33 EINNER EXCEPTION: 05:23:33 EMESSAGE: TSocket read 0 bytes {noformat} From {{impalad.ERROR}}: {noformat} SUMMARY: AddressSanitizer: use-after-poison 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/runtime/tuple.h:241:13 in impala::Tuple::IsNull(impala::NullIndicatorOffset const&) const ... ==119152==ABORTING {noformat}
[jira] [Commented] (IMPALA-8143) Add features to DoRpcWithRetry()
[ https://issues.apache.org/jira/browse/IMPALA-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16755529#comment-16755529 ] Andrew Sherman commented on IMPALA-8143: cc: [~kwho] [~twmarshall] > Add features to DoRpcWithRetry() > > > Key: IMPALA-8143 > URL: https://issues.apache.org/jira/browse/IMPALA-8143 > Project: IMPALA > Issue Type: Task >Reporter: Andrew Sherman >Assignee: Andrew Sherman >Priority: Major > > DoRpcWithRetry() is a templated utility function that is currently in > control-service.h that is used to retry synchronous Krpc calls. It makes a > call to a Krpc function that is passed as a lambda function. It sets the > krpc timeout to the ‘krpc_timeout‘ parameter and calls the Krpc function a > number of times controlled by the ‘times_to_try’ parameter. > Possible improvements: > * Move code to rpc-mgr.inline.h > * Add a configurable sleep if RpcMgr::IsServerTooBusy() says the remote > server’s queue is full. > * Make QueryState::ReportExecStatus() use DoRpcWithRetry() > * Consider if asynchronous code like that in KrpcDataStreamSender::Channel > can also use DoRpcWithRetry()
[jira] [Closed] (IMPALA-8111) Document workaround for some authentication issues with KRPC
[ https://issues.apache.org/jira/browse/IMPALA-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni closed IMPALA-8111. --- Resolution: Fixed Fix Version/s: Impala 3.2.0 > Document workaround for some authentication issues with KRPC > > > Key: IMPALA-8111 > URL: https://issues.apache.org/jira/browse/IMPALA-8111 > Project: IMPALA > Issue Type: Task > Components: Docs >Affects Versions: Impala 2.12.0, Impala 3.1.0 >Reporter: Michael Ho >Assignee: Alex Rodoni >Priority: Major > Labels: future_release_doc, in_32 > Fix For: Impala 3.2.0 > > > There have been complaints from users about not being able to use Impala > after upgrading to an Impala version with KRPC enabled due to authentication > issues. Please document them in the known issues or best practice guide. > 1. https://issues.apache.org/jira/browse/IMPALA-7585: > *Symptoms*: When using Impala with LDAP enabled, a user may hit the > following: > {noformat} > Not authorized: Client connection negotiation failed: client connection to > 127.0.0.1:27000: SASL(-1): generic failure: All-whitespace username. > {noformat} > *Root cause*: The following sequence can lead to the user "impala" not being > created in /etc/passwd. > {quote}time 1: no impala in LDAP; things get installed; impala created in > /etc/passwd > time 2: impala added to LDAP > time 3: new machine added > {quote} > *Workaround*: > - Manually edit /etc/passwd to add the impala user > - Upgrade to a version of Impala with the patch IMPALA-7585 > 2. https://issues.apache.org/jira/browse/IMPALA-7298 > *Symptoms*: When running with Kerberos enabled, a user may hit the following > error: > {noformat} > WARNINGS: TransmitData() to X.X.X.X:27000 failed: Not authorized: Client > connection negotiation failed: client connection to X.X.X.X:27000: Server > impala/x.x@vpc.cloudera.com not found in Kerberos database > {noformat} > *Root cause*: > KrpcDataStreamSender passes a resolved IP address when creating a proxy. 
> Instead, we should pass both the resolved address and the hostname when > creating the proxy so that we won't end up using the IP address as the > hostname in the Kerberos principal. > *Workaround*: > - Set rdns=true in /etc/krb5.conf > - Upgrade to a version of Impala with the fix of IMPALA-7298 > 3. https://issues.apache.org/jira/browse/KUDU-2198 > *Symptoms*: When running with Kerberos enabled, a user may hit the following > error message where the username is some random string which doesn't match > the primary in the Kerberos principal > {noformat} > WARNINGS: TransmitData() to X.X.X.X:27000 failed: Remote error: Not > authorized: {username='', principal='impala/redacted'} is not > allowed to access DataStreamService > {noformat} > *Root cause*: > Due to system "auth_to_local" mapping, the principal may be mapped to some > local name. > *Workaround*: > - Start Impala with the flag {{--use_system_auth_to_local=false}}
[jira] [Closed] (IMPALA-8134) Update docs to reflect CGroups memory limit changes
[ https://issues.apache.org/jira/browse/IMPALA-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni closed IMPALA-8134. --- Resolution: Fixed Fix Version/s: Impala 3.2.0 > Update docs to reflect CGroups memory limit changes > --- > > Key: IMPALA-8134 > URL: https://issues.apache.org/jira/browse/IMPALA-8134 > Project: IMPALA > Issue Type: Sub-task > Components: Docs >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Labels: future_release_doc, in_32 > Fix For: Impala 3.2.0 > > > https://gerrit.cloudera.org/#/c/12293/
[jira] [Created] (IMPALA-8144) Build failed during C++ build
Paul Rogers created IMPALA-8144: --- Summary: Build failed during C++ build Key: IMPALA-8144 URL: https://issues.apache.org/jira/browse/IMPALA-8144 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: Paul Rogers Assignee: Lenisha Gandhi Fix For: Impala 3.1.0 Latest master build failed during BE compilation: {noformat} 14:15:58 [ 63%] Built target Exec 14:15:58 make: *** [all] Error 2 14:15:58 ERROR in /data/jenkins/workspace/impala-asf-master-core/repos/Impala/bin/make_impala.sh at line 180: ${MAKE_CMD} ${MAKE_ARGS} 14:15:58 Generated: /data/jenkins/workspace/impala-asf-master-core/repos/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.make_impala.20190129_22_15_58.xml {noformat} The log file mentioned above has exactly the same output, so not super useful. The Maven log reported that the Java build succeeded. No other logs are available in the build download tar file.
[jira] [Updated] (IMPALA-8144) Build failed during C++ build
[ https://issues.apache.org/jira/browse/IMPALA-8144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers updated IMPALA-8144: Labels: build-failure (was: ) > Build failed during C++ build > - > > Key: IMPALA-8144 > URL: https://issues.apache.org/jira/browse/IMPALA-8144 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.1.0 >Reporter: Paul Rogers >Assignee: Lenisha Gandhi >Priority: Blocker > Labels: build-failure > Fix For: Impala 3.1.0 > > > Latest master build failed during BE compilation: > {noformat} > 14:15:58 [ 63%] Built target Exec > 14:15:58 make: *** [all] Error 2 > 14:15:58 ERROR in > /data/jenkins/workspace/impala-asf-master-core/repos/Impala/bin/make_impala.sh > at line 180: ${MAKE_CMD} ${MAKE_ARGS} > 14:15:58 Generated: > /data/jenkins/workspace/impala-asf-master-core/repos/Impala/logs/extra_junit_xml_logs/generate_junitxml.buildall.make_impala.20190129_22_15_58.xml > {noformat} > The log file mentioned above has exactly the same output, so not super useful. > The Maven log reported that the Java build succeeded. No other logs are > available in the build download tar file.
[jira] [Created] (IMPALA-8145) Partition metadata key muddle
Paul Rogers created IMPALA-8145: --- Summary: Partition metadata key muddle Key: IMPALA-8145 URL: https://issues.apache.org/jira/browse/IMPALA-8145 Project: IMPALA Issue Type: Bug Components: Frontend Affects Versions: Impala 3.1.0 Reporter: Paul Rogers Impala stores metadata, including metadata about HDFS partitions. Partitions are defined as a collection of keys expressed as {{(column, value)}} pairs. For example, "year=2018/month=1". The columns are defined in HMS with a name and type. Values are defined as part of the partition definition. Impala performs partition pruning. This means that a query that says {{WHERE month=2}} will omit the above partition, but will scan one for "year=2018/month=2". To perform the pruning, the value of the partition key must be converted from text (as used to define the directory) to the same type as the column, say TINYINT here. Conversion is done in the catalog server when loading a partition. Given the type of the column, the catalog parses the string value of the key, in this case into a NumericLiteral of type TINYINT. The resulting object is then converted into a Thrift TExpr node, sent over the network to the Coordinator, where it is deserialized back into a NumericLiteral. All of this works fine for String and integer keys. It fails, however, for float and double keys. (Let's set aside the fact that partitioning on floating point numbers is a very bad idea for a number of reasons. Impala supports this bad choice. Our job here is just to deal with that decision.) NumericLiteral stores its value as a Java BigDecimal. BigDecimal stores values in decimal and so can easily represent, say, 0.1, if that is the partition key. Unfortunately, floating point numbers are binary, and cannot accurately represent anything other than a sum of binary fractions. The value 0.1 is a repeating fraction in binary. Because of magic I don't fully understand, storing 0.1 as a double will render 0.1 when printed. 
This is because Java's {{Double.toString}} produces the shortest decimal string that round-trips back to the same double. But, when the process above occurs, upon deserialization from Thrift, the double value is converted to a BigDecimal. The result is the value 0.1000000000000000055511151231257827021181583404541015625. That is, BigDecimal is more precise than double, and can represent (in decimal) the sum of binary fractions used to approximate 0.1. This issue is fully described in the [BigDecimal javadoc|https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html#BigDecimal-double-], using, as it happens, the very value of 0.1 discussed above. The result is that, if a query is planned in local catalog mode, partition pruning for "WHERE float_col=0.1" works, because the parser parses "0.1" directly from string to BigDecimal, then onto decimal. But, if the same query is planned in the traditional catalog mode, the extra Thrift conversion causes the bogus value shown above to be used in comparisons, resulting in a failed partition match. The temporary solution is to convert from double in Thrift back to string, and from string to BigDecimal. This is, obviously, quite silly. The bigger issue is that there is no good reason for the catalog server to parse partition keys into literal expressions only to be converted to Thrift. Better to leave the partition keys as strings and allow the coordinator to do any required parsing to literal expressions. Note that, in the current design, with code before a recent revision, the catalog server must analyze each literal expression, but there is no coordinator to provide the analyzer, so special code was needed to allow analysis with a null analyzer, needlessly complicating the logic.
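The BigDecimal behavior described above is easy to reproduce in plain Java, mirroring the example in the javadoc the issue cites:

```java
import java.math.BigDecimal;

public class DoubleKeyDemo {
    public static void main(String[] args) {
        // Parsing the partition-key string directly keeps the decimal value exact.
        BigDecimal fromString = new BigDecimal("0.1");

        // Round-tripping through a double (as the Thrift TExpr path does)
        // captures the binary approximation of 0.1 instead.
        BigDecimal fromDouble = new BigDecimal(0.1);

        System.out.println(fromString);  // 0.1
        System.out.println(fromDouble);  // 0.1000000000000000055511151231257827021181583404541015625

        // The two values differ, so an equality-based pruning comparison fails.
        System.out.println(fromString.equals(fromDouble));  // false

        // The temporary workaround described above: double -> String -> BigDecimal.
        BigDecimal viaString = new BigDecimal(Double.toString(0.1));
        System.out.println(fromString.compareTo(viaString) == 0);  // true
    }
}
```

The String-based constructor and {{BigDecimal.valueOf(double)}} both avoid the surprise; only the {{BigDecimal(double)}} constructor exposes the full binary expansion.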