[jira] [Commented] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212814#comment-17212814
 ] 

Tim Armstrong commented on IMPALA-9815:
---

It hit the same issue on a couple of jackson jars: 
org/codehaus/jackson/jackson-core-asl/1.9.13-cloudera.1/jackson-core-asl-1.9.13-cloudera.1.pom 
and 
org/codehaus/jackson/jackson-mapper-asl/1.9.13-cloudera.2/jackson-mapper-asl-1.9.13-cloudera.2.pom. 
Uploaded those too.

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
> Attachments: mvn.1602463897.937399486.log
>
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven/org/apache/hive/hive-exec/3.1.3000.7.2.1.0-112/hive-exec-3.1.3000.7.2.1.0-112.pom,
>  ReasonPhrase:Forbidden. -> [Help 1]
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 05:36:55 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 05:36:55 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> 05:36:55 mvn -U -s 
> /data/jenkins/workspace/impala-cdpd-master-core/repos/Impala-auxiliary-tests/jenkins/m2-settings.xml
>  -U -B install -DskipTests exited with code 0
> 05:36:55 make[2]: *** [shaded-deps/CMakeFiles/shaded-deps] Error 1
> 05:36:55 make[1]: *** [shaded-deps/CMakeFiles/shaded-deps.dir/all] Error 2
> 05:36:55 make[1]: *** Waiting for unfinished jobs
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9343) Ensure that multithreaded plans are shown correctly in exec summary, profile, etc.

2020-10-12 Thread Shant Hovsepian (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212798#comment-17212798
 ] 

Shant Hovsepian commented on IMPALA-9343:
-

Posted a partial fix on [https://gerrit.cloudera.org/c/16588/]

It treats JoinBuildSink fragments similarly to DataStreamSink fragments, so we 
get a dashed line into the join node to signify the fragment acting as the 
source. I'm not sure if we want to call out the join build instead.

!mt_plan.png!

> Ensure that multithreaded plans are shown correctly in exec summary, profile, 
> etc.
> --
>
> Key: IMPALA-9343
> URL: https://issues.apache.org/jira/browse/IMPALA-9343
> Project: IMPALA
>  Issue Type: Task
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Major
>  Labels: multithreading, observability
> Attachments: mt_dop_plan.json, mt_dop_webui_plan.png, mt_plan.png, 
> non_mt_dop_plan.json, non_mt_dop_webui_plan.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9343) Ensure that multithreaded plans are shown correctly in exec summary, profile, etc.

2020-10-12 Thread Shant Hovsepian (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shant Hovsepian updated IMPALA-9343:

Attachment: mt_plan.png

> Ensure that multithreaded plans are shown correctly in exec summary, profile, 
> etc.
> --
>
> Key: IMPALA-9343
> URL: https://issues.apache.org/jira/browse/IMPALA-9343
> Project: IMPALA
>  Issue Type: Task
>Reporter: Tim Armstrong
>Assignee: Joe McDonnell
>Priority: Major
>  Labels: multithreading, observability
> Attachments: mt_dop_plan.json, mt_dop_webui_plan.png, mt_plan.png, 
> non_mt_dop_plan.json, non_mt_dop_webui_plan.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212743#comment-17212743
 ] 

Tim Armstrong commented on IMPALA-9815:
---

It looks like the maven repo from the CDP dependencies references this Cloudera 
log4j version but does not actually include it. It really should include it.
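
For reference, a quick way to check whether the pom is actually present is to 
probe the URL from the failure below (the status codes are my assumption about 
how S3 reports a missing object):
{noformat}
# Probe the artifact URL from the Maven error. S3 typically returns
# 403 Forbidden for a missing object when the caller lacks list
# permission, which matches the "Access denied ... Forbidden" failure;
# a 200 would mean the pom is actually there.
curl -s -o /dev/null -w "%{http_code}\n" \
  "https://native-toolchain.s3.amazonaws.com/build/cdp_components/4493826/maven/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom"
{noformat}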

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
> Attachments: mvn.1602463897.937399486.log
>
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven/org/apache/hive/hive-exec/3.1.3000.7.2.1.0-112/hive-exec-3.1.3000.7.2.1.0-112.pom,
>  ReasonPhrase:Forbidden. -> [Help 1]
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 05:36:55 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 05:36:55 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> 05:36:55 mvn -U -s 
> /data/jenkins/workspace/impala-cdpd-master-core/repos/Impala-auxiliary-tests/jenkins/m2-settings.xml
>  -U -B install -DskipTests exited with code 0
> 05:36:55 make[2]: *** [shaded-deps/CMakeFiles/shaded-deps] Error 1
> 05:36:55 make[1]: *** [shaded-deps/CMakeFiles/shaded-deps.dir/all] Error 2
> 05:36:55 make[1]: *** Waiting for unfinished jobs
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212741#comment-17212741
 ] 

Tim Armstrong commented on IMPALA-9815:
---

It worked for me locally, but I'm rebuilding here to confirm: 
https://jenkins.impala.io/job/all-build-options-ub1604/6492

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
> Attachments: mvn.1602463897.937399486.log
>
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven/org/apache/hive/hive-exec/3.1.3000.7.2.1.0-112/hive-exec-3.1.3000.7.2.1.0-112.pom,
>  ReasonPhrase:Forbidden. -> [Help 1]
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 05:36:55 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 05:36:55 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> 05:36:55 mvn -U -s 
> /data/jenkins/workspace/impala-cdpd-master-core/repos/Impala-auxiliary-tests/jenkins/m2-settings.xml
>  -U -B install -DskipTests exited with code 0
> 05:36:55 make[2]: *** [shaded-deps/CMakeFiles/shaded-deps] Error 1
> 05:36:55 make[1]: *** [shaded-deps/CMakeFiles/shaded-deps.dir/all] Error 2
> 05:36:55 make[1]: *** Waiting for unfinished jobs
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212737#comment-17212737
 ] 

Tim Armstrong commented on IMPALA-9815:
---

As a workaround, I uploaded the artifacts to 
https://native-toolchain.s3.amazonaws.com/build/cdp_components/4493826/maven/log4j/log4j/1.2.17-cloudera1/.
However, this is not a permanent solution.
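
A minimal sketch of that upload, assuming AWS CLI credentials with write access 
to the native-toolchain bucket (the local file names are hypothetical; the 
bucket and key come from the URL above):
{noformat}
# Hypothetical sketch: copy the pom and jar into the maven layout used
# by the repo URL above.
aws s3 cp log4j-1.2.17-cloudera1.pom \
  "s3://native-toolchain/build/cdp_components/4493826/maven/log4j/log4j/1.2.17-cloudera1/"
aws s3 cp log4j-1.2.17-cloudera1.jar \
  "s3://native-toolchain/build/cdp_components/4493826/maven/log4j/log4j/1.2.17-cloudera1/"
{noformat}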

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
> Attachments: mvn.1602463897.937399486.log
>
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven/org/apache/hive/hive-exec/3.1.3000.7.2.1.0-112/hive-exec-3.1.3000.7.2.1.0-112.pom,
>  ReasonPhrase:Forbidden. -> [Help 1]
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 05:36:55 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 05:36:55 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> 05:36:55 mvn -U -s 
> /data/jenkins/workspace/impala-cdpd-master-core/repos/Impala-auxiliary-tests/jenkins/m2-settings.xml
>  -U -B install -DskipTests exited with code 0
> 05:36:55 make[2]: *** [shaded-deps/CMakeFiles/shaded-deps] Error 1
> 05:36:55 make[1]: *** [shaded-deps/CMakeFiles/shaded-deps.dir/all] Error 2
> 05:36:55 make[1]: *** Waiting for unfinished jobs
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212728#comment-17212728
 ] 

Tim Armstrong commented on IMPALA-9815:
---

{noformat}
tarmstrong@tarmstrong-Precision-7540:~/impala/impala/shaded-deps/hive-exec$ mvn 
-U package -DskipTests
[INFO] Scanning for projects...
[INFO]
[INFO] -< org.apache.impala:impala-minimal-hive-exec >-
[INFO] Building impala-minimal-hive-exec 0.1-SNAPSHOT
[INFO] [ jar ]-
Downloading from cdh.rcs.releases.repo: 
https://repository.cloudera.com/content/groups/cdh-releases-rcs/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
Downloading from impala.cdp.repo: 
https://native-toolchain.s3.amazonaws.com/build/cdp_components/4493826/maven/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
Downloading from impala.toolchain.kudu.repo: 
file:///home/tarmstrong/impala/impala/toolchain/toolchain-packages-gcc7.5.0/kudu-5ad5d3d66/java/repository/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
Downloading from cloudera.thirdparty.repo: 
https://repository.cloudera.com/content/repositories/third-party/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
Downloading from central: 
https://repo.maven.apache.org/maven2/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
Downloading from datanucleus: 
http://www.datanucleus.org/downloads/maven2/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 2.834 s
[INFO] Finished at: 2020-10-12T16:33:34-07:00
[INFO] 
[ERROR] Failed to execute goal on project impala-minimal-hive-exec: Could not 
resolve dependencies for project 
org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to collect 
dependencies at org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.hive:hive-llap-tez:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.hive:hive-common:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.hive:hive-shims:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.hive.shims:hive-shims-common:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.curator:curator-framework:jar:4.3.0.7.2.1.0-287 -> 
org.apache.curator:curator-client:jar:4.3.0.7.2.1.0-287 -> 
org.apache.zookeeper:zookeeper:jar:3.5.5.7.2.1.0-287 -> 
log4j:log4j:jar:1.2.17-cloudera1: Failed to read artifact descriptor for 
log4j:log4j:jar:1.2.17-cloudera1: Could not transfer artifact 
log4j:log4j:pom:1.2.17-cloudera1 from/to impala.cdp.repo 
(https://native-toolchain.s3.amazonaws.com/build/cdp_components/4493826/maven): 
Access denied to: 
https://native-toolchain.s3.amazonaws.com/build/cdp_components/4493826/maven/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
 , ReasonPhrase:Forbidden. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException

{noformat}

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
> Attachments: mvn.1602463897.937399486.log
>
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> 

[jira] [Commented] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212716#comment-17212716
 ] 

Tim Armstrong commented on IMPALA-9815:
---

In a previous successful build, this is where it was downloaded from:
{noformat}
00:49:06 [INFO] Downloading from cdh.rcs.releases.repo: 
https://repository.cloudera.com/content/groups/cdh-releases-rcs/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
00:49:06 [INFO] Downloaded from cdh.rcs.releases.repo: 
https://repository.cloudera.com/content/groups/cdh-releases-rcs/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
 (22 kB at 37 kB/s)
00:51:15 [INFO] Downloading from cdh.rcs.releases.repo: 
https://repository.cloudera.com/content/groups/cdh-releases-rcs/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.jar
00:51:16 [INFO] Downloaded from cdh.rcs.releases.repo: 
https://repository.cloudera.com/content/groups/cdh-releases-rcs/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.jar
 (492 kB at 113 kB/s)
{noformat}

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
> Attachments: mvn.1602463897.937399486.log
>
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven/org/apache/hive/hive-exec/3.1.3000.7.2.1.0-112/hive-exec-3.1.3000.7.2.1.0-112.pom,
>  ReasonPhrase:Forbidden. -> [Help 1]
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 05:36:55 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 05:36:55 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> 05:36:55 mvn -U -s 
> /data/jenkins/workspace/impala-cdpd-master-core/repos/Impala-auxiliary-tests/jenkins/m2-settings.xml
>  -U -B install -DskipTests exited with code 0
> 05:36:55 make[2]: *** [shaded-deps/CMakeFiles/shaded-deps] Error 1
> 05:36:55 make[1]: *** [shaded-deps/CMakeFiles/shaded-deps.dir/all] Error 2
> 05:36:55 make[1]: *** Waiting for unfinished jobs
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-1173) create-load-data.sh shouldn't try to do load-data.py --force when loading from a snapshot

2020-10-12 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-1173.
---
Resolution: Won't Fix

Dataload has gotten much faster, and it is rare to use snapshots to load data 
on a personal machine. If we decide to address deficiencies there, we'll open a 
new JIRA.

> create-load-data.sh shouldn't try to do load-data.py --force when loading 
> from a snapshot
> -
>
> Key: IMPALA-1173
> URL: https://issues.apache.org/jira/browse/IMPALA-1173
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.0
>Reporter: Daniel Hecht
>Assignee: Joe McDonnell
>Priority: Minor
>
> testdata/bin/create-load-data.sh first loads a snapshot.  Afterwards, it 
> checks to make sure the loaded schema matches that in git.  If it doesn't 
> match, it forces a reload through load-data.py.
> If the user supplied a snapshot file, then I think it would be better to fail 
> when the schema mismatch is detected rather than falling back to the 
> load_data.py --force path.  It seems more likely that the user would prefer 
> to download an updated snapshot to resolve the situation.
> This has burned me a couple of times now when I've downloaded snapshots in 
> the window between the schema update and when the new snapshot is ready. 
> Surprisingly (to me at least), the scripts went down the load_data.py --force 
> path, which led to another problem (which Lenni has since fixed). But it would 
> have been better if the script just told me that my snapshot is out of date 
> to begin with.
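
A minimal sketch of the fail-fast behavior described above, with hypothetical 
helper and variable names (this is not the actual create-load-data.sh code):
{noformat}
# Hypothetical sketch of the proposed check in create-load-data.sh.
if ! schemas_match_git; then            # hypothetical helper
  if [[ -n "$USER_SUPPLIED_SNAPSHOT" ]]; then
    echo "Snapshot schema does not match git; download a newer snapshot." >&2
    exit 1                              # fail fast instead of force-reloading
  fi
  load-data.py --force                  # existing fallback path
fi
{noformat}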



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-9320) test_udf_concurrency.TestUdfConcurrency.test_concurrent_jar_drop_use failed with error hdfs path doesn't exist

2020-10-12 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-9320.
---
Resolution: Cannot Reproduce

> test_udf_concurrency.TestUdfConcurrency.test_concurrent_jar_drop_use failed 
> with error hdfs path doesn't exist
> --
>
> Key: IMPALA-9320
> URL: https://issues.apache.org/jira/browse/IMPALA-9320
> Project: IMPALA
>  Issue Type: Test
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: Xiaomeng Zhang
>Assignee: Joe McDonnell
>Priority: Major
>  Labels: broken-build
>
> {code:java}
> custom_cluster/test_udf_concurrency.py:162: in test_concurrent_jar_drop_use
> self.filesystem_client.copy_from_local(udf_src_path, udf_tgt_path)
> util/hdfs_util.py:82: in copy_from_local
> self.hdfs_filesystem_client.copy_from_local(src, dst)
> util/hdfs_util.py:256: in copy_from_local
> src, dst) + stderr + '; ' + stdout
> E   AssertionError: HDFS copy from 
> /data/jenkins/workspace/impala-cdpd-master-exhaustive/repos/Impala/testdata/udfs/impala-hive-udfs.jar
>  to 
> /test-warehouse/test_concurrent_jar_drop_use_91093fa5.db/impala-hive-udfs.jar 
> failed: copyFromLocal: 
> `/test-warehouse/test_concurrent_jar_drop_use_91093fa5.db/impala-hive-udfs.jar':
>  No such file or directory: 
> `hdfs://localhost:20500/test-warehouse/test_concurrent_jar_drop_use_91093fa5.db/impala-hive-udfs.jar'
> E   ;
> {code}
> [https://master-02.jenkins.cloudera.com/job/impala-cdpd-master-exhaustive/244/testReport/junit/custom_cluster.test_udf_concurrency/TestUdfConcurrency/test_concurrent_jar_drop_use_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/]
> This test has been failing continuously in the last 10 builds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9815:
--
Attachment: mvn.1602463897.937399486.log

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000. 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
> Attachments: mvn.1602463897.937399486.log
>
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven/org/apache/hive/hive-exec/3.1.3000.7.2.1.0-112/hive-exec-3.1.3000.7.2.1.0-112.pom,
>  ReasonPhrase:Forbidden. -> [Help 1]
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 05:36:55 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 05:36:55 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> 05:36:55 mvn -U -s 
> /data/jenkins/workspace/impala-cdpd-master-core/repos/Impala-auxiliary-tests/jenkins/m2-settings.xml
>  -U -B install -DskipTests exited with code 0
> 05:36:55 make[2]: *** [shaded-deps/CMakeFiles/shaded-deps] Error 1
> 05:36:55 make[1]: *** [shaded-deps/CMakeFiles/shaded-deps.dir/all] Error 2
> 05:36:55 make[1]: *** Waiting for unfinished jobs
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-10043) Keep all the logs when using EE_TEST_SHARDS > 1

2020-10-12 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-10043.

Fix Version/s: Impala 4.0
   Resolution: Fixed

> Keep all the logs when using EE_TEST_SHARDS > 1
> ---
>
> Key: IMPALA-10043
> URL: https://issues.apache.org/jira/browse/IMPALA-10043
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Critical
> Fix For: Impala 4.0
>
>
> The fix for IMPALA-9887 speeds up ASAN builds by adding the ability to shard 
> EE tests and restart Impala between them. When EE_TEST_SHARDS is set, each 
> restart of Impala will generate new INFO, ERROR, WARNING glogs. 
> Unfortunately, the max_log_files is set to 10 by default, so the older logs 
> will be deleted to make way for the new logs.
> We should change it to a higher value to keep all of the logs when using 
> EE_TEST_SHARDS. This is something that we already do for custom cluster tests.
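
A sketch of the kind of change described, assuming max_log_files=0 means "keep 
all log files" (its documented behavior for Impala's logging flags) and using 
the dev cluster script's pass-through arguments:
{noformat}
# Sketch: when EE_TEST_SHARDS > 1, restart the minicluster with log
# rotation effectively disabled so earlier shards' logs survive.
bin/start-impala-cluster.py \
  --impalad_args=--max_log_files=0 \
  --catalogd_args=--max_log_files=0 \
  --state_store_args=--max_log_files=0
{noformat}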



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-9184) TestImpalaShellInteractive.test_ddl_queries_are_closed is flaky

2020-10-12 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9184.
---
Resolution: Cannot Reproduce

I tried to reproduce on an ASAN build locally with no luck. Please reopen if it 
reoccurs, but we should make sure to get more logs to help debug.

> TestImpalaShellInteractive.test_ddl_queries_are_closed is flaky
> ---
>
> Key: IMPALA-9184
> URL: https://issues.apache.org/jira/browse/IMPALA-9184
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: broken-build, flaky
>
> The following test is flaky:
> shell.test_shell_interactive.TestImpalaShellInteractive.test_ddl_queries_are_closed[table_format_and_file_extension:
>  ('textfile', '.txt') | protocol: beeswax] (from pytest)
> Error Message
> {code:java}
> AssertionError: drop query should be closed
> assert <bound method ImpaladService.wait_for_num_in_flight_queries of
>  <ImpaladService object at 0x8a1fad0>>(0)
>  + where <bound method ImpaladService.wait_for_num_in_flight_queries of
>  <ImpaladService object at 0x8a1fad0>> =
>  <ImpaladService object at 0x8a1fad0>.wait_for_num_in_flight_queries
> {code}
> Stacktrace
> {code:java}
> Impala/tests/shell/test_shell_interactive.py:338: in test_ddl_queries_are_closed
>     assert impalad.wait_for_num_in_flight_queries(0), MSG % 'drop'
> E   AssertionError: drop query should be closed
> E   assert <bound method ImpaladService.wait_for_num_in_flight_queries of
>  <ImpaladService object at 0x8a1fad0>>(0)
> E    + where <bound method ImpaladService.wait_for_num_in_flight_queries of
>  <ImpaladService object at 0x8a1fad0>> =
>  <ImpaladService object at 0x8a1fad0>.wait_for_num_in_flight_queries
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10234) impala-shell: add support for cookie-based authentication

2020-10-12 Thread Attila Jeges (Jira)
Attila Jeges created IMPALA-10234:
-

 Summary: impala-shell: add support for cookie-based authentication
 Key: IMPALA-10234
 URL: https://issues.apache.org/jira/browse/IMPALA-10234
 Project: IMPALA
  Issue Type: New Feature
  Components: Clients
Affects Versions: Impala 3.4.0
Reporter: Attila Jeges
Assignee: Attila Jeges
 Fix For: Impala 4.0


IMPALA-8584 added support for cookie-based authentication to Impala. We need to 
add cookie authentication support to impala-shell as well.
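
For illustration, the general shape of cookie-based authentication over the 
HTTP transport, as a generic curl sketch (not impala-shell's implementation; 
host, port, path, and credentials are placeholders):
{noformat}
# First request authenticates and stores whatever Set-Cookie the server
# returns; subsequent requests replay the cookie so the server can skip
# re-authenticating every RPC.
curl -c cookies.txt -u user:password "http://impalad-host:28000/cliservice"
curl -b cookies.txt "http://impalad-host:28000/cliservice"
{noformat}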



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8925) Consider replacing ClientRequestState ResultCache with result spooling

2020-10-12 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved IMPALA-8925.
--
Resolution: Later

This would be nice to have, but I don't see a strong reason to do it at the 
moment, so I'm closing as "Later".

> Consider replacing ClientRequestState ResultCache with result spooling
> --
>
> Key: IMPALA-8925
> URL: https://issues.apache.org/jira/browse/IMPALA-8925
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend, Clients
>Reporter: Sahil Takiar
>Priority: Minor
>
> The {{ClientRequestState}} maintains an internal results cache (which is 
> really just a {{QueryResultSet}}) in order to provide support for the 
> {{TFetchOrientation.FETCH_FIRST}} fetch orientation (used by Hue - see 
> [https://github.com/apache/impala/commit/6b769d011d2016a73483f63b311e108d17d9a083]).
> The cache itself has some limitations:
>  * It caches all results in a {{QueryResultSet}} with limited admission 
> control integration
>  * It has a max size, if the size is exceeded the cache is emptied
>  * It cannot spill to disk
> Result spooling could potentially replace the query result cache and provide 
> a few benefits; it should be able to fit more rows since it can spill to 
> disk. The memory is better tracked as well since it integrates with both 
> admitted and reserved memory. Hue currently sets the max result set fetch 
> size (see 
> [https://github.com/cloudera/hue/blob/master/apps/impala/src/impala/impala_flags.py#L61]);
> it would be good to check how well that value works for Hue users so we can 
> decide whether replacing the current result cache with result spooling makes sense.
> This would require some changes to result spooling as well, currently it 
> discards rows whenever it reads them from the underlying 
> {{BufferedTupleStream}}. It would need the ability to reset the read cursor, 
> which would require some changes to the {{PlanRootSink}} interface as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9828) Add support for spilling to S3

2020-10-12 Thread Jim Apple (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212558#comment-17212558
 ] 

Jim Apple commented on IMPALA-9828:
---

Got it. Thanks for the extra info!

> Add support for spilling to S3
> --
>
> Key: IMPALA-9828
> URL: https://issues.apache.org/jira/browse/IMPALA-9828
> Project: IMPALA
>  Issue Type: Epic
>  Components: Backend
>Reporter: Abhishek Rawat
>Assignee: Yida Wu
>Priority: Major
> Fix For: Impala 4.0
>
>
> Epic for spill to S3 support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10038) TestScannersFuzzing::()::test_fuzz_alltypes timed out after 2 hours (may be flaky)

2020-10-12 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-10038:
---
Priority: Critical  (was: Blocker)

> TestScannersFuzzing::()::test_fuzz_alltypes timed out after 2 hours (may be 
> flaky)
> --
>
> Key: IMPALA-10038
> URL: https://issues.apache.org/jira/browse/IMPALA-10038
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
> Environment: Centos 7.4, 16 vCPU, 64 GB RAM, data cache enabled
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Critical
>  Labels: broken-build
>
> This was seen on CentOS 7.4, with the data cache enabled, during an exhaustive 
> run.
> Test step:
> {code}
> query_test.test_scanners_fuzz.TestScannersFuzzing.test_fuzz_alltypes[protocol:
>  beeswax | exec_option: {'debug_action': 
> '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 'abort_on_error': False, 
> 'mem_limit': '512m', 'num_nodes': 0} | table_format: avro/none]
> {code}
> Test backtrace:
> {code}
> query_test/test_scanners_fuzz.py:82: in test_fuzz_alltypes
> self.run_fuzz_test(vector, src_db, table_name, unique_database, 
> table_name)
> query_test/test_scanners_fuzz.py:238: in run_fuzz_test
> result = self.execute_query(query, query_options = query_options)
> common/impala_test_suite.py:811: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:843: in execute_query
> return self.__execute_query(self.client, query, query_options)
> common/impala_test_suite.py:909: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:205: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:365: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:389: in wait_for_finished
> time.sleep(0.05)
> E   Failed: Timeout >7200s
> {code}
> Captured stderr:
> {code}
> ~ Stack of  (139967888086784) 
> ~
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 277, in _perform_spawn
> reply.run()
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 213, in run
> self._result = func(*args, **kwargs)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 954, in _thread_receiver
> msg = Message.from_io(io)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 418, in from_io
> header = io.read(9)  # type 1, channel 4, payload 4
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 386, in read
> data = self._read(numbytes-len(buf))
> ERROR:impala_test_suite:Should not throw error when abort_on_error=0: 
> 'Timeout >7200s'
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9828) Add support for spilling to S3

2020-10-12 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212543#comment-17212543
 ] 

Tim Armstrong commented on IMPALA-9828:
---

[~jbapple] Yeah, exactly: a lot of cloud instances have fairly constrained 
local storage, so this is a way to allow queries to run to completion when local 
storage is exhausted.

Mounting additional non-local block storage like EBS is an alternative, but 
then you either need to pre-provision the volumes (requires planning and/or 
gets expensive) or come up with some kind of dynamic provisioning scheme 
(complex, not portable across cloud providers, etc).

BTW, this approach isn't really limited to S3; it could be extended to any 
other storage with an HDFS connector.
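
For context, a purely hypothetical sketch of what the configuration could look 
like, assuming the epic ends up exposing remote scratch space through the 
existing scratch directory flag (the actual syntax is still being defined):
{noformat}
# Hypothetical only: a local scratch dir acting as a buffer, plus an S3
# path reached via the HDFS S3A connector for overflow.
impalad --scratch_dirs=/tmp/impala-scratch,s3a://my-bucket/impala-scratch ...
{noformat}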

> Add support for spilling to S3
> --
>
> Key: IMPALA-9828
> URL: https://issues.apache.org/jira/browse/IMPALA-9828
> Project: IMPALA
>  Issue Type: Epic
>  Components: Backend
>Reporter: Abhishek Rawat
>Assignee: Yida Wu
>Priority: Major
> Fix For: Impala 4.0
>
>
> Epic for spill to S3 support.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder

2020-10-12 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-10233:
---
Priority: Blocker  (was: Critical)

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned 
> table with zorder
> --
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Blocker
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
> a6479cc4725101fd:b86db2a10003] Check failed: 
> per_partition_status_.find(name) == per_partition_status_.end() 
> *** Check failure stack trace: *** 
> @  0x51ff3cc  google::LogMessage::Fail()
> @  0x5200cbc  google::LogMessage::SendToLog()
> @  0x51fed2a  google::LogMessage::Flush()
> @  0x5202928  google::LogMessageFatal::~LogMessageFatal()
> @  0x234ba18  impala::DmlExecState::AddPartition()
> @  0x2817786  impala::HdfsTableSink::GetOutputPartition()
> @  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
> @  0x28156c4  impala::HdfsTableSink::Send()
> @  0x23139dd  impala::FragmentInstanceState::ExecInternal()
> @  0x230fe10  impala::FragmentInstanceState::Exec()
> @  0x227bb79  impala::QueryState::ExecFInstance()
> @  0x2279f7b  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x227e2c2  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2137699  boost::function0<>::operator()()
> @  0x2715d7d  impala::Thread::SuperviseThread()
> @  0x271dd1a  boost::_bi::list5<>::operator()<>()
> @  0x271dc3e  boost::_bi::bind_t<>::operator()()
> @  0x271dbff  boost::detail::thread_data<>::run()
> @  0x3f05f01  thread_proxy
> @ 0x7fb18bebb6b9  start_thread
> @ 0x7fb188a474dc  clone {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys. 
> This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that 
> input must be ordered by the partition key expressions. So a partition key 
> was deleted and then inserted again into the 
> {{partition_keys_to_output_partitions_}} map.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their 
> temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) 
> WARN_UNUSED_RESULT;
> {code}
> The key got removed here: 
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
>  when processing a new partition key.
> It got reinserted here: 
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
>  so hit the DCHECK.
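
A hypothetical repro sketch based on the description above (table, column, and 
database names are made up):
{noformat}
# Hypothetical repro: insert into a partitioned Parquet table declared
# with SORT BY ZORDER, so rows are z-ordered rather than clustered by
# the partition key before reaching the table sink.
impala-shell -q "
  CREATE TABLE zorder_repro (a INT, b INT)
    PARTITIONED BY (p INT) SORT BY ZORDER (a, b) STORED AS PARQUET;
  INSERT INTO zorder_repro PARTITION (p)
    SELECT id, int_col, id % 10 FROM functional.alltypes;"
{noformat}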



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder

2020-10-12 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-10233:
---
Component/s: Backend

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned 
> table with zorder
> --
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Blocker
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
> a6479cc4725101fd:b86db2a10003] Check failed: 
> per_partition_status_.find(name) == per_partition_status_.end() 
> *** Check failure stack trace: *** 
> @  0x51ff3cc  google::LogMessage::Fail()
> @  0x5200cbc  google::LogMessage::SendToLog()
> @  0x51fed2a  google::LogMessage::Flush()
> @  0x5202928  google::LogMessageFatal::~LogMessageFatal()
> @  0x234ba18  impala::DmlExecState::AddPartition()
> @  0x2817786  impala::HdfsTableSink::GetOutputPartition()
> @  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
> @  0x28156c4  impala::HdfsTableSink::Send()
> @  0x23139dd  impala::FragmentInstanceState::ExecInternal()
> @  0x230fe10  impala::FragmentInstanceState::Exec()
> @  0x227bb79  impala::QueryState::ExecFInstance()
> @  0x2279f7b  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x227e2c2  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2137699  boost::function0<>::operator()()
> @  0x2715d7d  impala::Thread::SuperviseThread()
> @  0x271dd1a  boost::_bi::list5<>::operator()<>()
> @  0x271dc3e  boost::_bi::bind_t<>::operator()()
> @  0x271dbff  boost::detail::thread_data<>::run()
> @  0x3f05f01  thread_proxy
> @ 0x7fb18bebb6b9  start_thread
> @ 0x7fb188a474dc  clone {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys. 
> This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that 
> input must be ordered by the partition key expressions. So a partition key 
> was deleted and then inserted again into the 
> {{partition_keys_to_output_partitions_}} map.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their 
> temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) 
> WARN_UNUSED_RESULT;
> {code}
> The key got removed here: 
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
>  when processing a new partition key.
> It got reinserted here: 
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
>  so hit the DCHECK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder

2020-10-12 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-10233:
---
Target Version: Impala 4.0

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned 
> table with zorder
> --
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Blocker
>  Labels: crash
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
> a6479cc4725101fd:b86db2a10003] Check failed: 
> per_partition_status_.find(name) == per_partition_status_.end() 
> *** Check failure stack trace: *** 
> @  0x51ff3cc  google::LogMessage::Fail()
> @  0x5200cbc  google::LogMessage::SendToLog()
> @  0x51fed2a  google::LogMessage::Flush()
> @  0x5202928  google::LogMessageFatal::~LogMessageFatal()
> @  0x234ba18  impala::DmlExecState::AddPartition()
> @  0x2817786  impala::HdfsTableSink::GetOutputPartition()
> @  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
> @  0x28156c4  impala::HdfsTableSink::Send()
> @  0x23139dd  impala::FragmentInstanceState::ExecInternal()
> @  0x230fe10  impala::FragmentInstanceState::Exec()
> @  0x227bb79  impala::QueryState::ExecFInstance()
> @  0x2279f7b  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x227e2c2  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2137699  boost::function0<>::operator()()
> @  0x2715d7d  impala::Thread::SuperviseThread()
> @  0x271dd1a  boost::_bi::list5<>::operator()<>()
> @  0x271dc3e  boost::_bi::bind_t<>::operator()()
> @  0x271dbff  boost::detail::thread_data<>::run()
> @  0x3f05f01  thread_proxy
> @ 0x7fb18bebb6b9  start_thread
> @ 0x7fb188a474dc  clone {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys. 
> This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that 
> input must be ordered by the partition key expressions. So a partition key 
> was deleted and then inserted again into the 
> {{partition_keys_to_output_partitions_}} map.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their 
> temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) 
> WARN_UNUSED_RESULT;
> {code}
> The key got removed here: 
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
>  when processing a new partition key.
> It got reinserted here: 
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
>  so hit the DCHECK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder

2020-10-12 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17212500#comment-17212500
 ] 

Tim Armstrong commented on IMPALA-10233:


[~luksan] [~boroknagyz] this seems bad.

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned 
> table with zorder
> --
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Blocker
>  Labels: crash
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
> a6479cc4725101fd:b86db2a10003] Check failed: 
> per_partition_status_.find(name) == per_partition_status_.end() 
> *** Check failure stack trace: *** 
> @  0x51ff3cc  google::LogMessage::Fail()
> @  0x5200cbc  google::LogMessage::SendToLog()
> @  0x51fed2a  google::LogMessage::Flush()
> @  0x5202928  google::LogMessageFatal::~LogMessageFatal()
> @  0x234ba18  impala::DmlExecState::AddPartition()
> @  0x2817786  impala::HdfsTableSink::GetOutputPartition()
> @  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
> @  0x28156c4  impala::HdfsTableSink::Send()
> @  0x23139dd  impala::FragmentInstanceState::ExecInternal()
> @  0x230fe10  impala::FragmentInstanceState::Exec()
> @  0x227bb79  impala::QueryState::ExecFInstance()
> @  0x2279f7b  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x227e2c2  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2137699  boost::function0<>::operator()()
> @  0x2715d7d  impala::Thread::SuperviseThread()
> @  0x271dd1a  boost::_bi::list5<>::operator()<>()
> @  0x271dc3e  boost::_bi::bind_t<>::operator()()
> @  0x271dbff  boost::detail::thread_data<>::run()
> @  0x3f05f01  thread_proxy
> @ 0x7fb18bebb6b9  start_thread
> @ 0x7fb188a474dc  clone {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys. 
> This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that 
> input must be ordered by the partition key expressions. So a partition key 
> was deleted and then inserted again into the 
> {{partition_keys_to_output_partitions_}} map.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their 
> temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) 
> WARN_UNUSED_RESULT;
> {code}
> The key got removed here: 
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
>  when processing a new partition key.
> It got reinserted here: 
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
>  so hit the DCHECK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder

2020-10-12 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-10233:
---
Labels: crash  (was: )

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned 
> table with zorder
> --
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Blocker
>  Labels: crash
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
> a6479cc4725101fd:b86db2a10003] Check failed: 
> per_partition_status_.find(name) == per_partition_status_.end() 
> *** Check failure stack trace: *** 
> @  0x51ff3cc  google::LogMessage::Fail()
> @  0x5200cbc  google::LogMessage::SendToLog()
> @  0x51fed2a  google::LogMessage::Flush()
> @  0x5202928  google::LogMessageFatal::~LogMessageFatal()
> @  0x234ba18  impala::DmlExecState::AddPartition()
> @  0x2817786  impala::HdfsTableSink::GetOutputPartition()
> @  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
> @  0x28156c4  impala::HdfsTableSink::Send()
> @  0x23139dd  impala::FragmentInstanceState::ExecInternal()
> @  0x230fe10  impala::FragmentInstanceState::Exec()
> @  0x227bb79  impala::QueryState::ExecFInstance()
> @  0x2279f7b  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x227e2c2  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2137699  boost::function0<>::operator()()
> @  0x2715d7d  impala::Thread::SuperviseThread()
> @  0x271dd1a  boost::_bi::list5<>::operator()<>()
> @  0x271dc3e  boost::_bi::bind_t<>::operator()()
> @  0x271dbff  boost::detail::thread_data<>::run()
> @  0x3f05f01  thread_proxy
> @ 0x7fb18bebb6b9  start_thread
> @ 0x7fb188a474dc  clone {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys.
> This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that
> its input is ordered by the partition key expressions, so a partition key
> was deleted and then inserted again into the
> {{partition_keys_to_output_partitions_}} map.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) WARN_UNUSED_RESULT;
> {code}
> The key was removed here:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
> when a new partition key was processed, and it was reinserted here:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
> which triggered the DCHECK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10055) DCHECK was hit while executing e2e test TestQueries::test_subquery

2020-10-12 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212499#comment-17212499
 ] 

Sahil Takiar commented on IMPALA-10055:
---

Saw this again recently; any plans for a fix?

> DCHECK was hit while executing e2e test TestQueries::test_subquery
> --
>
> Key: IMPALA-10055
> URL: https://issues.apache.org/jira/browse/IMPALA-10055
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Zoltán Borók-Nagy
>Priority: Blocker
>  Labels: broken-build, crash, flaky
> Fix For: Impala 4.0
>
>
> A DCHECK was hit while executing an e2e test. The time frame suggests that it
> may have happened while executing TestQueries::test_subquery:
> {code}
> query_test/test_queries.py:149: in test_subquery
> self.run_test_case('QueryTest/subquery', vector)
> common/impala_test_suite.py:662: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:600: in __exec_in_impala
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:909: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:334: in execute
> r = self.__fetch_results(handle, profile_format=profile_format)
> common/impala_connection.py:436: in __fetch_results
> result_tuples = cursor.fetchall()
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:532:
>  in fetchall
> self._wait_to_finish()
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:405:
>  in _wait_to_finish
> resp = self._last_operation._rpc('GetOperationStatus', req)
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:992:
>  in _rpc
> response = self._execute(func_name, request)
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:1023:
>  in _execute
> .format(self.retries))
> E   HiveServer2Error: Failed after retrying 3 times
> {code}
> impalad log:
> {code}
> Log file created at: 2020/08/05 17:34:30
> Running on machine: 
> impala-ec2-centos74-m5-4xlarge-ondemand-18a5.vpc.cloudera.com
> Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0805 17:34:30.003247 10887 orc-column-readers.cc:423]
> c34e87376f496a53:7ba6a2e40002] Check failed:
> (scanner_->row_batches_need_validation_ && scanner_->scan_node_->IsZeroSlotTableScan()) ||
> scanner_->acid_original_file
> {code}
> Stack trace:
> {code}
> CORE: ./fe/core.1596674070.14179.impalad
> BINARY: ./be/build/latest/service/impalad
> Core was generated by 
> `/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/build/lat'.
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
> To enable execution of this file add
>   add-auto-load-safe-path 
> /data0/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib64/libstdc++.so.6.0.24-gdb.py
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> To completely disable this security protection add
>   set auto-load safe-path /
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> For more information about this security protection see the
> "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
>   info "(gdb)Auto-loading safe path"
> #0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
> #1  0x7efd6ec6f8e8 in abort () from /lib64/libc.so.6
> #2  0x086b8ea4 in google::DumpStackTraceAndExit() ()
> #3  0x086ae25d in google::LogMessage::Fail() ()
> #4  0x086afb4d in google::LogMessage::SendToLog() ()
> #5  0x086adbbb in google::LogMessage::Flush() ()
> #6  0x086b17b9 in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x0388e10a in impala::OrcStructReader::TopLevelReadValueBatch 
> (this=0x61162630, scratch_batch=0x824831e0, pool=0x82483258) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/orc-column-readers.cc:421
> #8  0x03810c92 in impala::HdfsOrcScanner::TransferTuples 
> (this=0x27143c00, dst_batch=0x2e5ca820) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:808
> #9  0x03814e2a in impala::HdfsOrcScanner::AssembleRows 
> 

[jira] [Updated] (IMPALA-10193) Limit the memory usage of the whole mini-cluster

2020-10-12 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-10193:
---
Fix Version/s: Impala 4.0

> Limit the memory usage of the whole mini-cluster
> 
>
> Key: IMPALA-10193
> URL: https://issues.apache.org/jira/browse/IMPALA-10193
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.4.0
>Reporter: Fifteen
>Assignee: Fifteen
>Priority: Minor
> Fix For: Impala 4.0
>
> Attachments: image-2020-09-28-17-18-15-358.png
>
>
> The mini-cluster contains 3 virtual nodes, all of which run on a single
> 'machine'. The quotes are deliberate: the machine can be a Docker container. If
> the container is started with `--privileged` and its memory is limited
> via cgroups, then the total memory shown in `htop` and the memory actually
> available can differ!
>  
> For example, in the container below, `htop` tells us the total memory is
> 128GB, while the cgroup memory limit is actually 32GB. If actual memory
> usage exceeds 32GB, processes (such as impalad and hiveserver2) get
> killed.
>   !image-2020-09-28-17-18-15-358.png!
>  
> So we may need a way to limit the whole mini-cluster's memory usage.
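A hedged sketch of one way to detect the mismatch (assumes the cgroup v1 path; this is not existing Impala code): compare the physical RAM that htop-style tools see with the cgroup limit, and budget the mini-cluster from the smaller of the two.

{code:c++}
#include <algorithm>
#include <fstream>
#include <iostream>
#include <unistd.h>

int main() {
  // What htop reports: physical pages * page size.
  const long long total_ram =
      static_cast<long long>(sysconf(_SC_PHYS_PAGES)) * sysconf(_SC_PAGESIZE);

  // What the container may actually use (cgroup v1). An unset limit reads
  // back as a huge sentinel value, so std::min below falls back to total_ram.
  long long cgroup_limit = total_ram;
  std::ifstream limit_file("/sys/fs/cgroup/memory/memory.limit_in_bytes");
  if (limit_file) limit_file >> cgroup_limit;

  const long long budget = std::min(total_ram, cgroup_limit);
  std::cout << "mini-cluster memory budget: " << budget << " bytes\n";
  return 0;
}
{code}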



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10220) Min value of RpcNetworkTime can be negative

2020-10-12 Thread Riza Suminto (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212488#comment-17212488
 ] 

Riza Suminto commented on IMPALA-10220:
---

CR is here: https://gerrit.cloudera.org/c/16552/

> Min value of RpcNetworkTime can be negative
> ---
>
> Key: IMPALA-10220
> URL: https://issues.apache.org/jira/browse/IMPALA-10220
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Affects Versions: Impala 3.4.0
>Reporter: Riza Suminto
>Assignee: Riza Suminto
>Priority: Major
>
> There is a bug in function 
> KrpcDataStreamSender::Channel::EndDataStreamCompleteCb(), particularly in 
> this line:
> [https://github.com/apache/impala/blob/d453d52/be/src/runtime/krpc-data-stream-sender.cc#L635]
> network_time_ns should be computed using eos_rsp_.receiver_latency_ns() 
> instead of resp_.receiver_latency_ns().
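A minimal, runnable model of the callback math (the names mirror the report; the real members live in KrpcDataStreamSender::Channel, so treat this as an illustration rather than the patch itself). Using the previous row-batch response's receiver latency instead of the EOS response's own latency can exceed the EOS RPC's duration and yield a negative value:

{code:c++}
#include <cstdint>
#include <iostream>

// Network time = RPC wall-clock duration minus the receiver-side processing
// latency reported in that same RPC's response.
int64_t NetworkTimeNs(int64_t total_time_ns, int64_t receiver_latency_ns) {
  return total_time_ns - receiver_latency_ns;
}

int main() {
  const int64_t eos_total_ns = 2000000;        // EOS RPC took 2 ms
  const int64_t eos_rsp_latency_ns = 1500000;  // latency from eos_rsp_ (correct)
  const int64_t resp_latency_ns = 3000000;     // stale latency from resp_ (the bug)
  std::cout << NetworkTimeNs(eos_total_ns, eos_rsp_latency_ns) << "\n";  // 500000
  std::cout << NetworkTimeNs(eos_total_ns, resp_latency_ns) << "\n";     // -1000000
  return 0;
}
{code}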



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10224) Add startup flag not to expose debug web url via PingImpalaService/PingImpalaHS2Service RPC

2020-10-12 Thread Attila Jeges (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212250#comment-17212250
 ] 

Attila Jeges commented on IMPALA-10224:
---

CR: https://gerrit.cloudera.org/#/c/16573/

> Add startup flag not to expose debug web url via 
> PingImpalaService/PingImpalaHS2Service RPC
> ---
>
> Key: IMPALA-10224
> URL: https://issues.apache.org/jira/browse/IMPALA-10224
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: Attila Jeges
>Assignee: Attila Jeges
>Priority: Major
>
> PingImpalaService/PingImpalaHS2Service RPC calls expose the coordinator's
> debug web URL to clients like impala-shell. Since the debug web UI is not
> something that end-users will necessarily have access to, we should have a
> server option to send an empty string instead of the real URL, signalling to
> the Impala client that the debug web UI is not available.
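A minimal sketch of such an option (the flag name and helper are illustrative assumptions, not the committed change; Impala uses gflags for startup flags):

{code:c++}
#include <gflags/gflags.h>
#include <iostream>
#include <string>

// Hypothetical startup flag gating what the ping RPCs report.
DEFINE_bool(expose_webserver_url, true,
    "If false, PingImpalaService/PingImpalaHS2Service report an empty debug "
    "web URL to clients such as impala-shell.");

// Illustrative helper: the URL to place in the ping response.
std::string WebserverUrlForPing(const std::string& real_url) {
  return FLAGS_expose_webserver_url ? real_url : "";
}

int main(int argc, char** argv) {
  gflags::ParseCommandLineFlags(&argc, &argv, true);
  std::cout << "url reported to ping: '"
            << WebserverUrlForPing("http://coordinator:25000") << "'\n";
  return 0;
}
{code}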



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder

2020-10-12 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-10233:

Priority: Critical  (was: Major)

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned 
> table with zorder
> --
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
> a6479cc4725101fd:b86db2a10003] Check failed: 
> per_partition_status_.find(name) == per_partition_status_.end() 
> *** Check failure stack trace: *** 
> @  0x51ff3cc  google::LogMessage::Fail()
> @  0x5200cbc  google::LogMessage::SendToLog()
> @  0x51fed2a  google::LogMessage::Flush()
> @  0x5202928  google::LogMessageFatal::~LogMessageFatal()
> @  0x234ba18  impala::DmlExecState::AddPartition()
> @  0x2817786  impala::HdfsTableSink::GetOutputPartition()
> @  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
> @  0x28156c4  impala::HdfsTableSink::Send()
> @  0x23139dd  impala::FragmentInstanceState::ExecInternal()
> @  0x230fe10  impala::FragmentInstanceState::Exec()
> @  0x227bb79  impala::QueryState::ExecFInstance()
> @  0x2279f7b  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x227e2c2  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2137699  boost::function0<>::operator()()
> @  0x2715d7d  impala::Thread::SuperviseThread()
> @  0x271dd1a  boost::_bi::list5<>::operator()<>()
> @  0x271dc3e  boost::_bi::bind_t<>::operator()()
> @  0x271dbff  boost::detail::thread_data<>::run()
> @  0x3f05f01  thread_proxy
> @ 0x7fb18bebb6b9  start_thread
> @ 0x7fb188a474dc  clone {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys.
> This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that
> its input is ordered by the partition key expressions, so a partition key
> was deleted and then inserted again into the
> {{partition_keys_to_output_partitions_}} map.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) WARN_UNUSED_RESULT;
> {code}
> The key was removed here:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
> when a new partition key was processed, and it was reinserted here:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
> which triggered the DCHECK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-10166) ALTER TABLE for Iceberg tables

2020-10-12 Thread WangSheng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10166 started by WangSheng.
--
> ALTER TABLE for Iceberg tables
> --
>
> Key: IMPALA-10166
> URL: https://issues.apache.org/jira/browse/IMPALA-10166
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Zoltán Borók-Nagy
>Assignee: WangSheng
>Priority: Major
>  Labels: impala-iceberg
>
> Add support for ALTER TABLE operations for Iceberg tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10166) ALTER TABLE for Iceberg tables

2020-10-12 Thread WangSheng (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212239#comment-17212239
 ] 

WangSheng commented on IMPALA-10166:


Hi [~boroknagyz], I will try to implement this as soon as possible.

> ALTER TABLE for Iceberg tables
> --
>
> Key: IMPALA-10166
> URL: https://issues.apache.org/jira/browse/IMPALA-10166
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Zoltán Borók-Nagy
>Assignee: WangSheng
>Priority: Major
>  Labels: impala-iceberg
>
> Add support for ALTER TABLE operations for Iceberg tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10166) ALTER TABLE for Iceberg tables

2020-10-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/IMPALA-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212234#comment-17212234
 ] 

Zoltán Borók-Nagy commented on IMPALA-10166:


Hi [~skyyws], thanks for picking this up!

> ALTER TABLE for Iceberg tables
> --
>
> Key: IMPALA-10166
> URL: https://issues.apache.org/jira/browse/IMPALA-10166
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Zoltán Borók-Nagy
>Assignee: WangSheng
>Priority: Major
>  Labels: impala-iceberg
>
> Add support for ALTER TABLE operations for Iceberg tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212230#comment-17212230
 ] 

Quanlong Huang edited comment on IMPALA-9815 at 10/12/20, 8:30 AM:
---

See this again in a recent build: 
https://jenkins.impala.io/job/all-build-options-ub1604/6484/console

{code}
[ERROR] Failed to execute goal on project impala-minimal-hive-exec: Could not 
resolve dependencies for project 
org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to collect 
dependencies at org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.hive:hive-llap-tez:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.hive:hive-common:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.hive:hive-shims:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.hive.shims:hive-shims-common:jar:3.1.3000.7.2.1.0-287 -> 
org.apache.curator:curator-framework:jar:4.3.0.7.2.1.0-287 -> 
org.apache.curator:curator-client:jar:4.3.0.7.2.1.0-287 -> 
org.apache.zookeeper:zookeeper:jar:3.5.5.7.2.1.0-287 -> 
log4j:log4j:jar:1.2.17-cloudera1: Failed to read artifact descriptor for 
log4j:log4j:jar:1.2.17-cloudera1: Could not transfer artifact 
log4j:log4j:pom:1.2.17-cloudera1 from/to impala.cdp.repo 
(https://native-toolchain.s3.amazonaws.com/build/cdp_components/4493826/maven): 
Access denied to: 
https://native-toolchain.s3.amazonaws.com/build/cdp_components/4493826/maven/log4j/log4j/1.2.17-cloudera1/log4j-1.2.17-cloudera1.pom
 , ReasonPhrase:Forbidden. -> [Help 1]
{code}


was (Author: stiga-huang):
See this again in a recent build: 
https://jenkins.impala.io/job/all-build-options-ub1604/6484/console

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000. 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven/org/apache/hive/hive-exec/3.1.3000.7.2.1.0-112/hive-exec-3.1.3000.7.2.1.0-112.pom,
>  ReasonPhrase:Forbidden. -> [Help 1]
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 05:36:55 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 05:36:55 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> 05:36:55 mvn -U -s 
> /data/jenkins/workspace/impala-cdpd-master-core/repos/Impala-auxiliary-tests/jenkins/m2-settings.xml
>  -U -B install -DskipTests exited with code 0
> 05:36:55 make[2]: *** [shaded-deps/CMakeFiles/shaded-deps] Error 1
> 05:36:55 make[1]: *** [shaded-deps/CMakeFiles/shaded-deps.dir/all] Error 2
> 05:36:55 make[1]: *** Waiting for unfinished jobs
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-10056) Keep the HDFS / Kudu cluster logs for the docker-based tests

2020-10-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Garaguly reassigned IMPALA-10056:


Assignee: Zoltán Garaguly

> Keep the HDFS / Kudu cluster logs for the docker-based tests
> 
>
> Key: IMPALA-10056
> URL: https://issues.apache.org/jira/browse/IMPALA-10056
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Zoltán Garaguly
>Priority: Major
>
> The Impala test environment has a symlink from logs/cluster/cdh7-node-* to 
> locations in testdata/cluster. When running the docker-based tests, the logs/ 
> directory is preserved beyond the lifetime of the container. However, 
> testdata/cluster is not preserved, so the symlinks are not valid and those 
> logs are not currently preserved.
> The HDFS and Kudu logs in logs/cluster/cdh7-node-* are very useful, so we 
> should preserve them. One option is to copy them to the logs/ directory just 
> before stopping the container.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Reopened] (IMPALA-9815) Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000.xxxx during build

2020-10-12 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reopened IMPALA-9815:


See this again in a recent build: 
https://jenkins.impala.io/job/all-build-options-ub1604/6484/console

> Intermittent failure downloading org.apache.hive:hive-exec:jar:3.1.3000. 
> during build
> -
>
> Key: IMPALA-9815
> URL: https://issues.apache.org/jira/browse/IMPALA-9815
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build, flaky
>
> This is an intermittent failure; sometimes 
> org.apache.hive:hive-exec:jar:3.1.3000 fails to be downloaded, breaking the 
> build. One telltale sign is a build failure happening early, at about 5 
> minutes into the build. The build error signature is:
> {code}
> 05:36:55 [ERROR] Failed to execute goal on project impala-minimal-hive-exec: 
> Could not resolve dependencies for project 
> org.apache.impala:impala-minimal-hive-exec:jar:0.1-SNAPSHOT: Failed to 
> collect dependencies for [org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112 
> (compile)]: Failed to read artifact descriptor for 
> org.apache.hive:hive-exec:jar:3.1.3000.7.2.1.0-112: Could not transfer 
> artifact org.apache.hive:hive-exec:pom:3.1.3000.7.2.1.0-112 from/to 
> impala.cdh.repo 
> (https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven):
>  Access denied to: 
> https://native-toolchain.s3.amazonaws.com/build/cdh_components/1814051/maven/org/apache/hive/hive-exec/3.1.3000.7.2.1.0-112/hive-exec-3.1.3000.7.2.1.0-112.pom,
>  ReasonPhrase:Forbidden. -> [Help 1]
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 05:36:55 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 05:36:55 [ERROR] 
> 05:36:55 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 05:36:55 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> 05:36:55 mvn -U -s 
> /data/jenkins/workspace/impala-cdpd-master-core/repos/Impala-auxiliary-tests/jenkins/m2-settings.xml
>  -U -B install -DskipTests exited with code 0
> 05:36:55 make[2]: *** [shaded-deps/CMakeFiles/shaded-deps] Error 1
> 05:36:55 make[1]: *** [shaded-deps/CMakeFiles/shaded-deps.dir/all] Error 2
> 05:36:55 make[1]: *** Waiting for unfinished jobs
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder

2020-10-12 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reassigned IMPALA-10233:
---

Assignee: Quanlong Huang

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned 
> table with zorder
> --
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Major
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
> a6479cc4725101fd:b86db2a10003] Check failed: 
> per_partition_status_.find(name) == per_partition_status_.end() 
> *** Check failure stack trace: *** 
> @  0x51ff3cc  google::LogMessage::Fail()
> @  0x5200cbc  google::LogMessage::SendToLog()
> @  0x51fed2a  google::LogMessage::Flush()
> @  0x5202928  google::LogMessageFatal::~LogMessageFatal()
> @  0x234ba18  impala::DmlExecState::AddPartition()
> @  0x2817786  impala::HdfsTableSink::GetOutputPartition()
> @  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
> @  0x28156c4  impala::HdfsTableSink::Send()
> @  0x23139dd  impala::FragmentInstanceState::ExecInternal()
> @  0x230fe10  impala::FragmentInstanceState::Exec()
> @  0x227bb79  impala::QueryState::ExecFInstance()
> @  0x2279f7b  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x227e2c2  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2137699  boost::function0<>::operator()()
> @  0x2715d7d  impala::Thread::SuperviseThread()
> @  0x271dd1a  boost::_bi::list5<>::operator()<>()
> @  0x271dc3e  boost::_bi::bind_t<>::operator()()
> @  0x271dbff  boost::detail::thread_data<>::run()
> @  0x3f05f01  thread_proxy
> @ 0x7fb18bebb6b9  start_thread
> @ 0x7fb188a474dc  clone {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys.
> This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that
> its input is ordered by the partition key expressions, so a partition key
> was deleted and then inserted again into the
> {{partition_keys_to_output_partitions_}} map.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) WARN_UNUSED_RESULT;
> {code}
> The key was removed here:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
> when a new partition key was processed, and it was reinserted here:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
> which triggered the DCHECK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned table with zorder

2020-10-12 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-10233:

Summary: Hit DCHECK in DmlExecState::AddPartition when inserting to a 
partitioned table with zorder  (was: Hit DCHECK in DmlExecState::AddPartition)

> Hit DCHECK in DmlExecState::AddPartition when inserting to a partitioned 
> table with zorder
> --
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Priority: Major
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
> a6479cc4725101fd:b86db2a10003] Check failed: 
> per_partition_status_.find(name) == per_partition_status_.end() 
> *** Check failure stack trace: *** 
> @  0x51ff3cc  google::LogMessage::Fail()
> @  0x5200cbc  google::LogMessage::SendToLog()
> @  0x51fed2a  google::LogMessage::Flush()
> @  0x5202928  google::LogMessageFatal::~LogMessageFatal()
> @  0x234ba18  impala::DmlExecState::AddPartition()
> @  0x2817786  impala::HdfsTableSink::GetOutputPartition()
> @  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
> @  0x28156c4  impala::HdfsTableSink::Send()
> @  0x23139dd  impala::FragmentInstanceState::ExecInternal()
> @  0x230fe10  impala::FragmentInstanceState::Exec()
> @  0x227bb79  impala::QueryState::ExecFInstance()
> @  0x2279f7b  
> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> @  0x227e2c2  
> _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x2137699  boost::function0<>::operator()()
> @  0x2715d7d  impala::Thread::SuperviseThread()
> @  0x271dd1a  boost::_bi::list5<>::operator()<>()
> @  0x271dc3e  boost::_bi::bind_t<>::operator()()
> @  0x271dbff  boost::detail::thread_data<>::run()
> @  0x3f05f01  thread_proxy
> @ 0x7fb18bebb6b9  start_thread
> @ 0x7fb188a474dc  clone {code}
> It seems the zorder sort node doesn't keep the rows sorted by partition keys.
> This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that
> its input is ordered by the partition key expressions, so a partition key
> was deleted and then inserted again into the
> {{partition_keys_to_output_partitions_}} map.
> {code:c++}
>   /// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs
>   /// files. The input must be ordered by the partition key expressions.
>   Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) WARN_UNUSED_RESULT;
> {code}
> The key was removed here:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
> when a new partition key was processed, and it was reinserted here:
> https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
> which triggered the DCHECK.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition

2020-10-12 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-10233:

Description: 
Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
on master branch (commit=b8a2b75).
{code:java}
F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
a6479cc4725101fd:b86db2a10003] Check failed: 
per_partition_status_.find(name) == per_partition_status_.end() 
*** Check failure stack trace: *** 
@  0x51ff3cc  google::LogMessage::Fail()
@  0x5200cbc  google::LogMessage::SendToLog()
@  0x51fed2a  google::LogMessage::Flush()
@  0x5202928  google::LogMessageFatal::~LogMessageFatal()
@  0x234ba18  impala::DmlExecState::AddPartition()
@  0x2817786  impala::HdfsTableSink::GetOutputPartition()
@  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
@  0x28156c4  impala::HdfsTableSink::Send()
@  0x23139dd  impala::FragmentInstanceState::ExecInternal()
@  0x230fe10  impala::FragmentInstanceState::Exec()
@  0x227bb79  impala::QueryState::ExecFInstance()
@  0x2279f7b  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
@  0x227e2c2  
_ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
@  0x2137699  boost::function0<>::operator()()
@  0x2715d7d  impala::Thread::SuperviseThread()
@  0x271dd1a  boost::_bi::list5<>::operator()<>()
@  0x271dc3e  boost::_bi::bind_t<>::operator()()
@  0x271dbff  boost::detail::thread_data<>::run()
@  0x3f05f01  thread_proxy
@ 0x7fb18bebb6b9  start_thread
@ 0x7fb188a474dc  clone {code}

It seems the zorder sort node doesn't keep the rows sorted by partition keys.
This violates the assumption of HdfsTableSink::WriteClusteredRowBatch() that
its input is ordered by the partition key expressions, so a partition key was
deleted and then inserted again into the {{partition_keys_to_output_partitions_}}
map.
{code:c++}
  /// Maps all rows in 'batch' to partitions and appends them to their temporary Hdfs
  /// files. The input must be ordered by the partition key expressions.
  Status WriteClusteredRowBatch(RuntimeState* state, RowBatch* batch) WARN_UNUSED_RESULT;
{code}
The key was removed here:
https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L334
when a new partition key was processed, and it was reinserted here:
https://github.com/apache/impala/blob/b8a2b754669eb7f8d164e8112e594ac413e436ef/be/src/exec/hdfs-table-sink.cc#L590
which triggered the DCHECK.

  was:
Hit the DCHECK when inserting to a parquet table. I'm on master branch 
(commit=b8a2b75).
{code:java}
F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
a6479cc4725101fd:b86db2a10003] Check failed: 
per_partition_status_.find(name) == per_partition_status_.end() 
*** Check failure stack trace: *** 
@  0x51ff3cc  google::LogMessage::Fail()
@  0x5200cbc  google::LogMessage::SendToLog()
@  0x51fed2a  google::LogMessage::Flush()
@  0x5202928  google::LogMessageFatal::~LogMessageFatal()
@  0x234ba18  impala::DmlExecState::AddPartition()
@  0x2817786  impala::HdfsTableSink::GetOutputPartition()
@  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
@  0x28156c4  impala::HdfsTableSink::Send()
@  0x23139dd  impala::FragmentInstanceState::ExecInternal()
@  0x230fe10  impala::FragmentInstanceState::Exec()
@  0x227bb79  impala::QueryState::ExecFInstance()
@  0x2279f7b  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
@  0x227e2c2  
_ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
@  0x2137699  boost::function0<>::operator()()
@  0x2715d7d  impala::Thread::SuperviseThread()
@  0x271dd1a  boost::_bi::list5<>::operator()<>()
@  0x271dc3e  boost::_bi::bind_t<>::operator()()
@  0x271dbff  boost::detail::thread_data<>::run()
@  0x3f05f01  thread_proxy
@ 0x7fb18bebb6b9  start_thread
@ 0x7fb188a474dc  clone {code}


> Hit DCHECK in DmlExecState::AddPartition
> 
>
> Key: IMPALA-10233
> URL: https://issues.apache.org/jira/browse/IMPALA-10233
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Priority: Major
>
> Hit the DCHECK when inserting to a partitioned parquet table with zorder. I'm 
> on master branch (commit=b8a2b75).
> {code:java}
> F1012 15:04:27.726274  

[jira] [Created] (IMPALA-10233) Hit DCHECK in DmlExecState::AddPartition

2020-10-12 Thread Quanlong Huang (Jira)
Quanlong Huang created IMPALA-10233:
---

 Summary: Hit DCHECK in DmlExecState::AddPartition
 Key: IMPALA-10233
 URL: https://issues.apache.org/jira/browse/IMPALA-10233
 Project: IMPALA
  Issue Type: Bug
Reporter: Quanlong Huang


Hit the DCHECK when inserting to a parquet table. I'm on master branch 
(commit=b8a2b75).
{code:java}
F1012 15:04:27.726274  3868 dml-exec-state.cc:432] 
a6479cc4725101fd:b86db2a10003] Check failed: 
per_partition_status_.find(name) == per_partition_status_.end() 
*** Check failure stack trace: *** 
@  0x51ff3cc  google::LogMessage::Fail()
@  0x5200cbc  google::LogMessage::SendToLog()
@  0x51fed2a  google::LogMessage::Flush()
@  0x5202928  google::LogMessageFatal::~LogMessageFatal()
@  0x234ba18  impala::DmlExecState::AddPartition()
@  0x2817786  impala::HdfsTableSink::GetOutputPartition()
@  0x2813151  impala::HdfsTableSink::WriteClusteredRowBatch()
@  0x28156c4  impala::HdfsTableSink::Send()
@  0x23139dd  impala::FragmentInstanceState::ExecInternal()
@  0x230fe10  impala::FragmentInstanceState::Exec()
@  0x227bb79  impala::QueryState::ExecFInstance()
@  0x2279f7b  _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
@  0x227e2c2  
_ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE
@  0x2137699  boost::function0<>::operator()()
@  0x2715d7d  impala::Thread::SuperviseThread()
@  0x271dd1a  boost::_bi::list5<>::operator()<>()
@  0x271dc3e  boost::_bi::bind_t<>::operator()()
@  0x271dbff  boost::detail::thread_data<>::run()
@  0x3f05f01  thread_proxy
@ 0x7fb18bebb6b9  start_thread
@ 0x7fb188a474dc  clone {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org