[jira] [Resolved] (IMPALA-9920) BufferPoolTest.WriteErrorBlacklistHolepunch failed on FindPageInDir check

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9920.
---
Resolution: Cannot Reproduce

I looped the test locally under UBSAN for a while and couldn't reproduce it.

> BufferPoolTest.WriteErrorBlacklistHolepunch failed on FindPageInDir check
> -
>
> Key: IMPALA-9920
> URL: https://issues.apache.org/jira/browse/IMPALA-9920
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
> Environment: BUILD_TAG
> jenkins-impala-cdpd-master-core-ubsan-108
>Reporter: Wenzhe Zhou
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: broken-build, build, test
> Attachments: buffer-pool-test.ERROR
>
>
> BufferPoolTest.WriteErrorBlacklistHolepunch failed with the following error 
> messages:
> Error Message
> Value of: FindPageInDir(pages[NO_ERROR_QUERY], good_dir) != NULL
>  Actual: false
> Expected: true
> Stacktrace
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/runtime/bufferpool/buffer-pool-test.cc:1763
> Value of: FindPageInDir(pages[NO_ERROR_QUERY], good_dir) != NULL
>  Actual: false
> Expected: true
>  
> Saw the following messages in the buffer-pool-test.ERROR log file:
>  F0702 14:17:08.745453 8976 reservation-tracker.cc:389] Check failed: bytes 
> <= unused_reservation() (1024 vs. 0) 
> F0702 14:17:14.581707 12727 reservation-tracker.cc:389] Check failed: bytes 
> <= unused_reservation() (1024 vs. 0) 
> F0702 14:17:14.671219 12728 reservation-tracker.cc:389] Check failed: bytes 
> <= unused_reservation() (1024 vs. 0) 
> F0702 14:17:14.840062 12940 buffer-pool.cc:216] Check failed: 
> page_handle->is_pinned() 
> F0702 14:17:15.167520 13459 buffer-pool.cc:493] Check failed: 
> spilling_enabled() 
> F0702 14:17:15.326829 13946 reservation-tracker.cc:389] Check failed: bytes 
> <= unused_reservation() (1024 vs. 0) 
> E0702 14:17:17.119957 16180 tmp-file-mgr.cc:334] Error for temporary file 
> '/tmp/impala-scratch/:_354de439-8418-4013-8ebe-55214f8396c5':
>  Disk I/O error on 
> impala-ec2-centos74-m5-4xlarge-ondemand-1884.vpc.cloudera.com:22000: open() 
> failed for 
> /tmp/impala-scratch/:_354de439-8418-4013-8ebe-55214f8396c5.
>  Access denied for the process' user errno=13
> E0702 14:17:17.270885 16570 tmp-file-mgr.cc:334] Error for temporary file 
> '/tmp/buffer-pool-test.0/impala-scratch/:_21e8f7c1-2a63-44d5-8a5c-4ba78e9acb6e':
>  Disk I/O error on 
> impala-ec2-centos74-m5-4xlarge-ondemand-1884.vpc.cloudera.com:22000: open() 
> failed for 
> /tmp/buffer-pool-test.0/impala-scratch/:_21e8f7c1-2a63-44d5-8a5c-4ba78e9acb6e.
>  Access denied for the process' user errno=13
> E0702 14:17:17.436445 16964 tmp-file-mgr.cc:334] Error for temporary file 
> '/tmp/buffer-pool-test.0/impala-scratch/:_78d2ef68-254a-4738-baf6-6e80b3b1ee2a':
>  Disk I/O error on 
> impala-ec2-centos74-m5-4xlarge-ondemand-1884.vpc.cloudera.com:22000: open() 
> failed for 
> /tmp/buffer-pool-test.0/impala-scratch/:_78d2ef68-254a-4738-baf6-6e80b3b1ee2a.
>  Access denied for the process' user errno=13



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-9851) Query status can be unbounded in size

2020-08-06 Thread Riza Suminto (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-9851.
--
Fix Version/s: Impala 4.0
   Resolution: Fixed

The fix has been merged.

> Query status can be unbounded in size
> -
>
> Key: IMPALA-9851
> URL: https://issues.apache.org/jira/browse/IMPALA-9851
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Riza Suminto
>Priority: Minor
>  Labels: newbie, ramp-up
> Fix For: Impala 4.0
>
>
> We got a report of a query status that was tens of MB in size. We should avoid 
> this. We should probably do two things: 1) not include the giant debug string 
> in this error message, and 2) cap the error message size somewhere to a sane 
> size, e.g. some number of kilobytes (not sure where the best place is to 
> implement this).
> {noformat}
> Cannot perform hash join at node with id 3. Repartitioning did not reduce the 
> size of a spilled partition. Repartitioning level 2. Number of rows 
> 106305207:nPartitionedHashJoinNode (id=3 op=5 state=RepartitioningBuild 
> #spilled_partitions=0)nPhjBuilder: Hash partitions: 16:n Hash partition 0 
> : ptr=0x1f395c020 Closedn Hash partition 1 : 
> ptr=0x4db0a6700 Closedn Hash partition 2 : ptr=0xd4145b20 Closedn 
> Hash partition 3 : ptr=0x494467860 Closedn Hash partition 4 
> : ptr=0xd4145c40 Closedn Hash partition 5 : 
> ptr=0x38efb3be0 Closedn Hash partition 6 : ptr=0x4db0a7bc0 Closedn 
> Hash partition 7 : ptr=0x14dcd2040 Closedn Hash partition 8 
> : ptr=0x14dcd3500 Closedn Hash partition 9 : 
> ptr=0x4febfe40 Closedn Hash partition 10 : ptr=0xd4145ec0 Spilledn 
>Build Rows: 106305207 (Bytes pinned: 0)nn Hash partition 11 : 
> ptr=0x7f34acfb8040 Closedn Hash partition 12 : ptr=0x14dcd3e00 
> Closedn Hash partition 13 : ptr=0x4944661e0 Closedn Hash partition 
> 14 : ptr=0xd4145740 Closedn Hash partition 15 : 
> ptr=0x7f1cc7566c00 ClosednProbe hash partitions: 0:nInputPartition: 
> 0x1dfe9a5f0n   Build Partition Closedn   Spilled Probe Rows: 
> 2608nn 0x12f09320 internal state: { 
> 0x7f30643f8780 name: HASH_JOIN_NODE id=3 ptr=0x12f09180 write_status:  
> buffers allocated 0 num_pages: 189834 pinned_bytes: 65536 
> dirty_unpinned_bytes: 12127043584 in_flight_write_bytes: 3145728 reservation: 
> {: reservation_limit 9223372036854775807 reservation 
> 12441812992 used_reservation 65536 child_reservations 65536 
> parent:n: reservation_limit 9223372036854775807 
> reservation 12479561728 used_reservation 0 child_reservations 12479561728 
> parent:n: reservation_limit 128 reservation 
> 12481593344 used_reservation 0 child_reservations 12481593344 
> parent:n: reservation_limit 273804165120 reservation 
> 12665946112 used_reservation 0 child_reservations 12665946112 parent:nNULL}n  
> 1 pinned pages:  0x7f348aab4140 len: 65536 pin_count: 1 
> buf:  0x7f348aab41b8 client: 
> 0x12f09320/0x7f30643f8780 data: 0x15e32000 len: 65536nn  185044 dirty 
> unpinned pages:  0xc676b020 len: 65536 pin_count: 0 buf: 
>  0xc676b098 client: 0x12f09320/0x7f30643f8780 data: 
> 0x2ed4a len: 65536n 0xc6769180 len: 65536 pin_count: 0 
> buf:  0xc67691f8 client: 0x12f09320/0x7f30643f8780 
> data: 0x2ed4b len: 65536n 0xc676a260 len: 65536 
> pin_count: 0 buf:  0xc676a2d8 client: 
> 0x12f09320/0x7f30643f8780 data: 0x2ed4c len: 65536n 
> 0xc676b660 len: 65536 pin_count: 0 buf:  0xc676b6d8 
> client: 0x12f09320/0x7f30643f8780 data: 0x2ed4d len: 
> 65536n 0xc67685a0 len: 65536 pin_count: 0 buf: 
>  0xc6768618 client: 0x12f09320/0x7f30643f8780 data: 
> 0x2ed4e len: 65536n 0xc676bd40 len: 65536 pin_count: 0 
> buf:  0xc676bdb8 client: 0x12f09320/0x7f30643f8780 
> data: 0x2ed4f len: 65536n 0xc676a580 len: 65536 
> pin_count: 0 buf:  0xc676a5f8 client: 
> 0x12f09320/0x7f30643f8780 data: 0x2ed50 len: 65536n 
> 0xc676a940 len: 65536 pin_count: 0 buf:  0xc676a9b8 
> client: 0x12f09320/0x7f30643f8780 data: 0x2ed526000 len: 
> 65536n 0xc676a6c0 len: 65536 pin_count: 0 buf: 
>  0xc676a738 client: 0x12f09320/0x7f30643f8780 data: 
> 0x2ed536000 len: 65536n 0xc67699a0 len: 65536 pin_count: 0 
> buf:  0xc6769a18 client: 0x12f09320/0x7f30643f8780 
> data: 0x2ed546000 len: 65536n 0x1652ee8c0 len: 65536 
> pin_count: 0 buf:  0x1652ee938 client: 
> 0x12f09320/0x7f30643f8780 data: 0x2ed556000 len: 65536n 
> 0x1652f0ee0 len: 65536 pin_count: 0 buf:  
> 0x1652f0f58 client: 0x12f09320/0x7f30643f8780 data: 0x2ed566000 len: 
> 65536n 0x1652ef0e0 len: 65536 pin_count: 0 buf: 
>  0x1652ef158 client: 0x12f09320/0x7f30643f8780 
> data: 0x2ed58a000 len: 65536n 0x1652f04e0 len: 65536 
> pin_count: 0 buf:  0x1652f0558 client: 
> 0x12f09320/0x7f30643f8780 data: 0x2ed59a000 len: 65536n 
> 0x1652f1660 len: 65536 pin_count: 0 buf:  
> 0x1652f16d8 client: 

[jira] [Resolved] (IMPALA-10044) bin/bootstrap_toolchain.py error handling can delete the toolchain directory

2020-08-06 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-10044.

 Fix Version/s: Impala 4.0
Target Version: Impala 4.0
  Assignee: Joe McDonnell
Resolution: Fixed

> bin/bootstrap_toolchain.py error handling can delete the toolchain directory
> 
>
> Key: IMPALA-10044
> URL: https://issues.apache.org/jira/browse/IMPALA-10044
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.0
>
>
> In bin/bootstrap_toolchain.py's DownloadUnpackTarball download() function, 
> the exception handler code will delete the download directory:
> {code:java}
> except:  # noqa
>   # Clean up any partially-unpacked result.
>   if os.path.isdir(unpack_dir):
> shutil.rmtree(unpack_dir)
>   if os.path.isdir(download_dir): # < wrong
> shutil.rmtree(download_dir)
>   raise
> {code}
> This is incorrect. It should only delete the download directory if the 
> download directory is a temporary directory. Otherwise, it would be deleting 
> the actual toolchain directory (and forcing a redownload of everything).
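>
> A minimal sketch of the intended fix (hypothetical names: {{is_temp_download_dir}} 
> stands in for however the script tracks whether {{download_dir}} was created as 
> a temporary directory):
> {code:python}
> import os
> import shutil
>
> def download(self):
>   try:
>     # wget_and_unpack_package() is the real helper in bootstrap_toolchain.py;
>     # it can raise partway through, leaving partial state behind.
>     self.wget_and_unpack_package()
>   except:  # noqa
>     # Clean up any partially-unpacked result.
>     if os.path.isdir(self.unpack_dir):
>       shutil.rmtree(self.unpack_dir)
>     # Only delete the download directory when it is a temporary scratch
>     # directory; never remove the persistent toolchain directory.
>     if self.is_temp_download_dir and os.path.isdir(self.download_dir):
>       shutil.rmtree(self.download_dir)
>     raise
> {code}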



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-10043) Keep all the logs when using EE_TEST_SHARDS > 1

2020-08-06 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell reassigned IMPALA-10043:
--

Assignee: Joe McDonnell

> Keep all the logs when using EE_TEST_SHARDS > 1
> ---
>
> Key: IMPALA-10043
> URL: https://issues.apache.org/jira/browse/IMPALA-10043
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Critical
>
> The fix for IMPALA-9887 speeds up ASAN builds by adding the ability to shard 
> EE tests and restart Impala between them. When EE_TEST_SHARDS is set, each 
> restart of Impala will generate new INFO, ERROR, WARNING glogs. 
> Unfortunately, max_log_files is set to 10 by default, so the older logs 
> will be deleted to make way for the new logs.
> We should change it to a higher value to keep all of the logs when using 
> EE_TEST_SHARDS. This is something that we already do for custom cluster tests.
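>
> A minimal sketch of the idea, using the max_log_files setting mentioned above 
> (the helper and its constants are hypothetical, not the actual test scripts):
> {code:python}
> import os
>
> DEFAULT_MAX_LOG_FILES = 10  # the default that currently rotates logs away
>
> def max_log_files_for_shards(num_shards, logs_per_restart=3):
>   # Each shard's Impala restart writes fresh INFO/WARNING/ERROR logs, so
>   # reserve enough slots to keep every shard's logs plus some headroom.
>   return max(DEFAULT_MAX_LOG_FILES, num_shards * logs_per_restart + 2)
>
> num_shards = int(os.environ.get("EE_TEST_SHARDS", "1"))
> print("--max_log_files=%d" % max_log_files_for_shards(num_shards))
> {code}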



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10044) bin/bootstrap_toolchain.py error handling can delete the toolchain directory

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172802#comment-17172802
 ] 

ASF subversion and git services commented on IMPALA-10044:
--

Commit bbec0443fcdabf5de6f7ae0e47595414503f30f0 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=bbec044 ]

IMPALA-10044: Fix cleanup for bootstrap_toolchain.py failure case

If DownloadUnpackTarball::download()'s wget_and_unpack_package call
hits an exception, the exception handler cleans up any created
directories. Currently, it erroneously cleans up the directory where
the tarballs are downloaded even when it is not a temporary directory.
This would delete the entire toolchain.

This fixes the cleanup to only delete that directory if it is a
temporary directory.

Testing:
 - Simulated exception from wget_and_unpack_package and verified
   behavior.

Change-Id: Ia57f56b6717635af94247fce50b955c07a57d113
Reviewed-on: http://gerrit.cloudera.org:8080/16294
Reviewed-by: Laszlo Gaal 
Tested-by: Impala Public Jenkins 


> bin/bootstrap_toolchain.py error handling can delete the toolchain directory
> 
>
> Key: IMPALA-10044
> URL: https://issues.apache.org/jira/browse/IMPALA-10044
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Priority: Major
>
> In bin/bootstrap_toolchain.py's DownloadUnpackTarball download() function, 
> the exception handler code will delete the download directory:
> {code:java}
> except:  # noqa
>   # Clean up any partially-unpacked result.
>   if os.path.isdir(unpack_dir):
> shutil.rmtree(unpack_dir)
>   if os.path.isdir(download_dir): # < wrong
> shutil.rmtree(download_dir)
>   raise
> {code}
> This is incorrect. It should only delete the download directory if the 
> download directory is a temporary directory. Otherwise, it would be deleting 
> the actual toolchain directory (and forcing a redownload of everything).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9851) Query status can be unbounded in size

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172800#comment-17172800
 ] 

ASF subversion and git services commented on IMPALA-9851:
-

Commit 86b70e9850cce0b45194a64cd89ae21df0e82029 in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=86b70e9 ]

IMPALA-9851: Truncate long error message.

Error message length was unbounded and could grow to a couple of MB in
size. This patch truncates error messages to a maximum of 128 KB.

This patch also fixes a potentially long error message related to
BufferPool::Client::DebugString(). Before this patch, DebugString()
printed all pages in the 'pinned_pages_', 'dirty_unpinned_pages_', and
'in_flight_write_pages_' PageLists. With this patch, DebugString()
includes at most the first 100 pages of each PageList.

Testing:
- Add be test BufferPoolTest.ShortDebugString
- Add test within ErrorMsg.GenericFormatting to test for truncation.
- Run and pass core tests.

Change-Id: Ic9fa4d024fb3dc9de03c7484f41b5e420a710e5a
Reviewed-on: http://gerrit.cloudera.org:8080/16300
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
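
A minimal sketch of the two caps described above (the constants come from this
patch; the Python function names are illustrative, not the actual C++
implementation):
{code:python}
MAX_ERROR_MSG_BYTES = 128 * 1024   # 128 KB cap on the whole error message
MAX_DEBUG_PAGES_PER_LIST = 100     # pages printed per PageList

def truncate_error_msg(msg):
  # Cap the message size; the truncation marker is illustrative.
  if len(msg) <= MAX_ERROR_MSG_BYTES:
    return msg
  return msg[:MAX_ERROR_MSG_BYTES] + "...(truncated)"

def debug_string(page_lists):
  # page_lists maps a PageList name to a list of per-page descriptions.
  lines = []
  for name, pages in page_lists.items():
    suffix = ("" if len(pages) <= MAX_DEBUG_PAGES_PER_LIST
              else " (showing first %d)" % MAX_DEBUG_PAGES_PER_LIST)
    lines.append("%d %s%s:" % (len(pages), name, suffix))
    lines.extend("  " + page for page in pages[:MAX_DEBUG_PAGES_PER_LIST])
  return "\n".join(lines)
{code}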


> Query status can be unbounded in size
> -
>
> Key: IMPALA-9851
> URL: https://issues.apache.org/jira/browse/IMPALA-9851
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Riza Suminto
>Priority: Minor
>  Labels: newbie, ramp-up
>
> We got a report of a query status that was tens of MB in size. We should avoid 
> this. We should probably do two things: 1) not include the giant debug string 
> in this error message, and 2) cap the error message size somewhere to a sane 
> size, e.g. some number of kilobytes (not sure where the best place is to 
> implement this).
> {noformat}
> Cannot perform hash join at node with id 3. Repartitioning did not reduce the 
> size of a spilled partition. Repartitioning level 2. Number of rows 
> 106305207:nPartitionedHashJoinNode (id=3 op=5 state=RepartitioningBuild 
> #spilled_partitions=0)nPhjBuilder: Hash partitions: 16:n Hash partition 0 
> : ptr=0x1f395c020 Closedn Hash partition 1 : 
> ptr=0x4db0a6700 Closedn Hash partition 2 : ptr=0xd4145b20 Closedn 
> Hash partition 3 : ptr=0x494467860 Closedn Hash partition 4 
> : ptr=0xd4145c40 Closedn Hash partition 5 : 
> ptr=0x38efb3be0 Closedn Hash partition 6 : ptr=0x4db0a7bc0 Closedn 
> Hash partition 7 : ptr=0x14dcd2040 Closedn Hash partition 8 
> : ptr=0x14dcd3500 Closedn Hash partition 9 : 
> ptr=0x4febfe40 Closedn Hash partition 10 : ptr=0xd4145ec0 Spilledn 
>Build Rows: 106305207 (Bytes pinned: 0)nn Hash partition 11 : 
> ptr=0x7f34acfb8040 Closedn Hash partition 12 : ptr=0x14dcd3e00 
> Closedn Hash partition 13 : ptr=0x4944661e0 Closedn Hash partition 
> 14 : ptr=0xd4145740 Closedn Hash partition 15 : 
> ptr=0x7f1cc7566c00 ClosednProbe hash partitions: 0:nInputPartition: 
> 0x1dfe9a5f0n   Build Partition Closedn   Spilled Probe Rows: 
> 2608nn 0x12f09320 internal state: { 
> 0x7f30643f8780 name: HASH_JOIN_NODE id=3 ptr=0x12f09180 write_status:  
> buffers allocated 0 num_pages: 189834 pinned_bytes: 65536 
> dirty_unpinned_bytes: 12127043584 in_flight_write_bytes: 3145728 reservation: 
> {: reservation_limit 9223372036854775807 reservation 
> 12441812992 used_reservation 65536 child_reservations 65536 
> parent:n: reservation_limit 9223372036854775807 
> reservation 12479561728 used_reservation 0 child_reservations 12479561728 
> parent:n: reservation_limit 128 reservation 
> 12481593344 used_reservation 0 child_reservations 12481593344 
> parent:n: reservation_limit 273804165120 reservation 
> 12665946112 used_reservation 0 child_reservations 12665946112 parent:nNULL}n  
> 1 pinned pages:  0x7f348aab4140 len: 65536 pin_count: 1 
> buf:  0x7f348aab41b8 client: 
> 0x12f09320/0x7f30643f8780 data: 0x15e32000 len: 65536nn  185044 dirty 
> unpinned pages:  0xc676b020 len: 65536 pin_count: 0 buf: 
>  0xc676b098 client: 0x12f09320/0x7f30643f8780 data: 
> 0x2ed4a len: 65536n 0xc6769180 len: 65536 pin_count: 0 
> buf:  0xc67691f8 client: 0x12f09320/0x7f30643f8780 
> data: 0x2ed4b len: 65536n 0xc676a260 len: 65536 
> pin_count: 0 buf:  0xc676a2d8 client: 
> 0x12f09320/0x7f30643f8780 data: 0x2ed4c len: 65536n 
> 0xc676b660 len: 65536 pin_count: 0 buf:  0xc676b6d8 
> client: 0x12f09320/0x7f30643f8780 data: 0x2ed4d len: 
> 65536n 0xc67685a0 len: 65536 pin_count: 0 buf: 
>  0xc6768618 client: 0x12f09320/0x7f30643f8780 data: 
> 0x2ed4e len: 65536n 0xc676bd40 len: 65536 pin_count: 0 
> buf:  0xc676bdb8 client: 0x12f09320/0x7f30643f8780 
> data: 0x2ed4f len: 65536n 0xc676a580 len: 65536 
> pin_count: 0 buf:  0xc676a5f8 client: 
> 0x12f09320/0x7f30643f8780 

[jira] [Commented] (IMPALA-10053) Remove uses of MonoTime::GetDeltaSince()

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172801#comment-17172801
 ] 

ASF subversion and git services commented on IMPALA-10053:
--

Commit 7a6469e44486191cd344e9f7dcf681763d6091db in impala's branch 
refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=7a6469e ]

IMPALA-10053: Remove uses of MonoTime::GetDeltaSince()

MonoTime is a utility Impala imports from Kudu. The behavior of
MonoTime::GetDeltaSince() was accidentally flipped in
https://gerrit.cloudera.org/#/c/14932/ so we're getting negative
durations where we expect positive durations.

The function is deprecated anyway, so this patch removes all uses of
it and replaces them with the MonoTime '-' operator.

Testing:
- Manually ran with and without patch and inspected calculated values.
- Added DCHECKs to prevent such an issue from occurring again.

Change-Id: If8cd3eb51a4fd101bbe4b9c44ea9be6ea2ea0d06
Reviewed-on: http://gerrit.cloudera.org:8080/16296
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Remove uses of MonoTime::GetDeltaSince()
> 
>
> Key: IMPALA-10053
> URL: https://issues.apache.org/jira/browse/IMPALA-10053
> Project: IMPALA
>  Issue Type: Task
>Affects Versions: Impala 4.0, Impala 3.4.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>
> MonoTime is a utility Impala imports from Kudu. The behavior of 
> MonoTime::GetDeltaSince() was accidentally flipped in 
> https://gerrit.cloudera.org/#/c/14932/ so we're getting negative durations 
> where we expect positive durations.
> The function is deprecated anyway, so we should remove all uses of it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-9767) ASAN crash during coordinator runtime filter updates

2020-08-06 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172760#comment-17172760
 ] 

Fang-Yu Rao edited comment on IMPALA-9767 at 8/7/20, 2:09 AM:
--

Hi [~tarmstrong], I am very sorry that I have not figured out the root cause of 
this broken build. According to the attached full console output for the broken 
ASAN build ([^consoleFull_asan_939.txt]), there are quite a few failed tests (at 
least around 100).

The first failed test is 
{{query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q1}},
which occurred at 23:34:50. After this failure, we started to see more failed 
tests and tests whose returned state was {{ERROR}}. I tried to re-run the tests 
that failed from 23:34:00 to 23:34:59 but was not able to reproduce the issue.

For easy reference, I also list below the failed tests that I re-ran locally 
within the interval mentioned above.
# 
{{query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q*}}
# {{query_test/test_date_queries.py::TestDateQueries::test_queries}}
# {{query_test/test_chars.py::TestStringQueries::test_chars_tmp_tables}}
# 
{{query_test/test_insert_parquet.py::TestInsertParquetQueries::test_insert_parquet}}
# 
{{query_test/test_aggregation.py::TestDistinctAggregation::test_multiple_distinct}}
# {{query_test/test_decimal_queries.py::TestDecimalQueries::test_queries}}
# 
{{query_test/test_aggregation.py::TestTPCHAggregationQueries::test_tpch_passthrough_aggregations}}

Unfortunately, I accidentally deleted the build artifacts of this specific ASAN 
build because of insufficient disk space on my machine. I am not sure whether 
[~stakiar] still has them.



was (Author: fangyurao):
Hi [~tarmstrong], very sorry that I have not figured out the root cause of this 
broken build. According to the attached full console output corresponding to 
the broken ASAN build ([^consoleFull_asan_939.txt]), there are quite a few 
failed tests (at least around 100).

The first failed test is 
{{query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q1}}
 that occurred at 23:34:50. Following this failed test, we started to see 
failed tests and tests with returned state being {{ERROR}}. I tried to 
reproduce those tests that failed from 23:34:00 to 23:34:59 but was not able to 
reproduce the issue.

For easy reference, in what follows I also list the failed tests I had re-run 
locally between the interval mentioned above.
# 
{{query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q*}}
# {{query_test/test_date_queries.py::TestDateQueries::test_queries}}
# {{query_test/test_chars.py::TestStringQueries::test_chars_tmp_tables}}
# 
{{query_test/test_insert_parquet.py::TestInsertParquetQueries::test_insert_parquet}}
# 
{{query_test/test_aggregation.py::TestDistinctAggregation::test_multiple_distinct}}
# {{query_test/test_decimal_queries.py::TestDecimalQueries::test_queries}}
# 
{{query_test/test_aggregation.py::TestTPCHAggregationQueries::test_tpch_passthrough_aggregations}}

Unfortunately I accidentally deleted the build artifacts of this specific ASAN 
build because of insufficient disk space on my machine. Not very sure if 
[~stakiar] still has it.


> ASAN crash during coordinator runtime filter updates
> 
>
> Key: IMPALA-9767
> URL: https://issues.apache.org/jira/browse/IMPALA-9767
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Fang-Yu Rao
>Priority: Major
>  Labels: asan, broken-build, crash
> Attachments: consoleFull_asan_939.txt
>
>
> ASAN crash output:
> {code:java}
> Error MessageAddress Sanitizer message detected in 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/ee_tests/impalad.ERRORStandard
>  Error==4808==ERROR: AddressSanitizer: heap-use-after-free on address 
> 0x7f6288cbe818 at pc 0x0199f6fe bp 0x7f63c1a8b270 sp 0x7f63c1a8aa20
> READ of size 1048576 at 0x7f6288cbe818 thread T73 (rpc reactor-552)
> #0 0x199f6fd in read_iovec(void*, __sanitizer::__sanitizer_iovec*, 
> unsigned long, unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
> #1 0x19a1f57 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, 
> long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
> #2 0x19a46c3 in __interceptor_sendmsg 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
> #3 0x372034d in kudu::Socket::Writev(iovec const*, int, long*) 
> 

[jira] [Commented] (IMPALA-10054) test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly

2020-08-06 Thread Riza Suminto (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172769#comment-17172769
 ] 

Riza Suminto commented on IMPALA-10054:
---

CR is here: [https://gerrit.cloudera.org/c/16301]

> test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly
> ---
>
> Key: IMPALA-10054
> URL: https://issues.apache.org/jira/browse/IMPALA-10054
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 4.0
>
>
> test_multiple_sort_run_bytes_limits, introduced in IMPALA-6692, seems to be 
> flaky.
> Jenkins job that triggered the error:
> https://jenkins.impala.io/job/parallel-all-tests-nightly/1173
> Failing job:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2899/testReport/
> {code}
> Stacktrace
> query_test/test_sort.py:89: in test_multiple_sort_run_bytes_limits
> assert "SpilledRuns: " + spilled_runs in query_result.runtime_profile
> E   assert ('SpilledRuns: ' + '3') in 'Query 
> (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil... 27.999ms\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> E+  where 'Query (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil... 27.999ms\n   
>   - WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - 
> WriteIoWaitTime: 0.000ns\n' = 
>  0x7f51da77fb50>.runtime_profile
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9767) ASAN crash during coordinator runtime filter updates

2020-08-06 Thread Fang-Yu Rao (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172760#comment-17172760
 ] 

Fang-Yu Rao commented on IMPALA-9767:
-

Hi [~tarmstrong], I am very sorry that I have not figured out the root cause of 
this broken build. According to the attached full console output for the broken 
ASAN build, there are quite a few failed tests (at least around 100).

The first failed test is 
{{query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q1}},
which occurred at 23:34:50. After this failure, we started to see more failed 
tests and tests whose returned state was {{ERROR}}. I tried to reproduce the 
tests that failed from 23:34:00 to 23:34:59 but was not able to reproduce the 
issue.

For easy reference, I also list below the failed tests that I re-ran locally 
within the interval mentioned above.
# 
{{query_test/test_mem_usage_scaling.py::TestTpchMemLimitError::test_low_mem_limit_q*}}
# {{query_test/test_date_queries.py::TestDateQueries::test_queries}}
# {{query_test/test_chars.py::TestStringQueries::test_chars_tmp_tables}}
# 
{{query_test/test_insert_parquet.py::TestInsertParquetQueries::test_insert_parquet}}
# 
{{query_test/test_aggregation.py::TestDistinctAggregation::test_multiple_distinct}}
# {{query_test/test_decimal_queries.py::TestDecimalQueries::test_queries}}
# 
{{query_test/test_aggregation.py::TestTPCHAggregationQueries::test_tpch_passthrough_aggregations}}

Unfortunately, I accidentally deleted the build artifacts of this specific ASAN 
build because of insufficient disk space on my machine. I am not sure whether 
[~stakiar] still has them.

[^consoleFull_asan_939.txt]

> ASAN crash during coordinator runtime filter updates
> 
>
> Key: IMPALA-9767
> URL: https://issues.apache.org/jira/browse/IMPALA-9767
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Fang-Yu Rao
>Priority: Major
>  Labels: asan, broken-build, crash
> Attachments: consoleFull_asan_939.txt
>
>
> ASAN crash output:
> {code:java}
> Error MessageAddress Sanitizer message detected in 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/ee_tests/impalad.ERRORStandard
>  Error==4808==ERROR: AddressSanitizer: heap-use-after-free on address 
> 0x7f6288cbe818 at pc 0x0199f6fe bp 0x7f63c1a8b270 sp 0x7f63c1a8aa20
> READ of size 1048576 at 0x7f6288cbe818 thread T73 (rpc reactor-552)
> #0 0x199f6fd in read_iovec(void*, __sanitizer::__sanitizer_iovec*, 
> unsigned long, unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
> #1 0x19a1f57 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, 
> long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
> #2 0x19a46c3 in __interceptor_sendmsg 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
> #3 0x372034d in kudu::Socket::Writev(iovec const*, int, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3
> #4 0x331c095 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26
> #5 0x3324da1 in kudu::rpc::Connection::WriteHandler(ev::io&, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31
> #6 0x52ca4e2 in ev_invoke_pending 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52ca4e2)
> #7 0x32aeadc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3
> #8 0x52cdb03 in ev_run 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52cdb03)
> #9 0x32aecd1 in kudu::rpc::ReactorThread::RunThread() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9
> #10 0x32c08db in boost::_bi::bind_t kudu::rpc::ReactorThread>, 
> boost::_bi::list1 > 
> >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
> #11 0x2148c26 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
> #12 0x2144b29 in kudu::Thread::SuperviseThread(void*) 
> 

[jira] [Updated] (IMPALA-9767) ASAN crash during coordinator runtime filter updates

2020-08-06 Thread Fang-Yu Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fang-Yu Rao updated IMPALA-9767:

Attachment: consoleFull_asan_939.txt

> ASAN crash during coordinator runtime filter updates
> 
>
> Key: IMPALA-9767
> URL: https://issues.apache.org/jira/browse/IMPALA-9767
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Fang-Yu Rao
>Priority: Major
>  Labels: asan, broken-build, crash
> Attachments: consoleFull_asan_939.txt
>
>
> ASAN crash output:
> {code:java}
> Error MessageAddress Sanitizer message detected in 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/ee_tests/impalad.ERRORStandard
>  Error==4808==ERROR: AddressSanitizer: heap-use-after-free on address 
> 0x7f6288cbe818 at pc 0x0199f6fe bp 0x7f63c1a8b270 sp 0x7f63c1a8aa20
> READ of size 1048576 at 0x7f6288cbe818 thread T73 (rpc reactor-552)
> #0 0x199f6fd in read_iovec(void*, __sanitizer::__sanitizer_iovec*, 
> unsigned long, unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
> #1 0x19a1f57 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, 
> long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
> #2 0x19a46c3 in __interceptor_sendmsg 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
> #3 0x372034d in kudu::Socket::Writev(iovec const*, int, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3
> #4 0x331c095 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26
> #5 0x3324da1 in kudu::rpc::Connection::WriteHandler(ev::io&, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31
> #6 0x52ca4e2 in ev_invoke_pending 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52ca4e2)
> #7 0x32aeadc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3
> #8 0x52cdb03 in ev_run 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52cdb03)
> #9 0x32aecd1 in kudu::rpc::ReactorThread::RunThread() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9
> #10 0x32c08db in boost::_bi::bind_t kudu::rpc::ReactorThread>, 
> boost::_bi::list1 > 
> >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
> #11 0x2148c26 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
> #12 0x2144b29 in kudu::Thread::SuperviseThread(void*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3
> #13 0x7f6c0bcf4e24 in start_thread (/lib64/libpthread.so.0+0x7e24)
> #14 0x7f6c0885834c in __clone (/lib64/libc.so.6+0xf834c)
> 0x7f6288cbe818 is located 24 bytes inside of 1052640-byte region 
> [0x7f6288cbe800,0x7f6288dbf7e0)
> freed by thread T114 here:
> #0 0x1a773e0 in operator delete(void*) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/asan_new_delete.cc:137
> #1 0x7f6c090faed3 in __gnu_cxx::new_allocator::deallocate(char*, 
> unsigned long) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:110
> #2 0x7f6c090faed3 in std::string::_Rep::_M_destroy(std::allocator 
> const&) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:449
> #3 0x7f6c090faed3 in std::string::_Rep::_M_dispose(std::allocator 
> const&) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:249
> #4 0x7f6c090faed3 in std::string::reserve(unsigned long) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:511
> #5 0x2781865 in 
> impala::ClientRequestState::UpdateFilter(impala::UpdateFilterParamsPB const&, 
> kudu::rpc::RpcContext*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/service/client-request-state.cc:1451:11
> #6 0x26d57d5 in 
> impala::ImpalaServer::UpdateFilter(impala::UpdateFilterResultPB*, 
> impala::UpdateFilterParamsPB 

[jira] [Resolved] (IMPALA-9503) Expose 'healthz' endpoint for statestored and catalogd

2020-08-06 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-9503.

Resolution: Duplicate

> Expose 'healthz' endpoint for statestored and catalogd
> --
>
> Key: IMPALA-9503
> URL: https://issues.apache.org/jira/browse/IMPALA-9503
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: Bikramjeet Vig
>Priority: Major
>
> IMPALA-8895 exposed these endpoints for impalads. It seems only coordinators 
> and executors expose the 'healthz' endpoint. It would be good to expose this 
> endpoint on statestored and catalogd as well.
>  
> {code:java}
> curl http://localhost:25010/healthz
> No URI handler for '/healthz'
> curl http://localhost:25020/healthz
> No URI handler for '/healthz'{code}
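>
> A minimal sketch of the desired behavior (generic Python for illustration 
> only; Impala's actual webserver is C++): a daemon registers a trivial 
> {{/healthz}} handler that answers 200 while the process is up.
> {code:python}
> from http.server import BaseHTTPRequestHandler, HTTPServer
>
> class HealthzHandler(BaseHTTPRequestHandler):
>   def do_GET(self):
>     if self.path == "/healthz":
>       body = b"OK"
>       self.send_response(200)
>       self.send_header("Content-Length", str(len(body)))
>       self.end_headers()
>       self.wfile.write(body)
>     else:
>       # Mirrors the error string seen above for unregistered URIs.
>       self.send_error(404, "No URI handler for '%s'" % self.path)
>
> HTTPServer(("localhost", 25010), HealthzHandler).serve_forever()
> {code}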
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-9503) Expose 'healthz' endpoint for statestored and catalogd

2020-08-06 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig reassigned IMPALA-9503:
--

Assignee: Bikramjeet Vig  (was: Alice Fan)

> Expose 'healthz' endpoint for statestored and catalogd
> --
>
> Key: IMPALA-9503
> URL: https://issues.apache.org/jira/browse/IMPALA-9503
> Project: IMPALA
>  Issue Type: Task
>Reporter: Abhishek Rawat
>Assignee: Bikramjeet Vig
>Priority: Major
>
> IMPALA-8895 exposed these endpoints for impalads. It seems only coordinators 
> and executors expose the 'healthz' endpoint. It would be good to expose this 
> endpoint on statestored and catalogd as well.
>  
> {code:java}
> curl http://localhost:25010/healthz
> No URI handler for '/healthz'
> curl http://localhost:25020/healthz
> No URI handler for '/healthz'{code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-10037) BytesRead check in TestMtDopScanNode.test_mt_dop_scan_node is flaky

2020-08-06 Thread Bikramjeet Vig (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikramjeet Vig resolved IMPALA-10037.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> BytesRead check in TestMtDopScanNode.test_mt_dop_scan_node is flaky
> ---
>
> Key: IMPALA-10037
> URL: https://issues.apache.org/jira/browse/IMPALA-10037
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Bikramjeet Vig
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 4.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10059) Add call to bin/jenkins/finalize.sh for dockerized tests

2020-08-06 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell updated IMPALA-10059:
---
Summary: Add call to bin/jenkins/finalize.sh for dockerized tests  (was: 
Add call to bin/jenkins/finalize.sh for upstream dockerized tests)

> Add call to bin/jenkins/finalize.sh for dockerized tests
> 
>
> Key: IMPALA-10059
> URL: https://issues.apache.org/jira/browse/IMPALA-10059
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Priority: Minor
>
> The build script used for the upstream dockerized tests (i.e. 
> bin/jenkins/dockerized-impala-run-tests.sh) currently does not call 
> finalize.sh. It would be useful to add a call to finalize.sh to symbolize 
> minidumps and generate JUnitXML for Impala crashes, etc. See the call to 
> finalize.sh in bin/jenkins/all-tests.sh for an example.
> The dockerized tests are part of precommit testing via jenkins.impala.io's 
> ubuntu-16.04-dockerised-tests job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10059) Add call to bin/jenkins/finalize.sh for upstream dockerized tests

2020-08-06 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172738#comment-17172738
 ] 

Joe McDonnell commented on IMPALA-10059:


There are some structural differences in the locations of log files for 
dockerized tests, so it may require further changes to support this.

> Add call to bin/jenkins/finalize.sh for upstream dockerized tests
> -
>
> Key: IMPALA-10059
> URL: https://issues.apache.org/jira/browse/IMPALA-10059
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Priority: Minor
>
> The build script used for the upstream dockerized tests (i.e. 
> bin/jenkins/dockerized-impala-run-tests.sh) currently does not call 
> finalize.sh. It would be useful to add a call to finalize.sh to symbolize 
> minidumps and generate JUnitXML for Impala crashes, etc. See the call to 
> finalize.sh in bin/jenkins/all-tests.sh for an example.
> The dockerized tests are part of precommit testing via jenkins.impala.io's 
> ubuntu-16.04-dockerised-tests job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10059) Add call to bin/jenkins/finalize.sh for upstream dockerized tests

2020-08-06 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-10059:
--

 Summary: Add call to bin/jenkins/finalize.sh for upstream 
dockerized tests
 Key: IMPALA-10059
 URL: https://issues.apache.org/jira/browse/IMPALA-10059
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 4.0
Reporter: Joe McDonnell


The build script used for the upstream dockerized tests (i.e. 
bin/jenkins/dockerized-impala-run-tests.sh) currently does not call 
finalize.sh. It would be useful to add a call to finalize.sh to symbolize 
minidumps and generate JUnitXML for Impala crashes, etc. See the call to 
finalize.sh in bin/jenkins/all-tests.sh for an example.

The dockerized tests are part of precommit testing via jenkins.impala.io's 
ubuntu-16.04-dockerised-tests job.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9767) ASAN crash during coordinator runtime filter updates

2020-08-06 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172731#comment-17172731
 ] 

Tim Armstrong commented on IMPALA-9767:
---

Did we ever figure out what was happening here? I took a look at the code and 
it seemed like this was probably possible if a runtime filter update came in 
and overwrote the bloom filter directory after it was published. I think that 
would only be possible in the case of an accounting bug with the number of 
expected filters, or a retry of sending the filter that resulted in the filter 
being received multiple times.
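
A generic sketch of the kind of guard this implies (illustrative names only; 
not Impala's actual runtime filter code): once the aggregated filter has been 
published, late or duplicate updates are dropped instead of applied.
{code:python}
import threading

class CoordinatorFilterState:
  def __init__(self, expected_updates, size=1024):
    self.lock = threading.Lock()
    self.pending = expected_updates   # filter updates still expected
    self.published = False            # set once the filter has been sent out
    self.directory = bytearray(size)  # aggregated bloom filter bits

  def apply_update(self, update_bits):
    with self.lock:
      if self.published:
        # A retried or miscounted update arriving after publication is
        # dropped; applying it would mutate memory readers may still hold.
        return False
      for i, b in enumerate(update_bits):
        self.directory[i] |= b        # OR the incoming filter in
      self.pending -= 1
      if self.pending == 0:
        self.published = True         # publish exactly once
      return True
{code}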

> ASAN crash during coordinator runtime filter updates
> 
>
> Key: IMPALA-9767
> URL: https://issues.apache.org/jira/browse/IMPALA-9767
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Fang-Yu Rao
>Priority: Major
>  Labels: asan, broken-build, crash
>
> ASAN crash output:
> {code:java}
> Error MessageAddress Sanitizer message detected in 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/ee_tests/impalad.ERRORStandard
>  Error==4808==ERROR: AddressSanitizer: heap-use-after-free on address 
> 0x7f6288cbe818 at pc 0x0199f6fe bp 0x7f63c1a8b270 sp 0x7f63c1a8aa20
> READ of size 1048576 at 0x7f6288cbe818 thread T73 (rpc reactor-552)
> #0 0x199f6fd in read_iovec(void*, __sanitizer::__sanitizer_iovec*, 
> unsigned long, unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
> #1 0x19a1f57 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, 
> long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
> #2 0x19a46c3 in __interceptor_sendmsg 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
> #3 0x372034d in kudu::Socket::Writev(iovec const*, int, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3
> #4 0x331c095 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26
> #5 0x3324da1 in kudu::rpc::Connection::WriteHandler(ev::io&, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31
> #6 0x52ca4e2 in ev_invoke_pending 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52ca4e2)
> #7 0x32aeadc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3
> #8 0x52cdb03 in ev_run 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52cdb03)
> #9 0x32aecd1 in kudu::rpc::ReactorThread::RunThread() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9
> #10 0x32c08db in boost::_bi::bind_t kudu::rpc::ReactorThread>, 
> boost::_bi::list1 > 
> >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
> #11 0x2148c26 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
> #12 0x2144b29 in kudu::Thread::SuperviseThread(void*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3
> #13 0x7f6c0bcf4e24 in start_thread (/lib64/libpthread.so.0+0x7e24)
> #14 0x7f6c0885834c in __clone (/lib64/libc.so.6+0xf834c)
> 0x7f6288cbe818 is located 24 bytes inside of 1052640-byte region 
> [0x7f6288cbe800,0x7f6288dbf7e0)
> freed by thread T114 here:
> #0 0x1a773e0 in operator delete(void*) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/asan_new_delete.cc:137
> #1 0x7f6c090faed3 in __gnu_cxx::new_allocator::deallocate(char*, 
> unsigned long) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:110
> #2 0x7f6c090faed3 in std::string::_Rep::_M_destroy(std::allocator 
> const&) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:449
> #3 0x7f6c090faed3 in std::string::_Rep::_M_dispose(std::allocator 
> const&) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:249
> #4 0x7f6c090faed3 in std::string::reserve(unsigned long) 
> 

[jira] [Updated] (IMPALA-9767) ASAN crash during coordinator runtime filter updates

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9767:
--
Issue Type: Bug  (was: Test)

> ASAN crash during coordinator runtime filter updates
> 
>
> Key: IMPALA-9767
> URL: https://issues.apache.org/jira/browse/IMPALA-9767
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Fang-Yu Rao
>Priority: Major
>  Labels: asan, broken-build, crash
>
> ASAN crash output:
> {code:java}
> Error MessageAddress Sanitizer message detected in 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/logs/ee_tests/impalad.ERRORStandard
>  Error==4808==ERROR: AddressSanitizer: heap-use-after-free on address 
> 0x7f6288cbe818 at pc 0x0199f6fe bp 0x7f63c1a8b270 sp 0x7f63c1a8aa20
> READ of size 1048576 at 0x7f6288cbe818 thread T73 (rpc reactor-552)
> #0 0x199f6fd in read_iovec(void*, __sanitizer::__sanitizer_iovec*, 
> unsigned long, unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
> #1 0x19a1f57 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, 
> long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
> #2 0x19a46c3 in __interceptor_sendmsg 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
> #3 0x372034d in kudu::Socket::Writev(iovec const*, int, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3
> #4 0x331c095 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26
> #5 0x3324da1 in kudu::rpc::Connection::WriteHandler(ev::io&, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31
> #6 0x52ca4e2 in ev_invoke_pending 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52ca4e2)
> #7 0x32aeadc in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3
> #8 0x52cdb03 in ev_run 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x52cdb03)
> #9 0x32aecd1 in kudu::rpc::ReactorThread::RunThread() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9
> #10 0x32c08db in boost::_bi::bind_t kudu::rpc::ReactorThread>, 
> boost::_bi::list1 > 
> >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
> #11 0x2148c26 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
> #12 0x2144b29 in kudu::Thread::SuperviseThread(void*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3
> #13 0x7f6c0bcf4e24 in start_thread (/lib64/libpthread.so.0+0x7e24)
> #14 0x7f6c0885834c in __clone (/lib64/libc.so.6+0xf834c)
> 0x7f6288cbe818 is located 24 bytes inside of 1052640-byte region 
> [0x7f6288cbe800,0x7f6288dbf7e0)
> freed by thread T114 here:
> #0 0x1a773e0 in operator delete(void*) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/asan_new_delete.cc:137
> #1 0x7f6c090faed3 in __gnu_cxx::new_allocator::deallocate(char*, 
> unsigned long) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:110
> #2 0x7f6c090faed3 in std::string::_Rep::_M_destroy(std::allocator 
> const&) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:449
> #3 0x7f6c090faed3 in std::string::_Rep::_M_dispose(std::allocator 
> const&) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.h:249
> #4 0x7f6c090faed3 in std::string::reserve(unsigned long) 
> /mnt/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:511
> #5 0x2781865 in 
> impala::ClientRequestState::UpdateFilter(impala::UpdateFilterParamsPB const&, 
> kudu::rpc::RpcContext*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/service/client-request-state.cc:1451:11
> #6 0x26d57d5 in 
> impala::ImpalaServer::UpdateFilter(impala::UpdateFilterResultPB*, 
> impala::UpdateFilterParamsPB const&, kudu::rpc::RpcContext*) 
> 

[jira] [Resolved] (IMPALA-7009) test_drop_table_with_purge fails on Isilon

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-7009.
---
Resolution: Cannot Reproduce

> test_drop_table_with_purge fails on Isilon
> --
>
> Key: IMPALA-7009
> URL: https://issues.apache.org/jira/browse/IMPALA-7009
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.13.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: broken-build, flaky
>
> We've seen multiple failures of test_drop_table_with_purge
> {code:java}
> metadata.test_ddl.TestDdlStatements.test_drop_table_with_purge (from pytest)
> Failing for the past 1 build (Since Failed#22 )
> Took 18 sec.
> add description
> Error Message
> metadata/test_ddl.py:72: in test_drop_table_with_purge assert not 
> self.filesystem_client.exists(\ E   assert not True E+  where True = 
>   0x5fe1210>>('user/jenkins/.Trash/Current/test-warehouse/test_drop_table_with_purge_58c75c18.db/t2')
>  E+where  > = 
> .exists E  
>   +  where  0x5fe1210> =  0x5fe1110>.filesystem_client E+and   
> 'user/jenkins/.Trash/Current/test-warehouse/test_drop_table_with_purge_58c75c18.db/t2'
>  = ('jenkins', 
> 'test_drop_table_with_purge_58c75c18') E+  where  format of str object at 0x3eba3f8> = 
> 'user/{0}/.Trash/Current/test-warehouse/{1}.db/t2'.format E+  and   
> 'jenkins' = () E+where  getuser at 0x1c08c80> = getpass.getuser
> Stacktrace
> metadata/test_ddl.py:72: in test_drop_table_with_purge
> assert not self.filesystem_client.exists(\
> E   assert not True
> E+  where True =   0x5fe1210>>('user/jenkins/.Trash/Current/test-warehouse/test_drop_table_with_purge_58c75c18.db/t2')
> E+where  > = 
> .exists
> E+  where  0x5fe1210> =  0x5fe1110>.filesystem_client
> E+and   
> 'user/jenkins/.Trash/Current/test-warehouse/test_drop_table_with_purge_58c75c18.db/t2'
>  = ('jenkins', 
> 'test_drop_table_with_purge_58c75c18')
> E+  where  = 
> 'user/{0}/.Trash/Current/test-warehouse/{1}.db/t2'.format
> E+  and   'jenkins' = ()
> E+where  = getpass.getuser
> Standard Error
> -- connecting to: localhost:21000
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_drop_table_with_purge_58c75c18` CASCADE;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_drop_table_with_purge_58c75c18`;
> MainThread: Created database "test_drop_table_with_purge_58c75c18" for test 
> ID "metadata/test_ddl.py::TestDdlStatements::()::test_drop_table_with_purge"
> -- executing against localhost:21000
> create table test_drop_table_with_purge_58c75c18.t1(i int);
> -- executing against localhost:21000
> create table test_drop_table_with_purge_58c75c18.t2(i int);
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> -- executing against localhost:21000
> drop table test_drop_table_with_purge_58c75c18.t1;
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> -- executing against localhost:21000
> drop table test_drop_table_with_purge_58c75c18.t2 purge;
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> MainThread: Starting new HTTP connection (1): 10.17.95.12
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-9185) TestBloomFilters.test_bloom_wait_time is flaky on S3

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9185.
---
Resolution: Cannot Reproduce

> TestBloomFilters.test_bloom_wait_time is flaky on S3
> 
>
> Key: IMPALA-9185
> URL: https://issues.apache.org/jira/browse/IMPALA-9185
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Priority: Major
>  Labels: broken-build, flaky
>
> The following test is flaky on S3, we should consider increasing the arrival 
> timeout; similar to what was done on ASAN builds in IMPALA-7104
> {code}
> query_test.test_runtime_filters.TestBloomFilters.test_bloom_wait_time[protocol:
>  beeswax | exec_option: \{'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> rc/snap/block] (from pytest)
> {code}
> Error Message
> {code}
> query_test/test_runtime_filters.py:135: in test_bloom_wait_time assert 
> duration_s < (WAIT_TIME_MS / 1000), \ E AssertionError: Query took too long 
> (200.51018095s, possibly waiting for missing filters?) E assert 
> 200.5101809501648 < (6 / 1000)
> {code}
> Stacktrace
> {code}
> query_test/test_runtime_filters.py:135: in test_bloom_wait_time assert 
> duration_s < (WAIT_TIME_MS / 1000), \ E AssertionError: Query took too long 
> (200.51018095s, possibly waiting for missing filters?) E assert 
> 200.5101809501648 < (6 / 1000)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-9190) CatalogdMetaProviderTest.testPiggybackFailure is flaky

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9190.
---
Resolution: Duplicate

> CatalogdMetaProviderTest.testPiggybackFailure is flaky
> --
>
> Key: IMPALA-9190
> URL: https://issues.apache.org/jira/browse/IMPALA-9190
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Priority: Major
>  Labels: broken-build, flaky
>
> The following test is flaky:
> org.apache.impala.catalog.local.CatalogdMetaProviderTest.testPiggybackFailure
> Error Message
> {code}
> Did not see enough piggybacked loads!
> Stacktrace
> java.lang.AssertionError: Did not see enough piggybacked loads!
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.impala.catalog.local.CatalogdMetaProviderTest.doTestPiggyback(CatalogdMetaProviderTest.java:314)
>   at 
> org.apache.impala.catalog.local.CatalogdMetaProviderTest.testPiggybackFailure(CatalogdMetaProviderTest.java:273)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:236)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:143)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-9184) TestImpalaShellInteractive.test_ddl_queries_are_closed is flaky

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-9184:
-

Assignee: Tim Armstrong

> TestImpalaShellInteractive.test_ddl_queries_are_closed is flaky
> ---
>
> Key: IMPALA-9184
> URL: https://issues.apache.org/jira/browse/IMPALA-9184
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: broken-build, flaky
>
> The following test is flaky:
> shell.test_shell_interactive.TestImpalaShellInteractive.test_ddl_queries_are_closed[table_format_and_file_extension:
>  ('textfile', '.txt') | protocol: beeswax] (from pytest)
> Error Message
> {code:java}
> AssertionError: drop query should be closed assert  ImpaladService.wait_for_num_in_flight_queries of 
> >(0) + where 
>  > = 
>  0x8a1fad0>.wait_for_num_in_flight_queries
> {code}
> Stacktrace
> {code:java}
> Impala/tests/shell/test_shell_interactive.py:338: in 
> test_ddl_queries_are_closed assert impalad.wait_for_num_in_flight_queries(0), 
> MSG % 'drop' E AssertionError: drop query should be closed E assert  method ImpaladService.wait_for_num_in_flight_queries of 
> >(0) E + 
> where  > = 
>  0x8a1fad0>.wait_for_num_in_flight_queries
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8764) Kudu data load failures due to "Clock considered unsynchronized"

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8764.
---
Resolution: Duplicate

> Kudu data load failures due to "Clock considered unsynchronized"
> 
>
> Key: IMPALA-8764
> URL: https://issues.apache.org/jira/browse/IMPALA-8764
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Sahil Takiar
>Priority: Major
>  Labels: broken-build
>
> Dataload error:
> {code}
> 03:08:38 03:08:38 Error executing impala SQL: 
> Impala/logs/data_loading/sql/functional/create-functional-query-exhaustive-impala-generated-kudu-none-none.sql
>  See: 
> Impala/logs/data_loading/sql/functional/create-functional-query-exhaustive-impala-generated-kudu-none-none.sql.log
> {code}
> Digging through the mini-cluster logs, I see that the Kudu tservers crashed 
> with this error:
> {code}
> F0715 02:58:43.202059   649 hybrid_clock.cc:339] Check failed: _s.ok() unable 
> to get current time with error bound: Service unavailable: could not read 
> system time source: Error reading clock. Clock considered unsynchronized
> *** Check failure stack trace: ***
> Wrote minidump to 
> Impala/testdata/cluster/cdh6/node-3/var/log/kudu/ts/minidumps/kudu-tserver/395e6bb9-9b2f-468e-4d37d898-74b96d61.dmp
> Wrote minidump to 
> Impala/testdata/cluster/cdh6/node-3/var/log/kudu/ts/minidumps/kudu-tserver/395e6bb9-9b2f-468e-4d37d898-74b96d61.dmp
> *** Aborted at 1563184723 (unix time) try "date -d @1563184723" if you are 
> using GNU date ***
> PC: @ 0x7ff75ed631f7 __GI_raise
> *** SIGABRT (@0x7d10232) received by PID 562 (TID 0x7ff756c1e700) from 
> PID 562; stack trace: ***
> @ 0x7ff760b545e0 (unknown)
> @ 0x7ff75ed631f7 __GI_raise
> @ 0x7ff75ed648e8 __GI_abort
> @  0x1fb7309 kudu::AbortFailureFunction()
> @   0x9c054d google::LogMessage::Fail()
> @   0x9c240d google::LogMessage::SendToLog()
> @   0x9c0089 google::LogMessage::Flush()
> @   0x9c2eaf google::LogMessageFatal::~LogMessageFatal()
> @   0xc0c60e kudu::clock::HybridClock::WalltimeWithErrorOrDie()
> @   0xc0c67e kudu::clock::HybridClock::NowWithError()
> @   0xc0d4aa kudu::clock::HybridClock::NowForMetrics()
> @   0x9a29c0 kudu::FunctionGauge<>::WriteValue()
> @  0x1fb0dc0 kudu::Gauge::WriteAsJson()
> @  0x1fb3212 kudu::MetricEntity::WriteAsJson()
> @  0x1fb390e kudu::MetricRegistry::WriteAsJson()
> @   0xa856a3 kudu::server::DiagnosticsLog::LogMetrics()
> @   0xa8789a kudu::server::DiagnosticsLog::RunThread()
> @  0x1ff44d7 kudu::Thread::SuperviseThread()
> @ 0x7ff760b4ce25 start_thread
> @ 0x7ff75ee2634d __clone
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-9491) Compilation failure in KuduUtil.java

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9491.
---
Resolution: Fixed

It seems like this is not occurring any more.

> Compilation failure in KuduUtil.java
> 
>
> Key: IMPALA-9491
> URL: https://issues.apache.org/jira/browse/IMPALA-9491
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: David Rorke
>Assignee: Csaba Ringhofer
>Priority: Blocker
>  Labels: broken-build
>
> Build is failing with the following:
> {noformat}
> 12:40:33 [INFO] BUILD FAILURE
> 12:40:33 [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.3:compile (default-compile) 
> on project impala-frontend: Compilation failure: Compilation failure:
> 12:40:33 [ERROR] 
> /data0/jenkins/workspace/impala-asf-master-core/repos/Impala/fe/src/main/java/org/apache/impala/util/KuduUtil.java:[181,12]
>  an enum switch case label must be the unqualified name of an enumeration 
> constant
> 12:40:33 [ERROR] 
> /data0/jenkins/workspace/impala-asf-master-core/repos/Impala/fe/src/main/java/org/apache/impala/util/KuduUtil.java:[183,12]
>  cannot find symbol
> 12:40:33 [ERROR] symbol:   method addDate(int,java.sql.Date)
> 12:40:33 [ERROR] location: variable key of type 
> org.apache.kudu.client.PartialRow
> 12:40:33 [ERROR] 
> /data0/jenkins/workspace/impala-asf-master-core/repos/Impala/fe/src/main/java/org/apache/impala/util/KuduUtil.java:[239,12]
>  an enum switch case label must be the unqualified name of an enumeration 
> constant
> 12:40:33 [ERROR] 
> /data0/jenkins/workspace/impala-asf-master-core/repos/Impala/fe/src/main/java/org/apache/impala/util/KuduUtil.java:[442,45]
>  cannot find symbol
> 12:40:33 [ERROR] symbol:   variable DATE
> 12:40:33 [ERROR] location: class org.apache.kudu.Type
> 12:40:33 [ERROR] 
> /data0/jenkins/workspace/impala-asf-master-core/repos/Impala/fe/src/main/java/org/apache/impala/util/KuduUtil.java:[468,12]
>  an enum switch case label must be the unqualified name of an enumeration 
> constant
> {noformat}
> Likely related to this change:  https://gerrit.cloudera.org/#/c/14705/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-10050) DCHECK was hit possibly while executing TestFailpoints::test_failpoints

2020-08-06 Thread Wenzhe Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172684#comment-17172684
 ] 

Wenzhe Zhou edited comment on IMPALA-10050 at 8/6/20, 11:18 PM:


*01:52:58* 
failure/test_failpoints.py::TestFailpoints::test_failpoints[protocol: beeswax | 
table_format: avro/snap/block | exec_option: 

{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0}

| mt_dop: 4 | location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 
from alltypessmall a join alltypessmall b on a.id = b.id] FAILED
h3. Error Message

ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:RPC from 
127.0.0.1:27000 to 127.0.0.1:27002 failed TransmitData() to 127.0.0.1:27002 
failed: Network error: Client connection negotiation failed: client connection 
to 127.0.0.1:27002: connect: Connection refused (error 111)
h3. Stacktrace

failure/test_failpoints.py:128: in test_failpoints self.execute_query(query, 
vector.get_value('exec_option')) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
 in wrapper return function(*args, **kwargs) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
 in execute_query return self.__execute_query(self.client, query, 
query_options) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
 in __execute_query return impalad_client.execute(query, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
 in execute return self.__beeswax_client.execute(sql_stmt, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
 in execute handle = self.__execute_query(query_string.strip(), user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
 in __execute_query self.wait_for_finished(handle) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
 in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + 
error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: E Query 
aborted:RPC from 127.0.0.1:27000 to 127.0.0.1:27002 failed E TransmitData() to 
127.0.0.1:27002 failed: Network error: Client connection negotiation failed: 
client connection to 127.0.0.1:27002: connect: Connection refused (error 111)

 

When backend_exec_state enters a terminated state, it can be FINISHED, CANCELED, 
or ERROR. If it is ERROR, "is_cancelled_" can be 0, so the DCHECK expectation in 
QueryState::MonitorFInstances() is not right. This appears to be an old bug, not 
caused by [https://gerrit.cloudera.org/#/c/16215/]. A sketch of the relaxed 
check is below.
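
For illustration, a minimal sketch of the relaxed check, with hypothetical names
(the real enum and flags live in Impala's QueryState code):
{code:cpp}
#include <cassert>

// Hypothetical sketch: a backend reaches a terminal state via FINISHED,
// CANCELED, or ERROR, so the monitor must not assume is_cancelled_ is set.
enum class BackendExecState { EXECUTING, FINISHED, CANCELED, ERROR };

bool IsTerminal(BackendExecState s) {
  return s == BackendExecState::FINISHED || s == BackendExecState::CANCELED
      || s == BackendExecState::ERROR;
}

void CheckTerminalState(BackendExecState state, bool is_cancelled) {
  if (!IsTerminal(state)) return;
  // Wrong: assert(is_cancelled) -- an ERROR terminal state can leave
  // is_cancelled == false. Only the CANCELED state implies the flag.
  assert(state != BackendExecState::CANCELED || is_cancelled);
}
{code}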


was (Author: wzhou):
*01:52:58* 
failure/test_failpoints.py::TestFailpoints::test_failpoints[protocol: beeswax | 
table_format: avro/snap/block | exec_option: \\{'batch_size': 0, 'num_nodes': 
0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
alltypessmall a join alltypessmall b on a.id = b.id] FAILED
h3. Error Message

ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:RPC from 
127.0.0.1:27000 to 127.0.0.1:27002 failed TransmitData() to 127.0.0.1:27002 
failed: Network error: Client connection negotiation failed: client connection 
to 127.0.0.1:27002: connect: Connection refused (error 111)
h3. Stacktrace

failure/test_failpoints.py:128: in test_failpoints self.execute_query(query, 
vector.get_value('exec_option')) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
 in wrapper return function(*args, **kwargs) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
 in execute_query return self.__execute_query(self.client, query, 
query_options) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
 in __execute_query return impalad_client.execute(query, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
 in execute return self.__beeswax_client.execute(sql_stmt, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
 in execute handle = self.__execute_query(query_string.strip(), user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
 in __execute_query self.wait_for_finished(handle) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
 in 

[jira] [Resolved] (IMPALA-9552) Data load failure due to HS2 connection failure

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9552.
---
Resolution: Cannot Reproduce

> Data load failure due to HS2 connection failure
> ---
>
> Key: IMPALA-9552
> URL: https://issues.apache.org/jira/browse/IMPALA-9552
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Sahil Takiar
>Priority: Major
>  Labels: broken-build, flakey
>
> A recent run of ubuntu-16.04-from-scratch: 
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/9921/ failed due to a 
> data load issue
> Data load failure: 
> load-functional-query-exhaustive-hive-generated-text-bzip-block.sql
> Log errors:
> {code}
> 20/03/25 15:37:17 [main]: ERROR jdbc.HiveConnection: Error opening session
> org.apache.thrift.transport.TTransportException: null
> at 
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>  ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) 
> ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:458) 
> ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433) 
> ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
>  ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) 
> ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) 
> ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) 
> ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
>  ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77) 
> ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:168)
>  ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.hive.service.rpc.thrift.TCLIService$Client.OpenSession(TCLIService.java:155)
>  ~[hive-exec-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:578) 
> [hive-jdbc-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:188) 
> [hive-jdbc-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107) 
> [hive-jdbc-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at java.sql.DriverManager.getConnection(DriverManager.java:664) 
> [?:1.8.0_242]
> at java.sql.DriverManager.getConnection(DriverManager.java:208) 
> [?:1.8.0_242]
> at 
> org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145)
>  [hive-beeline-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:209)
>  [hive-beeline-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at org.apache.hive.beeline.Commands.connect(Commands.java:1617) 
> [hive-beeline-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at org.apache.hive.beeline.Commands.connect(Commands.java:1512) 
> [hive-beeline-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_242]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_242]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_242]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_242]
> at 
> org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:56)
>  [hive-beeline-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at 
> org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1290) 
> [hive-beeline-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1329) 
> [hive-beeline-2.1.1-cdh6.x-SNAPSHOT.jar:2.1.1-cdh6.x-SNAPSHOT]
>

[jira] [Commented] (IMPALA-9535) Test for conversion from non-ACID to ACID fail on newer Hive

2020-08-06 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172718#comment-17172718
 ] 

Tim Armstrong commented on IMPALA-9535:
---

[~joemcdonnell] did this get resolved somehow?


> Test for conversion from non-ACID to ACID fail on newer Hive
> 
>
> Key: IMPALA-9535
> URL: https://issues.apache.org/jira/browse/IMPALA-9535
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
>
> When using a newer CDP GBN, Hive enforces a strict separation between 
> hive.metastore.warehouse.external.dir and hive.metastore.warehouse.dir. This 
> causes failures in tests that convert a table from non-ACID to ACID on the 
> USE_CDP_HIVE=true configuration:
> {noformat}
> query_test/test_acid.py:52: in test_acid_basic
> self.run_test_case('QueryTest/acid', vector, use_db=unique_database)
> common/impala_test_suite.py:659: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:610: in __exec_in_hive
> result = h.execute(query, user=user)
> common/impala_connection.py:334: in execute
> r = self.__fetch_results(handle, profile_format=profile_format)
> common/impala_connection.py:441: in __fetch_results
> cursor._wait_to_finish()
> /data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env/lib/python2.7/site-packages/impala/hiveserver2.py:412:
>  in _wait_to_finish
> raise OperationalError(resp.errorMessage)
> E   OperationalError: Error while compiling statement: FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.ddl.DDLTask. Unable to 
> alter table. A managed table's location needs to be under the hive warehouse 
> root 
> directory,table:upgraded_table,location:/test-warehouse/test_acid_basic_34c57c48.db/upgraded_table,Hive
>  warehouse:/test-warehouse/managed{noformat}
> This impacts the following tests:
> {noformat}
> query_test.test_acid.TestAcid.test_acid_basic
> query_test.test_acid.TestAcid.test_acid_compaction
> query_test.test_acid.TestAcid.test_acid_partitioned{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-9879) ASAN use-after-free with KRPC thread and Coordinator::FilterState::ApplyUpdate()

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-9879:
-

Assignee: Tim Armstrong

> ASAN use-after-free  with KRPC thread and 
> Coordinator::FilterState::ApplyUpdate()
> -
>
> Key: IMPALA-9879
> URL: https://issues.apache.org/jira/browse/IMPALA-9879
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build
>
> An ASAN core run failed with the following Impalad crash:
>  
> {noformat}
> ==4348==ERROR: AddressSanitizer: heap-use-after-free on address 
> 0x7fc144423800 at pc 0x01a50071 bp 0x7fc26d7daa40 sp 0x7fc26d7da1f0
> READ of size 1048576 at 0x7fc144423800 thread T81 (rpc reactor-464)
> #0 0x1a50070 in read_iovec(void*, __sanitizer::__sanitizer_iovec*, 
> unsigned long, unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:904
> #1 0x1a666d1 in read_msghdr(void*, __sanitizer::__sanitizer_msghdr*, 
> long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2781
> #2 0x1a68fb3 in __interceptor_sendmsg 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:2796
> #3 0x38074dc in kudu::Socket::Writev(iovec const*, int, long*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/net/socket.cc:447:3
> #4 0x3411fa5 in kudu::rpc::OutboundTransfer::SendBuffer(kudu::Socket&) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/transfer.cc:227:26
> #5 0x341aa60 in kudu::rpc::Connection::WriteHandler(ev::io&, int) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/connection.cc:802:31
> #6 0x55ef342 in ev_invoke_pending 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x55ef342)
> #7 0x33a4d8c in kudu::rpc::ReactorThread::InvokePendingCb(ev_loop*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:196:3
> #8 0x55f29ef in ev_run 
> (/data0/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/build/debug/service/impalad+0x55f29ef)
> #9 0x33a4f81 in kudu::rpc::ReactorThread::RunThread() 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/rpc/reactor.cc:497:9
> #10 0x33b66bb in boost::_bi::bind_t kudu::rpc::ReactorThread>, 
> boost::_bi::list1 > 
> >::operator()() 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/bind/bind.hpp:1222:16
> #11 0x21ba196 in boost::function0::operator()() const 
> /data/jenkins/workspace/impala-asf-master-core-asan/Impala-Toolchain/toolchain-packages-gcc7.5.0/boost-1.61.0-p2/include/boost/function/function_template.hpp:770:14
> #12 0x21b6089 in kudu::Thread::SuperviseThread(void*) 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/be/src/kudu/util/thread.cc:675:3
> #13 0x7fcabb86be24 in start_thread (/lib64/libpthread.so.0+0x7e24)
> #14 0x7fcab833f34c in __clone (/lib64/libc.so.6+0xf834c)
> 0x7fc144423800 is located 0 bytes inside of 1048577-byte region 
> [0x7fc144423800,0x7fc144523801)
> freed by thread T108 here:
> #0 0x1ad6050 in operator delete(void*) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/asan_new_delete.cc:137
> #1 0x7fcab8c425a9 in __gnu_cxx::new_allocator::deallocate(char*, 
> unsigned long) 
> /mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:125
> #2 0x7fcab8c425a9 in std::allocator_traits 
> >::deallocate(std::allocator&, char*, unsigned long) 
> /mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/alloc_traits.h:462
> #3 0x7fcab8c425a9 in std::__cxx11::basic_string std::char_traits, std::allocator >::_M_destroy(unsigned long) 
> /mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.h:226
> #4 0x7fcab8c425a9 in std::__cxx11::basic_string std::char_traits, std::allocator >::reserve(unsigned long) 
> /mnt/source/gcc/build-7.5.0/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/basic_string.tcc:302
> previously allocated by thread T116 here:
> #0 0x1ad52e0 in operator new(unsigned long) 
> /mnt/source/llvm/llvm-5.0.1.src-p2/projects/compiler-rt/lib/asan/asan_new_delete.cc:92
> #1 0x1ad9fce in void std::__cxx11::basic_string std::char_traits, std::allocator >::_M_construct 

[jira] [Resolved] (IMPALA-9981) The BE test of buffer-pool-test seems flaky

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9981.
---
Resolution: Duplicate

> The BE test of buffer-pool-test seems flaky
> ---
>
> Key: IMPALA-9981
> URL: https://issues.apache.org/jira/browse/IMPALA-9981
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Abhishek Rawat
>Priority: Critical
>  Labels: broken-build, flaky
> Attachments: buffer-pool-test.ERROR, buffer-pool-test.FATAL, 
> buffer-pool-test.INFO, buffer-pool-test.WARNING
>
>
> We observed that the BE test of 
> [buffer-pool-test|https://github.com/apache/impala/blame/master/be/src/runtime/bufferpool/buffer-pool-test.cc#L1764]
>  failed in a recent UBSAN build with the following error message.
> {code:java}
> 3:56:54 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/runtime/bufferpool/buffer-pool-test.cc:1764:
>  Failure
> 13:56:54 Value of: FindPageInDir(pages[NO_ERROR_QUERY], error_dir) != NULL
> 13:56:54   Actual: false
> 13:56:54 Expected: true
> {code}
> Maybe [~tarmstrong] could offer some insight into this issue. For easy 
> reference, the related log files are also attached. Thanks!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-10038) TestScannersFuzzing::()::test_fuzz_alltypes timed out after 2 hours (may be flaky)

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-10038:
--

Assignee: Tim Armstrong

> TestScannersFuzzing::()::test_fuzz_alltypes timed out after 2 hours (may be 
> flaky)
> --
>
> Key: IMPALA-10038
> URL: https://issues.apache.org/jira/browse/IMPALA-10038
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
> Environment: Centos 7.4, 16 vCPU, 64 GB RAM, data cache enabled
>Reporter: Laszlo Gaal
>Assignee: Tim Armstrong
>Priority: Blocker
>  Labels: broken-build
>
> This was seen on Centos 7., with the data cache enabled during and exhaustive 
> run
> Test step:
> {code}
> query_test.test_scanners_fuzz.TestScannersFuzzing.test_fuzz_alltypes[protocol:
>  beeswax | exec_option: {'debug_action': 
> '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0', 'abort_on_error': False, 
> 'mem_limit': '512m', 'num_nodes': 0} | table_format: avro/none]
> {code}
> Test backtrace:
> {code}
> query_test/test_scanners_fuzz.py:82: in test_fuzz_alltypes
> self.run_fuzz_test(vector, src_db, table_name, unique_database, 
> table_name)
> query_test/test_scanners_fuzz.py:238: in run_fuzz_test
> result = self.execute_query(query, query_options = query_options)
> common/impala_test_suite.py:811: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:843: in execute_query
> return self.__execute_query(self.client, query, query_options)
> common/impala_test_suite.py:909: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:205: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:365: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:389: in wait_for_finished
> time.sleep(0.05)
> E   Failed: Timeout >7200s
> {code}
> Captured stderr:
> {code}
> ~ Stack of  (139967888086784) 
> ~
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 277, in _perform_spawn
> reply.run()
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 213, in run
> self._result = func(*args, **kwargs)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 954, in _thread_receiver
> msg = Message.from_io(io)
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 418, in from_io
> header = io.read(9)  # type 1, channel 4, payload 4
>   File 
> "/data/jenkins/workspace/impala-asf-master-exhaustive-data-cache/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/execnet/gateway_base.py",
>  line 386, in read
> data = self._read(numbytes-len(buf))
> ERROR:impala_test_suite:Should not throw error when abort_on_error=0: 
> 'Timeout >7200s'
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-6984) Coordinator should cancel backends when returning EOS

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-6984:
-

Assignee: (was: Tim Armstrong)

> Coordinator should cancel backends when returning EOS
> -
>
> Key: IMPALA-6984
> URL: https://issues.apache.org/jira/browse/IMPALA-6984
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Daniel Hecht
>Priority: Major
>  Labels: query-lifecycle
> Fix For: Impala 4.0
>
>
> Currently, the Coordinator waits for backends rather than proactively 
> cancelling them in the case of hitting EOS. A tangled mess, related to how 
> {{Coordinator::ComputeQuerySummary()}} works, makes it tricky to proactively 
> cancel the backends: we can't update the summary until the profiles are no 
> longer changing (which also makes sense given that we want the exec summary 
> to be consistent with the final profile). But we currently tie together the 
> FIS status and the profile, and cancellation of 
> backends causes the FIS to return CANCELLED, which then means that the 
> remaining FIS on that backend won't produce a final profile.
> With the rework of the protocol for IMPALA-2990 we should make it possible to 
> sort this out such that a final profile can be requested regardless of how a 
> FIS ends execution.
> This also relates to IMPALA-5783.
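
For illustration, a minimal sketch of the intended flow once final profiles can
be requested independently of FIS status (hypothetical types; not Impala's
actual coordinator API):
{code:cpp}
#include <vector>

struct Backend {
  void Cancel() { /* send a cancel RPC to the backend (sketch) */ }
  void WaitForFinalProfile() { /* block until the final profile arrives (sketch) */ }
};

// Once EOS is hit, cancel eagerly instead of waiting for backends to drain,
// then still collect a final profile from every backend so the exec summary
// stays consistent with the final profiles.
void OnEos(std::vector<Backend>& backends) {
  for (Backend& b : backends) b.Cancel();
  for (Backend& b : backends) b.WaitForFinalProfile();
}
{code}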



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-10023) Intelligently evict in-memory partitions for higher-cardinality partitioned top-N

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-10023.

Resolution: Invalid

I'm going to do this in the first version of the patch.

> Intelligently evict in-memory partitions for higher-cardinality partitioned 
> top-N
> -
>
> Key: IMPALA-10023
> URL: https://issues.apache.org/jira/browse/IMPALA-10023
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Tim Armstrong
>Priority: Major
>
> The initial patch for IMPALA-9853 will have a placeholder policy for evicting 
> in-memory partitions. We could be smarter and try to return partitions that 
> are filtering input rows effectively.
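
For illustration, one possible heuristic, sketched with hypothetical names:
track per-partition input and retained row counts, and evict the in-memory
partition whose heap filters least effectively.
{code:cpp}
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch of a smarter eviction policy for the partitioned top-N.
struct PartitionStats {
  int64_t rows_in = 0;        // rows offered to this partition's heap
  int64_t rows_retained = 0;  // rows the heap kept (i.e. not filtered out)
  double PassRate() const {
    return rows_in == 0 ? 1.0 : static_cast<double>(rows_retained) / rows_in;
  }
};

// Returns the index of the partition to evict: the one with the highest pass
// rate, since its heap rejects the fewest rows and so saves the least work.
// Assumes `parts` is non-empty.
int PickEvictionVictim(const std::vector<PartitionStats>& parts) {
  auto victim = std::max_element(parts.begin(), parts.end(),
      [](const PartitionStats& a, const PartitionStats& b) {
        return a.PassRate() < b.PassRate();
      });
  return static_cast<int>(victim - parts.begin());
}
{code}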



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-10025) Avoid rebuilding in-memory heap during output phase of top-n

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-10025.

Resolution: Invalid

I'm going to do this in the first version of the patch.

> Avoid rebuilding in-memory heap during output phase of top-n
> 
>
> Key: IMPALA-10025
> URL: https://issues.apache.org/jira/browse/IMPALA-10025
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> In the patch for IMPALA-9853, we reuse some code in the output phase that 
> necessitated building the in-memory heap from the sorter's output. This has 
> some inherent overhead that gets worse for larger limits and/or partition 
> counts.
> It would be better to have the sorter do a full sort on partition/order by 
> columns and then apply the limit while streaming the results back from the 
> sorter. In combination with IMPALA-10023 this would let us gracefully degrade 
> to doing something closer to a regular sort and probably let us bump 
> ANALYTIC_PUSHDOWN_THRESHOLD.
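
For illustration, a minimal sketch of the streaming approach with hypothetical
types (the real code is Impala's top-n and sorter backend): after a full sort
on the partition and order-by columns, the limit is applied while rows stream
out, with no heap rebuild.
{code:cpp}
#include <cstdint>
#include <vector>

// Hypothetical row, already sorted by (partition_key, order_by_key).
struct Row { int64_t partition_key; int64_t order_by_key; };

// Keep at most `limit` rows per partition while streaming the sorted rows
// back; no in-memory heap is rebuilt during the output phase.
std::vector<Row> ApplyPartitionedLimit(const std::vector<Row>& sorted, int limit) {
  std::vector<Row> out;
  int64_t cur_partition = 0;
  int emitted = 0;
  bool first = true;
  for (const Row& r : sorted) {
    if (first || r.partition_key != cur_partition) {
      cur_partition = r.partition_key;  // new partition: reset the counter
      emitted = 0;
      first = false;
    }
    if (emitted < limit) {
      out.push_back(r);
      ++emitted;
    }  // rows beyond the limit are skipped until the next partition starts
  }
  return out;
}
{code}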



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-10050) DCHECK was hit possibly while executing TestFailpoints::test_failpoints

2020-08-06 Thread Wenzhe Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172684#comment-17172684
 ] 

Wenzhe Zhou edited comment on IMPALA-10050 at 8/6/20, 10:48 PM:


*01:52:58* 
failure/test_failpoints.py::TestFailpoints::test_failpoints[protocol: beeswax | 
table_format: avro/snap/block | exec_option: \\{'batch_size': 0, 'num_nodes': 
0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
alltypessmall a join alltypessmall b on a.id = b.id] FAILED
h3. Error Message

ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:RPC from 
127.0.0.1:27000 to 127.0.0.1:27002 failed TransmitData() to 127.0.0.1:27002 
failed: Network error: Client connection negotiation failed: client connection 
to 127.0.0.1:27002: connect: Connection refused (error 111)
h3. Stacktrace

failure/test_failpoints.py:128: in test_failpoints self.execute_query(query, 
vector.get_value('exec_option')) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
 in wrapper return function(*args, **kwargs) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
 in execute_query return self.__execute_query(self.client, query, 
query_options) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
 in __execute_query return impalad_client.execute(query, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
 in execute return self.__beeswax_client.execute(sql_stmt, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
 in execute handle = self.__execute_query(query_string.strip(), user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
 in __execute_query self.wait_for_finished(handle) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
 in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + 
error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: E Query 
aborted:RPC from 127.0.0.1:27000 to 127.0.0.1:27002 failed E TransmitData() to 
127.0.0.1:27002 failed: Network error: Client connection negotiation failed: 
client connection to 127.0.0.1:27002: connect: Connection refused (error 111)

 

When backend_exec_state enters a terminated state, it can be FINISHED, CANCELED, 
or ERROR. If it is ERROR, "is_cancelled_" can be 0. Looks like it's an old 
bug, not caused by [https://gerrit.cloudera.org/#/c/16215/].


was (Author: wzhou):
*01:52:58* 
failure/test_failpoints.py::TestFailpoints::test_failpoints[protocol: beeswax | 
table_format: avro/snap/block | exec_option: \{'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
alltypessmall a join alltypessmall b on a.id = b.id] FAILED
h3. Error Message

ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:RPC from 
127.0.0.1:27000 to 127.0.0.1:27002 failed TransmitData() to 127.0.0.1:27002 
failed: Network error: Client connection negotiation failed: client connection 
to 127.0.0.1:27002: connect: Connection refused (error 111)
h3. Stacktrace

failure/test_failpoints.py:128: in test_failpoints self.execute_query(query, 
vector.get_value('exec_option')) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
 in wrapper return function(*args, **kwargs) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
 in execute_query return self.__execute_query(self.client, query, 
query_options) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
 in __execute_query return impalad_client.execute(query, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
 in execute return self.__beeswax_client.execute(sql_stmt, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
 in execute handle = self.__execute_query(query_string.strip(), user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
 in __execute_query self.wait_for_finished(handle) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
 in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + 

[jira] [Created] (IMPALA-10058) Kudu queries hit error "Unable to deserialize scan token"

2020-08-06 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-10058:
--

 Summary: Kudu queries hit error "Unable to deserialize scan token"
 Key: IMPALA-10058
 URL: https://issues.apache.org/jira/browse/IMPALA-10058
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 4.0
Reporter: Joe McDonnell


I have seen a few test runs where a large number of Kudu tests fail with:
{noformat}
ImpalaBeeswaxException: ImpalaBeeswaxException:  Query aborted:Unable to 
deserialize scan token for node with id '0' for Kudu table 
'impala::functional_kudu.alltypestiny': Not found: the table does not exist: 
table_name: ""{noformat}
In the Impalad log, the errors looks like this:
{noformat}
I0804 18:35:19.075631 18788 status.cc:129] 1c4fdf1a9de8d577:32d91ed50002] 
Unable to deserialize scan token for node with id '1' for Kudu table 
'impala::functional_kudu.alltypessmall': Not found: the table does not exist: 
table_name: ""
@  0x1cabde1  impala::Status::Status()
@  0x28e97a9  impala::KuduScanner::OpenNextScanToken()
@  0x284eb33  impala::KuduScanNode::ProcessScanToken()
@  0x284f15e  impala::KuduScanNode::RunScannerThread()
@  0x284e351  
_ZZN6impala12KuduScanNode17ThreadAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
@  0x284fa48  
_ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12KuduScanNode17ThreadAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
@  0x2062c1d  boost::function0<>::operator()()
@  0x26867cf  impala::Thread::SuperviseThread()
@  0x268e76c  boost::_bi::list5<>::operator()<>()
@  0x268e690  boost::_bi::bind_t<>::operator()()
@  0x268e651  boost::detail::thread_data<>::run()
@  0x3e616a1  thread_proxy
@ 0x7fe872ef4e24  start_thread
@ 0x7fe86f9c934c  __clone{noformat}
This error would be coming from the Kudu client in this code in KuduScanner:
{noformat}
  kudu::client::KuduScanner* scanner;
  KUDU_RETURN_IF_ERROR(kudu::client::KuduScanToken::DeserializeIntoScanner(
      scan_node_->kudu_client(), scan_token, &scanner),
      BuildErrorString("Unable to deserialize scan token"));{noformat}
This has happened multiple times in the docker-based tests, but I have also 
seen it in a couple of jobs with the normal test runs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10050) DCHECK was hit possibly while executing TestFailpoints::test_failpoints

2020-08-06 Thread Wenzhe Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172684#comment-17172684
 ] 

Wenzhe Zhou commented on IMPALA-10050:
--

*01:52:58* 
failure/test_failpoints.py::TestFailpoints::test_failpoints[protocol: beeswax | 
table_format: avro/snap/block | exec_option: \{'batch_size': 0, 'num_nodes': 0, 
'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
alltypessmall a join alltypessmall b on a.id = b.id] FAILED
h3. Error Message

ImpalaBeeswaxException: ImpalaBeeswaxException: Query aborted:RPC from 
127.0.0.1:27000 to 127.0.0.1:27002 failed TransmitData() to 127.0.0.1:27002 
failed: Network error: Client connection negotiation failed: client connection 
to 127.0.0.1:27002: connect: Connection refused (error 111)
h3. Stacktrace

failure/test_failpoints.py:128: in test_failpoints self.execute_query(query, 
vector.get_value('exec_option')) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
 in wrapper return function(*args, **kwargs) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
 in execute_query return self.__execute_query(self.client, query, 
query_options) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
 in __execute_query return impalad_client.execute(query, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
 in execute return self.__beeswax_client.execute(sql_stmt, user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
 in execute handle = self.__execute_query(query_string.strip(), user=user) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
 in __execute_query self.wait_for_finished(handle) 
/data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
 in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + 
error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: E Query 
aborted:RPC from 127.0.0.1:27000 to 127.0.0.1:27002 failed E TransmitData() to 
127.0.0.1:27002 failed: Network error: Client connection negotiation failed: 
client connection to 127.0.0.1:27002: connect: Connection refused (error 111)

 

> DCHECK was hit possibly while executing TestFailpoints::test_failpoints
> ---
>
> Key: IMPALA-10050
> URL: https://issues.apache.org/jira/browse/IMPALA-10050
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Wenzhe Zhou
>Priority: Blocker
>  Labels: broken-build, crash, flaky
> Fix For: Impala 4.0
>
>
> A DCHECK was hit during  ASAN core e2e tests. Time-frame suggests that it 
> happened while executing TestFailpoints::test_failpoints e2e test.
> {code}
> 10:56:38  TestFailpoints.test_failpoints[protocol: beeswax | table_format: 
> avro/snap/block | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
> location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
> alltypessmall a join alltypessmall b on a.id = b.id] 
> 10:56:38 failure/test_failpoints.py:128: in test_failpoints
> 10:56:38 self.execute_query(query, vector.get_value('exec_option'))
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
>  in wrapper
> 10:56:38 return function(*args, **kwargs)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
>  in execute_query
> 10:56:38 return self.__execute_query(self.client, query, query_options)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
>  in __execute_query
> 10:56:38 return impalad_client.execute(query, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
>  in execute
> 10:56:38 return self.__beeswax_client.execute(sql_stmt, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
>  in execute
> 10:56:38 handle = self.__execute_query(query_string.strip(), user=user)
> 10:56:38 
> 

[jira] [Commented] (IMPALA-10039) Expr-test crash in ExprTest.LiteralExprs during core run

2020-08-06 Thread Wenzhe Zhou (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172663#comment-17172663
 ] 

Wenzhe Zhou commented on IMPALA-10039:
--

Thanks Joe for the script, which makes it easy to reproduce the issue.

The recent patch for [IMPALA-5746|http://issues.apache.org/jira/browse/IMPALA-5746] 
registers a callback for cluster membership updates; the callback cancels 
queries scheduled by failed coordinators. This callback was invoked while 
expr-test was running, and in some cases it caused QueryState::Cancel() to be 
called before the thread-unsafe QueryState::Init() had completed. 
QueryState::Cancel() then ran while instances_prepared_barrier_ was still 
nullptr, causing the crash.

There is also a deadlock: if QueryState::Cancel() is called right after 
QueryState::Init() returns an error with a non-null instances_prepared_barrier_, 
Cancel() waits on instances_prepared_barrier_ forever, since the fragment 
instances never execute and the barrier is never notified.

To fix this, QueryState::Cancel() should wait until QueryState::Init() has 
completed, and instances_prepared_barrier_ should be reset if Init() fails. We 
should also check whether the process is running BE/FE tests and only register 
the callback when it is not. A sketch of the synchronization is below.
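
A minimal sketch of the intended synchronization, with hypothetical names (the
real fix belongs in Impala's QueryState):
{code:cpp}
#include <condition_variable>
#include <mutex>

// Hypothetical sketch: Cancel() blocks until Init() has finished, successfully
// or not, so it never observes a half-initialized QueryState and never waits
// on a barrier that will not be notified.
class QueryStateSketch {
 public:
  bool Init() {
    bool ok = DoInit();  // may fail, leaving the barrier unusable
    {
      std::lock_guard<std::mutex> l(lock_);
      init_done_ = true;
      init_ok_ = ok;
    }
    init_cv_.notify_all();
    return ok;
  }

  void Cancel() {
    std::unique_lock<std::mutex> l(lock_);
    init_cv_.wait(l, [this] { return init_done_; });  // wait for Init()
    if (!init_ok_) return;  // nothing was started, so nothing to cancel
    // ... safe to use instances_prepared_barrier_ from here on ...
  }

 private:
  bool DoInit() { return true; }  // placeholder for the real initialization
  std::mutex lock_;
  std::condition_variable init_cv_;
  bool init_done_ = false;
  bool init_ok_ = false;
};
{code}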

 

> Expr-test crash in ExprTest.LiteralExprs during core run
> 
>
> Key: IMPALA-10039
> URL: https://issues.apache.org/jira/browse/IMPALA-10039
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Wenzhe Zhou
>Priority: Blocker
>  Labels: broken-build
>
> Expr-test crashed with a minidump during a core-mode run.
> The test log:
> {code}
>  22/123 Test  #22: expr-test ***Failed4.42 sec
> Turning perftools heap leak checking off
> seed = 1596358469
> Note: Google Test filter = Instantiations/ExprTest.*
> [==] Running 192 tests from 1 test case.
> [--] Global test environment set-up.
> [--] 192 tests from Instantiations/ExprTest
> 20/08/02 01:54:29 INFO util.JvmPauseMonitor: Starting JVM pause monitor
> Running without optimization passes.
> [ RUN  ] Instantiations/ExprTest.NullLiteral/0
> [   OK ] Instantiations/ExprTest.NullLiteral/0 (1 ms)
> [ RUN  ] Instantiations/ExprTest.NullLiteral/1
> [   OK ] Instantiations/ExprTest.NullLiteral/1 (1 ms)
> [ RUN  ] Instantiations/ExprTest.NullLiteral/2
> [   OK ] Instantiations/ExprTest.NullLiteral/2 (0 ms)
> [ RUN  ] Instantiations/ExprTest.LiteralConstruction/0
> [   OK ] Instantiations/ExprTest.LiteralConstruction/0 (4 ms)
> [ RUN  ] Instantiations/ExprTest.LiteralConstruction/1
> [   OK ] Instantiations/ExprTest.LiteralConstruction/1 (1 ms)
> [ RUN  ] Instantiations/ExprTest.LiteralConstruction/2
> [   OK ] Instantiations/ExprTest.LiteralConstruction/2 (2 ms)
> [ RUN  ] Instantiations/ExprTest.LiteralExprs/0
> Wrote minidump to 
> /data/jenkins/workspace/impala-asf-master-core/repos/Impala/logs/be_tests/minidumps/unifiedbetests/3c669d32-0e5a-42d6-ae70e79b-9f91038f.dmp
> Wrote minidump to 
> /data/jenkins/workspace/impala-asf-master-core/repos/Impala/logs/be_tests/minidumps/unifiedbetests/3c669d32-0e5a-42d6-ae70e79b-9f91038f.dmp
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x7f1e95c21c30, pid=4127, tid=0x7f1e3d322700
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_144-b01) (build 
> 1.8.0_144-b01)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.144-b01 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [libpthread.so.0+0x9c30]  pthread_mutex_lock+0x0
> #
> # Core dump written. Default location: 
> /data0/jenkins/workspace/impala-asf-master-core/repos/Impala/be/src/exprs/core
>  or core.4127
> #
> # An error report file with more information is saved as:
> # 
> /data/jenkins/workspace/impala-asf-master-core/repos/Impala/logs/hs_err_pid4127.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
> /data/jenkins/workspace/impala-asf-master-core/repos/Impala/be/build/debug//exprs/expr-test:
>  line 10:  4127 Aborted (core dumped) 
> ${IMPALA_HOME}/bin/run-jvm-binary.sh 
> ${IMPALA_HOME}/be/build/latest/service/unifiedbetests 
> --gtest_filter=${GTEST_FILTER} 
> --gtest_output=xml:${IMPALA_BE_TEST_LOGS_DIR}/${TEST_EXEC_NAME}.xml 
> -log_filename="${TEST_EXEC_NAME}" "$@"
> Traceback (most recent call last):
>   File 
> "/data/jenkins/workspace/impala-asf-master-core/repos/Impala/bin/junitxml_prune_notrun.py",
>  line 71, in 
> if __name__ == "__main__": main()
>   File 
> 

[jira] [Resolved] (IMPALA-10005) Impala can't read Snappy compressed text files on S3 or ABFS

2020-08-06 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-10005.

Fix Version/s: Impala 4.0
   Resolution: Fixed

> Impala can't read Snappy compressed text files on S3 or ABFS
> 
>
> Key: IMPALA-10005
> URL: https://issues.apache.org/jira/browse/IMPALA-10005
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
> Fix For: Impala 4.0
>
>
> When reading snappy compressed text from S3 or ABFS on a release build, it 
> fails to decompress:
>  
> {noformat}
> I0723 21:19:43.712909 229706 status.cc:128] Snappy: RawUncompress failed
> @   0xae26c9  impala::Status::Status()
> @  0x107635b  impala::SnappyDecompressor::ProcessBlock()
> @  0x11b1f2d  
> impala::HdfsTextScanner::FillByteBufferCompressedFile()
> @  0x11b23ef  impala::HdfsTextScanner::FillByteBuffer()
> @  0x11af96f  impala::HdfsTextScanner::FillByteBufferWrapper()
> @  0x11b096b  impala::HdfsTextScanner::ProcessRange()
> @  0x11b2b31  impala::HdfsTextScanner::GetNextInternal()
> @  0x118644b  impala::HdfsScanner::ProcessSplit()
> @  0x11774c2  impala::HdfsScanNode::ProcessSplit()
> @  0x1178805  impala::HdfsScanNode::ScannerThread()
> @  0x1100f31  impala::Thread::SuperviseThread()
> @  0x1101a79  boost::detail::thread_data<>::run()
> @  0x16a3449  thread_proxy
> @ 0x7fc522befe24  start_thread
> @ 0x7fc522919bac  __clone{noformat}
> When using a debug build, Impala hits the following DCHECK:
>  
>  
> {noformat}
> F0723 23:45:12.849973 249653 hdfs-text-scanner.cc:197] Check failed: 
> stream_->file_desc()->file_compression != THdfsCompression::SNAPPY FE should 
> have generated SNAPPY_BLOCKED instead.{noformat}
> That DCHECK explains why it would fail to decompress. It is using the wrong 
> THdfsCompression.
> I reproduced this on master in my dev env by changing 
> FileSystemUtil::supportsStorageIds() to always return false. This emulates the 
> behavior on object stores like S3 and ABFS.
>  
> {noformat}
>   /**
>* Returns true if the filesystem supports storage UUIDs in BlockLocation 
> calls.
>*/
>   public static boolean supportsStorageIds(FileSystem fs) {
> return false;
>   }{noformat}
> This is specific to Snappy and does not appear to apply to other compression 
> codecs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9963) Implement ds_kll_n() function

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172648#comment-17172648
 ] 

ASF subversion and git services commented on IMPALA-9963:
-

Commit 87aeb2ad78e2106f1d8df84d4d84975c7cde5b5a in impala's branch 
refs/heads/master from Gabor Kaszab
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=87aeb2a ]

IMPALA-9963: Implement ds_kll_n() function

This function receives a serialized Apache DataSketches KLL sketch
and returns how many input values were fed into this sketch.

Change-Id: I166e87a468e68e888ac15fca7429ac2552dbb781
Reviewed-on: http://gerrit.cloudera.org:8080/16259
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
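For a quick end-to-end check from Python, a minimal sketch using the impyla
client; the host, port, and table/column names below are assumptions:

{code:python}
# Minimal sketch: exercise ds_kll_n() over a sketched column via impyla.
# Host, port, and table/column names are assumptions for illustration.
from impala.dbapi import connect

conn = connect(host='localhost', port=21050)
cur = conn.cursor()
cur.execute(
    "select ds_kll_n(ds_kll_sketch(cast(int_col as float))) from table_name")
print(cur.fetchall())  # e.g. [(6,)]
cur.close()
conn.close()
{code}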


> Implement ds_kll_n() function
> -
>
> Key: IMPALA-9963
> URL: https://issues.apache.org/jira/browse/IMPALA-9963
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>
> ds_kll_n() receives a serialized Apache DataSketches KLL sketch and returns 
> how many values were fed into the sketch.
> Returns a bigint.
> Example:
> {code:java}
> select ds_kll_n(ds_kll_sketch(cast(int_col as float))) from table_name;
> +--+
> | _c0  |
> +--+
> | 6|
> +--+
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10005) Impala can't read Snappy compressed text files on S3 or ABFS

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172649#comment-17172649
 ] 

ASF subversion and git services commented on IMPALA-10005:
--

Commit dbbd40308a6d1cef77bfe45e016e775c918e0539 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=dbbd403 ]

IMPALA-10005: Fix Snappy decompression for non-block filesystems

Snappy-compressed text always uses THdfsCompression::SNAPPY_BLOCKED
type compression in the backend. However, for non-block filesystems,
the frontend is incorrectly passing THdfsCompression::SNAPPY instead.
On debug builds, this leads to a DCHECK when trying to read
Snappy-compressed text. On release builds, it fails to decompress
the data.

This fixes the frontend to always pass THdfsCompression::SNAPPY_BLOCKED
for Snappy-compressed text.

This reworks query_test/test_compressed_formats.py to provide better
coverage:
 - Changed the RC and Seq test cases to verify that the file extension
   doesn't matter. Added Avro to this case as well.
 - Fixed the text case to use appropriate extensions (fixing IMPALA-9004)
 - Changed the utility function so it doesn't use Hive. This allows it
   to be enabled on non-HDFS filesystems like S3.
 - Changed the test to use unique_database and allow parallel execution.
 - Changed the test to run in the core job, so it now has coverage on
   the usual S3 test configuration. It is reasonably quick (1-2 minutes)
   and runs in parallel.

Testing:
 - Exhaustive job
 - Core s3 job
 - Changed the frontend to force it to use the code for non-block
   filesystems (i.e. the TFileSplitGeneratorSpec code) and
   verified that it is now able to read Snappy-compressed text.

Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Reviewed-on: http://gerrit.cloudera.org:8080/16278
Tested-by: Impala Public Jenkins 
Reviewed-by: Sahil Takiar 
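The rule the fix enforces is small enough to state in code; below is a hedged
Python illustration (the real logic lives in Impala's Java frontend, and the
helper name here is hypothetical):

{code:python}
# Hypothetical illustration of the rule the fix enforces; the real code lives
# in Impala's Java frontend. The backend text scanner only understands the
# block format for Snappy, so plain SNAPPY must become SNAPPY_BLOCKED.
def backend_codec_for_text(codec):
    return 'SNAPPY_BLOCKED' if codec == 'SNAPPY' else codec

assert backend_codec_for_text('SNAPPY') == 'SNAPPY_BLOCKED'
assert backend_codec_for_text('GZIP') == 'GZIP'
{code}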


> Impala can't read Snappy compressed text files on S3 or ABFS
> 
>
> Key: IMPALA-10005
> URL: https://issues.apache.org/jira/browse/IMPALA-10005
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>
> When reading snappy compressed text from S3 or ABFS on a release build, it 
> fails to decompress:
>  
> {noformat}
> I0723 21:19:43.712909 229706 status.cc:128] Snappy: RawUncompress failed
> @   0xae26c9  impala::Status::Status()
> @  0x107635b  impala::SnappyDecompressor::ProcessBlock()
> @  0x11b1f2d  
> impala::HdfsTextScanner::FillByteBufferCompressedFile()
> @  0x11b23ef  impala::HdfsTextScanner::FillByteBuffer()
> @  0x11af96f  impala::HdfsTextScanner::FillByteBufferWrapper()
> @  0x11b096b  impala::HdfsTextScanner::ProcessRange()
> @  0x11b2b31  impala::HdfsTextScanner::GetNextInternal()
> @  0x118644b  impala::HdfsScanner::ProcessSplit()
> @  0x11774c2  impala::HdfsScanNode::ProcessSplit()
> @  0x1178805  impala::HdfsScanNode::ScannerThread()
> @  0x1100f31  impala::Thread::SuperviseThread()
> @  0x1101a79  boost::detail::thread_data<>::run()
> @  0x16a3449  thread_proxy
> @ 0x7fc522befe24  start_thread
> @ 0x7fc522919bac  __clone{noformat}
> When using a debug build, Impala hits the following DCHECK:
>  
>  
> {noformat}
> F0723 23:45:12.849973 249653 hdfs-text-scanner.cc:197] Check failed: 
> stream_->file_desc()->file_compression != THdfsCompression::SNAPPY FE should 
> have generated SNAPPY_BLOCKED instead.{noformat}
> That DCHECK explains why it would fail to decompress. It is using the wrong 
> THdfsCompression.
> I reproduced this on master in my dev env by changing 
> FileSystemUtil::supportsStorageIds() to always return false. This emulates the 
> behavior on object stores like S3 and ABFS.
>  
> {noformat}
>   /**
>* Returns true if the filesystem supports storage UUIDs in BlockLocation 
> calls.
>*/
>   public static boolean supportsStorageIds(FileSystem fs) {
> return false;
>   }{noformat}
> This is specific to Snappy and does not appear to apply to other compression 
> codecs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9004) TestCompressedFormats is broken for text files

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172650#comment-17172650
 ] 

ASF subversion and git services commented on IMPALA-9004:
-

Commit dbbd40308a6d1cef77bfe45e016e775c918e0539 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=dbbd403 ]

IMPALA-10005: Fix Snappy decompression for non-block filesystems

Snappy-compressed text always uses THdfsCompression::SNAPPY_BLOCKED
type compression in the backend. However, for non-block filesystems,
the frontend is incorrectly passing THdfsCompression::SNAPPY instead.
On debug builds, this leads to a DCHECK when trying to read
Snappy-compressed text. On release builds, it fails to decompress
the data.

This fixes the frontend to always pass THdfsCompression::SNAPPY_BLOCKED
for Snappy-compressed text.

This reworks query_test/test_compressed_formats.py to provide better
coverage:
 - Changed the RC and Seq test cases to verify that the file extension
   doesn't matter. Added Avro to this case as well.
 - Fixed the text case to use appropriate extensions (fixing IMPALA-9004)
 - Changed the utility function so it doesn't use Hive. This allows it
   to be enabled on non-HDFS filesystems like S3.
 - Changed the test to use unique_database and allow parallel execution.
 - Changed the test to run in the core job, so it now has coverage on
   the usual S3 test configuration. It is reasonably quick (1-2 minutes)
   and runs in parallel.

Testing:
 - Exhaustive job
 - Core s3 job
 - Changed the frontend to force it to use the code for non-block
   filesystems (i.e. the TFileSplitGeneratorSpec code) and
   verified that it is now able to read Snappy-compressed text.

Change-Id: I0879f2fc0bf75bb5c15cecb845ece46a901601ac
Reviewed-on: http://gerrit.cloudera.org:8080/16278
Tested-by: Impala Public Jenkins 
Reviewed-by: Sahil Takiar 


> TestCompressedFormats is broken for text files
> --
>
> Key: IMPALA-9004
> URL: https://issues.apache.org/jira/browse/IMPALA-9004
> Project: IMPALA
>  Issue Type: Test
>Reporter: Sahil Takiar
>Priority: Major
>
> While working on IMPALA-8950, we made a fix to {{TestCompressedFormats}} so 
> that it actually checks the exit status of the {{hdfs dfs -cp}} command, 
> turns out that this command has been silently failing whenever 
> {{test_compressed_formats}} runs with {{file_format}} = {{text}}.
> For some reason, data load writes compressed text files with their 
> corresponding file compression suffix, but for compressed seq/rc files, it 
> does not:
> {code:java}
> hdfs dfs -ls /test-warehouse/tinytable_seq_*
> Found 1 items
> -rwxr-xr-x   3 systest supergroup325 2019-08-22 14:32 
> /test-warehouse/tinytable_seq_bzip/00_0
> Found 1 items
> -rwxr-xr-x   3 systest supergroup215 2019-08-22 14:32 
> /test-warehouse/tinytable_seq_def/00_0
> Found 1 items
> -rwxr-xr-x   3 systest supergroup260 2019-08-22 14:32 
> /test-warehouse/tinytable_seq_gzip/00_0
> Found 1 items
> -rwxr-xr-x   3 systest supergroup301 2019-08-22 14:32 
> /test-warehouse/tinytable_seq_record_bzip/00_0
> Found 1 items
> -rwxr-xr-x   3 systest supergroup209 2019-08-22 14:32 
> /test-warehouse/tinytable_seq_record_def/00_0
> Found 1 items
> -rwxr-xr-x   3 systest supergroup242 2019-08-22 14:32 
> /test-warehouse/tinytable_seq_record_gzip/00_0
> Found 1 items
> -rwxr-xr-x   3 systest supergroup233 2019-08-22 14:32 
> /test-warehouse/tinytable_seq_record_snap/00_0
> Found 2 items
> -rwxr-xr-x   3 systest supergroup243 2019-08-22 14:32 
> /test-warehouse/tinytable_seq_snap/00_0
> hdfs dfs -ls /test-warehouse/tinytable_text_*
> Found 1 items
> -rwxr-xr-x   3 systest supergroup 59 2019-08-22 14:32 
> /test-warehouse/tinytable_text_bzip/00_0.bz2
> Found 1 items
> -rwxr-xr-x   3 systest supergroup 28 2019-08-22 14:32 
> /test-warehouse/tinytable_text_def/00_0.deflate
> Found 1 items
> -rwxr-xr-x   3 systest supergroup 40 2019-08-22 14:32 
> /test-warehouse/tinytable_text_gzip/00_0.gz
> Found 2 items
> -rwxr-xr-x   3 systest supergroup 87 2019-08-22 14:32 
> /test-warehouse/tinytable_text_lzo/00_0.lzo
> -rw-r--r--   3 systest supergroup  8 2019-08-22 14:42 
> /test-warehouse/tinytable_text_lzo/00_0.lzo.index
> Found 1 items
> -rwxr-xr-x   3 systest supergroup 41 2019-08-22 14:32 
> /test-warehouse/tinytable_text_snap/00_0.snappy{code}
> Not sure if that is by design or not, but it is causing the tests to fail for 
> all text files.
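For reference, the suffix convention visible in the listings above fits in a
few lines; a hedged Python sketch with a hypothetical helper:

{code:python}
# Hedged sketch of the naming convention shown above: compressed text files
# carry a codec suffix, while seq/rc files do not. Helper name is hypothetical.
TEXT_SUFFIXES = {
    'bzip': '.bz2',
    'def': '.deflate',
    'gzip': '.gz',
    'lzo': '.lzo',
    'snap': '.snappy',
}

def expected_path(table, file_format, codec):
    suffix = TEXT_SUFFIXES.get(codec, '') if file_format == 'text' else ''
    return '/test-warehouse/%s_%s_%s/00_0%s' % (table, file_format, codec,
                                                suffix)

print(expected_path('tinytable', 'text', 'gzip'))
# /test-warehouse/tinytable_text_gzip/00_0.gz
{code}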



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: 

[jira] [Created] (IMPALA-10057) Impala logs during docker-based FE_TEST are massive

2020-08-06 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-10057:
--

 Summary: Impala logs during docker-based FE_TEST are massive
 Key: IMPALA-10057
 URL: https://issues.apache.org/jira/browse/IMPALA-10057
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 4.0
Reporter: Joe McDonnell


For the docker-based tests, the Impala logs generated during the FE_TEST are 
huge:

 
{noformat}
$ du -c -h fe_test/ee_tests
4.0Kfe_test/ee_tests/minidumps/statestored
4.0Kfe_test/ee_tests/minidumps/impalad
4.0Kfe_test/ee_tests/minidumps/catalogd
16K fe_test/ee_tests/minidumps
352Kfe_test/ee_tests/profiles
81G fe_test/ee_tests
81G total{noformat}
Creating a tarball of these logs takes 10 minutes. The Impalad/catalogd logs 
are filled with this error over and over:
{noformat}
E0805 06:08:45.485440 11219 TransactionKeepalive.java:137] Unexpected exception 
thrown
Java exception follows:
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: 
at 
org.apache.impala.common.TransactionKeepalive$DaemonThread.run(TransactionKeepalive.java:114)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoClassDefFoundError: 
... 2 more{noformat}
Two interesting points:
 # The frontend tests are passing, so these errors in the impalad logs do not 
impact the tests.
 # These errors aren't happening in any of the other tests (ee tests, custom 
cluster tests, etc). These errors are not seen outside the docker-based tests.

A theory is that FE_TEST runs mvn to build and run the frontend tests. If 
there is some bad interaction between mvn and the docker filesystem when 
manipulating the ~/.m2 directory, that could cause problems. One thing to try 
may be to copy the .m2 directory to make sure it is in the top docker layer 
(similar to what we do with kudu wal files); a rough sketch follows.
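{code:python}
# Hedged sketch of the suggested experiment: rewrite ~/.m2 so its files live
# in the top docker layer. The move/copy/remove sequence is an assumption.
import os
import shutil

m2 = os.path.expanduser('~/.m2')
staging = m2 + '.staging'

if os.path.isdir(m2):
    shutil.move(m2, staging)      # set the lower-layer copy aside
    shutil.copytree(staging, m2)  # the fresh copy lands in the top layer
    shutil.rmtree(staging)        # drop the staged original
{code}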

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-10050) DCHECK was hit possibly while executing TestFailpoints::test_failpoints

2020-08-06 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-10050:
--

Assignee: Wenzhe Zhou  (was: Riza Suminto)

> DCHECK was hit possibly while executing TestFailpoints::test_failpoints
> ---
>
> Key: IMPALA-10050
> URL: https://issues.apache.org/jira/browse/IMPALA-10050
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Wenzhe Zhou
>Priority: Blocker
>  Labels: broken-build, crash, flaky
> Fix For: Impala 4.0
>
>
> A DCHECK was hit during ASAN core e2e tests. The time frame suggests that it 
> happened while executing the TestFailpoints::test_failpoints e2e test.
> {code}
> 10:56:38  TestFailpoints.test_failpoints[protocol: beeswax | table_format: 
> avro/snap/block | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
> location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
> alltypessmall a join alltypessmall b on a.id = b.id] 
> 10:56:38 failure/test_failpoints.py:128: in test_failpoints
> 10:56:38 self.execute_query(query, vector.get_value('exec_option'))
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
>  in wrapper
> 10:56:38 return function(*args, **kwargs)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
>  in execute_query
> 10:56:38 return self.__execute_query(self.client, query, query_options)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
>  in __execute_query
> 10:56:38 return impalad_client.execute(query, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
>  in execute
> 10:56:38 return self.__beeswax_client.execute(sql_stmt, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
>  in execute
> 10:56:38 handle = self.__execute_query(query_string.strip(), user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
>  in __execute_query
> 10:56:38 self.wait_for_finished(handle)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
>  in wait_for_finished
> 10:56:38 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 10:56:38 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 10:56:38 EQuery aborted:RPC from 127.0.0.1:27000 to 127.0.0.1:27002 failed
> 10:56:38 E   TransmitData() to 127.0.0.1:27002 failed: Network error: Client 
> connection negotiation failed: client connection to 127.0.0.1:27002: connect: 
> Connection refused (error 111)
> {code}
> Impalad log:
> {code}
> Log file created at: 2020/08/05 01:52:56
> Running on machine: 
> impala-ec2-centos74-r5-4xlarge-ondemand-017c.vpc.cloudera.com
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> F0805 01:52:56.979769 17313 query-state.cc:803] 
> 3941a3d92a71e242:15c963f3] Check failed: is_cancelled_.Load() == 1 (0 
> vs. 1) 
> {code}
> Stack trace
> {code}
> Thread 368 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x0004
> rsi = 0x43a1   rdi = 0x37e4
> rbp = 0x7efcd4c53080   rsp = 0x7efcd4c52d08
>  r8 = 0xr9 = 0x7efcd4c52b80
> r10 = 0x0008   r11 = 0x0206
> r12 = 0x093de7c0   r13 = 0x0086
> r14 = 0x093de7c4   r15 = 0x093d6de0
> rip = 0x7f05c9d231f7
> Found by: given as instruction pointer in context
>  1  impalad!google::LogMessage::Flush() + 0x1eb
> rbp = 0x7efcd4c53250   rsp = 0x7efcd4c53090
> rip = 0x05727e3b
> Found by: previous frame's frame pointer
>  2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> rbx = 0x7efcd4c532a0   rbp = 0x7efcd4c53310
> rsp = 0x7efcd4c53130   r12 = 0x0fe01a982628
> r13 = 0x61d000da0a6c   r14 = 0x7efcd4c53250
> r15 = 0x7efcd4c53270   rip = 0x0572ba39
> Found by: call frame info
>  3  impalad!impala::QueryState::MonitorFInstances() [query-state.cc : 803 + 
> 0x45]
> rbx = 0x7efcd4c532a0   rbp = 0x7efcd4c53310
> rsp = 0x7efcd4c53140   r12 

[jira] [Commented] (IMPALA-10050) DCHECK was hit possibly while executing TestFailpoints::test_failpoints

2020-08-06 Thread Thomas Tauber-Marshall (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172571#comment-17172571
 ] 

Thomas Tauber-Marshall commented on IMPALA-10050:
-

I suspect this may have been caused by https://gerrit.cloudera.org/#/c/16215/, 
so [~wzhou] might be the right person to take a look.

> DCHECK was hit possibly while executing TestFailpoints::test_failpoints
> ---
>
> Key: IMPALA-10050
> URL: https://issues.apache.org/jira/browse/IMPALA-10050
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, crash, flaky
> Fix For: Impala 4.0
>
>
> A DCHECK was hit during ASAN core e2e tests. The time frame suggests that it 
> happened while executing the TestFailpoints::test_failpoints e2e test.
> {code}
> 10:56:38  TestFailpoints.test_failpoints[protocol: beeswax | table_format: 
> avro/snap/block | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
> location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
> alltypessmall a join alltypessmall b on a.id = b.id] 
> 10:56:38 failure/test_failpoints.py:128: in test_failpoints
> 10:56:38 self.execute_query(query, vector.get_value('exec_option'))
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
>  in wrapper
> 10:56:38 return function(*args, **kwargs)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
>  in execute_query
> 10:56:38 return self.__execute_query(self.client, query, query_options)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
>  in __execute_query
> 10:56:38 return impalad_client.execute(query, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
>  in execute
> 10:56:38 return self.__beeswax_client.execute(sql_stmt, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
>  in execute
> 10:56:38 handle = self.__execute_query(query_string.strip(), user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
>  in __execute_query
> 10:56:38 self.wait_for_finished(handle)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
>  in wait_for_finished
> 10:56:38 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 10:56:38 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 10:56:38 EQuery aborted:RPC from 127.0.0.1:27000 to 127.0.0.1:27002 failed
> 10:56:38 E   TransmitData() to 127.0.0.1:27002 failed: Network error: Client 
> connection negotiation failed: client connection to 127.0.0.1:27002: connect: 
> Connection refused (error 111)
> {code}
> Impalad log:
> {code}
> Log file created at: 2020/08/05 01:52:56
> Running on machine: 
> impala-ec2-centos74-r5-4xlarge-ondemand-017c.vpc.cloudera.com
> Log line format: [IWEF]mmdd hh:mm:ss.uu threadid file:line] msg
> F0805 01:52:56.979769 17313 query-state.cc:803] 
> 3941a3d92a71e242:15c963f3] Check failed: is_cancelled_.Load() == 1 (0 
> vs. 1) 
> {code}
> Stack trace
> {code}
> Thread 368 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x0004
> rsi = 0x43a1   rdi = 0x37e4
> rbp = 0x7efcd4c53080   rsp = 0x7efcd4c52d08
>  r8 = 0xr9 = 0x7efcd4c52b80
> r10 = 0x0008   r11 = 0x0206
> r12 = 0x093de7c0   r13 = 0x0086
> r14 = 0x093de7c4   r15 = 0x093d6de0
> rip = 0x7f05c9d231f7
> Found by: given as instruction pointer in context
>  1  impalad!google::LogMessage::Flush() + 0x1eb
> rbp = 0x7efcd4c53250   rsp = 0x7efcd4c53090
> rip = 0x05727e3b
> Found by: previous frame's frame pointer
>  2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> rbx = 0x7efcd4c532a0   rbp = 0x7efcd4c53310
> rsp = 0x7efcd4c53130   r12 = 0x0fe01a982628
> r13 = 0x61d000da0a6c   r14 = 0x7efcd4c53250
> r15 = 0x7efcd4c53270   rip = 0x0572ba39
> Found by: call frame info
>  3  

[jira] [Created] (IMPALA-10056) Keep the HDFS / Kudu cluster logs for the docker-based tests

2020-08-06 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-10056:
--

 Summary: Keep the HDFS / Kudu cluster logs for the docker-based 
tests
 Key: IMPALA-10056
 URL: https://issues.apache.org/jira/browse/IMPALA-10056
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Affects Versions: Impala 4.0
Reporter: Joe McDonnell


The Impala test environment has symlinks from logs/cluster/cdh7-node-* to 
locations in testdata/cluster. When running the docker-based tests, the logs/ 
directory is preserved beyond the lifetime of the container. However, 
testdata/cluster is not preserved, so the symlinks are dangling and those logs 
are not currently preserved.

The HDFS and Kudu logs in logs/cluster/cdh7-node-* are very useful, so we 
should preserve them. One option is to copy them to the logs/ directory just 
before stopping the container.
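A minimal sketch of that option; the destination directory is an assumption,
and shutil.copytree dereferences the symlinks by default:

{code:python}
# Hedged sketch: follow the logs/cluster/cdh7-node-* symlinks and copy the
# underlying HDFS/Kudu logs under logs/, which outlives the container.
import glob
import os
import shutil

for link in glob.glob('logs/cluster/cdh7-node-*'):
    dest = os.path.join('logs/cluster_preserved', os.path.basename(link))
    # symlinks=False (the default) copies the files the links point at.
    shutil.copytree(link, dest, symlinks=False)
{code}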



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6984) Coordinator should cancel backends when returning EOS

2020-08-06 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172559#comment-17172559
 ] 

Tim Armstrong commented on IMPALA-6984:
---

I think we need to avoid the self-RPCs at a minimum (IMPALA-5119). We might 
also want to parallelise the cancel RPCs.
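On the second point, a toy Python illustration of the fan-out idea (not
Impala's C++ code; cancel_backend() stands in for the real RPC):

{code:python}
# Toy illustration of parallelising per-backend cancel RPCs; not Impala code.
# cancel_backend() stands in for the real CancelQueryFInstances RPC.
from concurrent.futures import ThreadPoolExecutor

def cancel_backend(addr):
    print('cancelling %s' % addr)

backends = ['127.0.0.1:27000', '127.0.0.1:27001', '127.0.0.1:27002']
# Serial cancels pay the worst-case RPC timeout once per backend; a fan-out
# pays it roughly once overall.
with ThreadPoolExecutor(max_workers=len(backends)) as pool:
    list(pool.map(cancel_backend, backends))
{code}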

> Coordinator should cancel backends when returning EOS
> -
>
> Key: IMPALA-6984
> URL: https://issues.apache.org/jira/browse/IMPALA-6984
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Daniel Hecht
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: query-lifecycle
> Fix For: Impala 4.0
>
>
> Currently, the Coordinator waits for backends rather than proactively 
> cancelling them in the case of hitting EOS. There's a tangled mess that makes 
> it tricky to proactively cancel the backends related to how 
> {{Coordinator::ComputeQuerySummary()}} works – we can't update the summary 
> until the profiles are no longer changing (which also makes sense given that 
> we want the exec summary to be consistent with the final profile).  But we 
> currently tie together the FIS status and the profile, and cancellation of 
> backends causes the FIS to return CANCELLED, which then means that the 
> remaining FIS on that backend won't produce a final profile.
> With the rework of the protocol for IMPALA-2990 we should make it possible to 
> sort this out such that a final profile can be requested regardless of how a 
> FIS ends execution.
> This also relates to IMPALA-5783.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9985) CentOS 8 builds break with __glibc_has_include ("__linux__/stat.h")

2020-08-06 Thread Laszlo Gaal (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172543#comment-17172543
 ] 

Laszlo Gaal commented on IMPALA-9985:
-

The two Red Hat tickets state that the glibc bug causing both problems was 
fixed and a new glibc release was shipped. Checking the glibc versions 
released in CentOS 8.1 and the recent 8.2 release shows that glibc was indeed 
bumped from glibc-2.28-72.el8 (in 8.1) to glibc-2.28-101.el8 (in 8.2).
Rebuilding the toolchain on CentOS 8.2 (thus using glibc-2.28-101 during gcc 
compilation) fixes the build break.
Interestingly, no other fixes were necessary: even the manual removal of the 
file "lib/gcc/x86_64-pc-linux-gnu/7.5.0/include-fixed/bits/statx.h" (which was 
still generated) was unnecessary when using CentOS 8.2 as the build and test 
platform.
This means the eventual fix could be delegated to the toolchain, with Impala 
then picking up the new toolchain build. This would of course raise the 
minimum CentOS 8 version required for Impala to CentOS/Red Hat 8.2 (and 
derivatives).
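A small sketch of the version gate this implies, assuming an RPM-based host:

{code:python}
# Hedged sketch: check whether the host glibc is at least the fixed CentOS 8.2
# build (glibc-2.28-101.el8). Assumes an RPM-based system; parsing simplified.
import subprocess

pkg = subprocess.check_output(['rpm', '-q', 'glibc']).decode().splitlines()[0]
# e.g. 'glibc-2.28-101.el8.x86_64' -> release number 101
release = int(pkg.split('-')[2].split('.')[0])
print('%s: %s' % (pkg, 'ok' if release >= 101 else 'needs CentOS 8.2 glibc'))
{code}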

> CentOS 8 builds break with __glibc_has_include ("__linux__/stat.h")
> ---
>
> Key: IMPALA-9985
> URL: https://issues.apache.org/jira/browse/IMPALA-9985
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Laszlo Gaal
>Assignee: Laszlo Gaal
>Priority: Blocker
>
> Currently Docker-based builds are running; they are breaking early in the 
> build, during virtualenv construction, when the Python bitarray module is 
> compiled:
> {code}
> 2020-07-21 07:44:32.913375 Complete output from command 
> /home/impdev/Impala/bin/../infra/python/env-gcc7.5.0/bin/python -c "import 
> setuptools, 
> tokenize;__file__='/tmp/pip-build-NK6_23/bitarray/setup.py';exec(compile(getattr(tokenize,
>  'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" 
> install --record /tmp/pip-L3NjnK-record/install-record.txt 
> --single-version-externally-managed --compile --install-headers 
> /home/impdev/Impala/bin/../infra/python/env-gcc7.5.0/include/site/python2.7/bitarray:
> 2020-07-21 07:44:32.913393 running install
> 2020-07-21 07:44:32.913411 running build
> 2020-07-21 07:44:32.913430 running build_py
> 2020-07-21 07:44:32.913447 creating build
> 2020-07-21 07:44:32.913476 creating build/lib.linux-x86_64-2.7
> 2020-07-21 07:44:32.913510 creating build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913553 copying bitarray/util.py -> 
> build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913599 copying bitarray/test_util.py -> 
> build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913645 copying bitarray/__init__.py -> 
> build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913695 copying bitarray/test_bitarray.py -> 
> build/lib.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.913715 running build_ext
> 2020-07-21 07:44:32.913746 building 'bitarray._bitarray' extension
> 2020-07-21 07:44:32.913775 creating build/temp.linux-x86_64-2.7
> 2020-07-21 07:44:32.913809 creating build/temp.linux-x86_64-2.7/bitarray
> 2020-07-21 07:44:32.914022 ccache 
> /home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/bin/gcc 
> -fno-strict-aliasing -I/usr/include/ncurses 
> -I/mnt/build/bzip2-1.0.6-p2/include -DNDEBUG -g -fwrapv -O3 -Wall 
> -Wstrict-prototypes -fPIC 
> -I/home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/include/python2.7
>  -c bitarray/_bitarray.c -o build/temp.linux-x86_64-2.7/bitarray/_bitarray.o
> 2020-07-21 07:44:32.914059 In file included from 
> /usr/include/sys/stat.h:446:0,
> 2020-07-21 07:44:32.914135  from 
> /home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/include/python2.7/pyport.h:390,
> 2020-07-21 07:44:32.914210  from 
> /home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/include/python2.7/Python.h:61,
> 2020-07-21 07:44:32.914244  from bitarray/_bitarray.c:12:
> 2020-07-21 07:44:32.914350 
> /home/impdev/Impala/toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib/gcc/x86_64-pc-linux-gnu/7.5.0/include-fixed/bits/statx.h:38:25:
>  error: missing binary operator before token "("
> 2020-07-21 07:44:32.914384  #if __glibc_has_include ("__linux__/stat.h")
> 2020-07-21 07:44:32.914408  ^
> 2020-07-21 07:44:32.91 error: command 'ccache' failed with exit 
> status 1
> 20{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, 

[jira] [Resolved] (IMPALA-10047) Performance regression on short queries due to IMPALA-6984 fix

2020-08-06 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-10047.

Fix Version/s: Impala 4.0
   Resolution: Fixed

The core part of IMPALA-6984 was reverted, and subsequent testing found that 
this regression was eliminated.

> Performance regression on short queries due to IMPALA-6984 fix
> --
>
> Key: IMPALA-10047
> URL: https://issues.apache.org/jira/browse/IMPALA-10047
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
> Fix For: Impala 4.0
>
>
> When doing some TPC-DS benchmarking with mt_dop, we encountered intermittent 
> performance regressions on short queries. Some query executions seem to be 
> taking an extra 10 seconds in exec status reports due to delays in sending a 
> cancel RPC. From the coordinator logs:
>  
> {noformat}
> W0804 02:52:33.922088   108 rpcz_store.cc:253] Call 
> impala.ControlService.CancelQueryFInstances from 127.0.0.1:46738 (request 
> call id 3134) took 10007 ms (10 s). Client timeout 10000 ms (10 s)
> W0804 02:52:33.922143   108 rpcz_store.cc:259] Trace:
> 0804 02:52:23.914291 (+ 0us) impala-service-pool.cc:170] Inserting onto 
> call queue
> 0804 02:52:33.922079 (+10007788us) impala-service-pool.cc:255] Skipping call 
> since client already timed out
> 0804 02:52:33.922086 (+ 7us) inbound_call.cc:162] Queueing failure 
> response
> Metrics: {}
> I0804 02:52:33.922214   101 connection.cc:730] Got response to call id 3134 
> after client already timed out or cancelled
> I0804 02:52:33.923286 20276 coordinator-backend-state.cc:889] 
> query_id=f442e73a0d35c136:c9993d77 target backend=xx.xx.xx.xx:27000: 
> Sending CancelQueryFInstances rpc{noformat}
> The rpcz page also shows that some ReportExecStatus RPCs are taking 10 
> seconds:
>  
>  
> {noformat}
> "incoming_queue_time": "Count: 671901, min / max: 1000.000ns / 10s347ms, 25th 
> %-ile: 12.000us, 50th %-ile: 18.000us, 75th %-ile: 28.000us, 90th %-ile: 
> 67.000us, 95th %-ile: 456.000us, 99.9th %-ile: 10s133ms",
> {
>   "method_name": "ReportExecStatus",
>   "handler_latency": "Count: 169653, min / max: 38.000us / 
> 10s173ms, 25th %-ile: 9.024ms, 50th %-ile: 20.352ms, 75th %-ile: 35.840ms, 
> 90th %-ile: 94.720ms, 95th %-ile: 177.152ms, 99.9th %-ile: 10s027ms",
>   "payload_size": "Count: 169653, min / max: 5.81 KB / 3.81 MB, 
> 25th %-ile: 425.00 KB, 50th %-ile: 760.00 KB, 75th %-ile: 1.47 MB, 90th 
> %-ile: 1.96 MB, 95th %-ile: 2.31 MB, 99.9th %-ile: 3.73 MB"
>   }]{noformat}
>  
> IMPALA-6984 introduced a Coordinator::CancelBackends() call to 
> Coordinator::HandleExecStateTransition() for the ExecState::RETURNED_RESULTS 
> case:
> {noformat}
>   if (new_state == ExecState::RETURNED_RESULTS) {
> // Cancel all backends, but wait for the final status reports to be 
> received so that
> // we have a complete profile for this successful query.
> CancelBackends(/*fire_and_forget=*/ false);
> WaitForBackends();
>   } else {
> CancelBackends(/*fire_and_forget=*/ true);
>   }{noformat}
> Removing this call eliminates the performance regression, so it will need 
> more investigation.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Reopened] (IMPALA-6984) Coordinator should cancel backends when returning EOS

2020-08-06 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell reopened IMPALA-6984:
---

Since the core part of this was reverted, reopening.

> Coordinator should cancel backends when returning EOS
> -
>
> Key: IMPALA-6984
> URL: https://issues.apache.org/jira/browse/IMPALA-6984
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Daniel Hecht
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: query-lifecycle
> Fix For: Impala 4.0
>
>
> Currently, the Coordinator waits for backends rather than proactively 
> cancelling them in the case of hitting EOS. There's a tangled mess that makes 
> it tricky to proactively cancel the backends related to how 
> {{Coordinator::ComputeQuerySummary()}} works – we can't update the summary 
> until the profiles are no longer changing (which also makes sense given that 
> we want the exec summary to be consistent with the final profile).  But we 
> currently tie together the FIS status and the profile, and cancellation of 
> backends causes the FIS to return CANCELLED, which then means that the 
> remaining FIS on that backend won't produce a final profile.
> With the rework of the protocol for IMPALA-2990 we should make it possible to 
> sort this out such that a final profile can be requested regardless of how a 
> FIS ends execution.
> This also relates to IMPALA-5783.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6984) Coordinator should cancel backends when returning EOS

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172457#comment-17172457
 ] 

ASF subversion and git services commented on IMPALA-6984:
-

Commit c413f9b558d51de877f497590baf14139ad5cf99 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c413f9b ]

IMPALA-10047: Revert core piece of IMPALA-6984

Performance testing on TPC-DS found a performance regression
on short queries due to delayed exec status reports. Further
testing traced this back to IMPALA-6984's behavior of
cancelling backends on EOS. The coordinator logs show the
CancelBackends() call intermittently taking 10 seconds due
to timing out in the RPC layer.

As a temporary workaround, this reverts the core part of
IMPALA-6984 that added that CancelBackends() call for EOS.
It leaves the rest of IMPALA-6984 intact, as other code has built
on top of it.

Testing:
 - Core job
 - Performance tests

Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
(cherry picked from commit b91f3c0e064d592f3cdf2a2e089ca6546133ba55)
Reviewed-on: http://gerrit.cloudera.org:8080/16288
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> Coordinator should cancel backends when returning EOS
> -
>
> Key: IMPALA-6984
> URL: https://issues.apache.org/jira/browse/IMPALA-6984
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Daniel Hecht
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: query-lifecycle
> Fix For: Impala 4.0
>
>
> Currently, the Coordinator waits for backends rather than proactively 
> cancelling them in the case of hitting EOS. There's a tangled mess that makes 
> it tricky to proactively cancel the backends related to how 
> {{Coordinator::ComputeQuerySummary()}} works – we can't update the summary 
> until the profiles are no longer changing (which also makes sense given that 
> we want the exec summary to be consistent with the final profile).  But we 
> currently tie together the FIS status and the profile, and cancellation of 
> backends causes the FIS to return CANCELLED, which then means that the 
> remaining FIS on that backend won't produce a final profile.
> With the rework of the protocol for IMPALA-2990 we should make it possible to 
> sort this out such that a final profile can be requested regardless of how a 
> FIS ends execution.
> This also relates to IMPALA-5783.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6984) Coordinator should cancel backends when returning EOS

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172454#comment-17172454
 ] 

ASF subversion and git services commented on IMPALA-6984:
-

Commit c413f9b558d51de877f497590baf14139ad5cf99 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c413f9b ]

IMPALA-10047: Revert core piece of IMPALA-6984

Performance testing on TPC-DS found a performance regression
on short queries due to delayed exec status reports. Further
testing traced this back to IMPALA-6984's behavior of
cancelling backends on EOS. The coordinator logs show the
CancelBackends() call intermittently taking 10 seconds due
to timing out in the RPC layer.

As a temporary workaround, this reverts the core part of
IMPALA-6984 that added that CancelBackends() call for EOS.
It leaves the rest of IMPALA-6984 intact, as other code has built
on top of it.

Testing:
 - Core job
 - Performance tests

Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
(cherry picked from commit b91f3c0e064d592f3cdf2a2e089ca6546133ba55)
Reviewed-on: http://gerrit.cloudera.org:8080/16288
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> Coordinator should cancel backends when returning EOS
> -
>
> Key: IMPALA-6984
> URL: https://issues.apache.org/jira/browse/IMPALA-6984
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Daniel Hecht
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: query-lifecycle
> Fix For: Impala 4.0
>
>
> Currently, the Coordinator waits for backends rather than proactively 
> cancelling them in the case of hitting EOS. There's a tangled mess that makes 
> it tricky to proactively cancel the backends related to how 
> {{Coordinator::ComputeQuerySummary()}} works – we can't update the summary 
> until the profiles are no longer changing (which also makes sense given that 
> we want the exec summary to be consistent with the final profile).  But we 
> currently tie together the FIS status and the profile, and cancellation of 
> backends causes the FIS to return CANCELLED, which then means that the 
> remaining FIS on that backend won't produce a final profile.
> With the rework of the protocol for IMPALA-2990 we should make it possible to 
> sort this out such that a final profile can be requested regardless of how a 
> FIS ends execution.
> This also relates to IMPALA-5783.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6984) Coordinator should cancel backends when returning EOS

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172456#comment-17172456
 ] 

ASF subversion and git services commented on IMPALA-6984:
-

Commit c413f9b558d51de877f497590baf14139ad5cf99 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c413f9b ]

IMPALA-10047: Revert core piece of IMPALA-6984

Performance testing on TPC-DS found a performance regression
on short queries due to delayed exec status reports. Further
testing traced this back to IMPALA-6984's behavior of
cancelling backends on EOS. The coordinator logs show the
CancelBackends() call intermittently taking 10 seconds due
to timing out in the RPC layer.

As a temporary workaround, this reverts the core part of
IMPALA-6984 that added that CancelBackends() call for EOS.
It leaves the rest of IMPALA-6984 intact, as other code has built
on top of it.

Testing:
 - Core job
 - Performance tests

Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
(cherry picked from commit b91f3c0e064d592f3cdf2a2e089ca6546133ba55)
Reviewed-on: http://gerrit.cloudera.org:8080/16288
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> Coordinator should cancel backends when returning EOS
> -
>
> Key: IMPALA-6984
> URL: https://issues.apache.org/jira/browse/IMPALA-6984
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Daniel Hecht
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: query-lifecycle
> Fix For: Impala 4.0
>
>
> Currently, the Coordinator waits for backends rather than proactively 
> cancelling them in the case of hitting EOS. There's a tangled mess that makes 
> it tricky to proactively cancel the backends related to how 
> {{Coordinator::ComputeQuerySummary()}} works – we can't update the summary 
> until the profiles are no longer changing (which also makes sense given that 
> we want the exec summary to be consistent with the final profile).  But we 
> currently tie together the FIS status and the profile, and cancellation of 
> backends causes the FIS to return CANCELLED, which then means that the 
> remaining FIS on that backend won't produce a final profile.
> With the rework of the protocol for IMPALA-2990 we should make it possible to 
> sort this out such that a final profile can be requested regardless of how a 
> FIS ends execution.
> This also relates to IMPALA-5783.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10047) Performance regression on short queries due to IMPALA-6984 fix

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172453#comment-17172453
 ] 

ASF subversion and git services commented on IMPALA-10047:
--

Commit c413f9b558d51de877f497590baf14139ad5cf99 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c413f9b ]

IMPALA-10047: Revert core piece of IMPALA-6984

Performance testing on TPC-DS found a performance regression
on short queries due to delayed exec status reports. Further
testing traced this back to IMPALA-6984's behavior of
cancelling backends on EOS. The coordinator logs show the
CancelBackends() call intermittently taking 10 seconds due
to timing out in the RPC layer.

As a temporary workaround, this reverts the core part of
IMPALA-6984 that added that CancelBackends() call for EOS.
It leaves the rest of IMPALA-6984 intact, as other code has built
on top of it.

Testing:
 - Core job
 - Performance tests

Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
(cherry picked from commit b91f3c0e064d592f3cdf2a2e089ca6546133ba55)
Reviewed-on: http://gerrit.cloudera.org:8080/16288
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> Performance regression on short queries due to IMPALA-6984 fix
> --
>
> Key: IMPALA-10047
> URL: https://issues.apache.org/jira/browse/IMPALA-10047
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>
> When doing some TPC-DS benchmarking with mt_dop, we encountered intermittent 
> performance regressions on short queries. Some query executions seem to be 
> taking an extra 10 seconds in exec status reports due to delays in sending a 
> cancel RPC. From the coordinator logs:
>  
> {noformat}
> W0804 02:52:33.922088   108 rpcz_store.cc:253] Call 
> impala.ControlService.CancelQueryFInstances from 127.0.0.1:46738 (request 
> call id 3134) took 10007 ms (10 s). Client timeout 10000 ms (10 s)
> W0804 02:52:33.922143   108 rpcz_store.cc:259] Trace:
> 0804 02:52:23.914291 (+ 0us) impala-service-pool.cc:170] Inserting onto 
> call queue
> 0804 02:52:33.922079 (+10007788us) impala-service-pool.cc:255] Skipping call 
> since client already timed out
> 0804 02:52:33.922086 (+ 7us) inbound_call.cc:162] Queueing failure 
> response
> Metrics: {}
> I0804 02:52:33.922214   101 connection.cc:730] Got response to call id 3134 
> after client already timed out or cancelled
> I0804 02:52:33.923286 20276 coordinator-backend-state.cc:889] 
> query_id=f442e73a0d35c136:c9993d77 target backend=xx.xx.xx.xx:27000: 
> Sending CancelQueryFInstances rpc{noformat}
> The rpcz page also shows that some ReportExecStatus RPCs are taking 10 
> seconds:
>  
>  
> {noformat}
> "incoming_queue_time": "Count: 671901, min / max: 1000.000ns / 10s347ms, 25th 
> %-ile: 12.000us, 50th %-ile: 18.000us, 75th %-ile: 28.000us, 90th %-ile: 
> 67.000us, 95th %-ile: 456.000us, 99.9th %-ile: 10s133ms",
> {
>   "method_name": "ReportExecStatus",
>   "handler_latency": "Count: 169653, min / max: 38.000us / 
> 10s173ms, 25th %-ile: 9.024ms, 50th %-ile: 20.352ms, 75th %-ile: 35.840ms, 
> 90th %-ile: 94.720ms, 95th %-ile: 177.152ms, 99.9th %-ile: 10s027ms",
>   "payload_size": "Count: 169653, min / max: 5.81 KB / 3.81 MB, 
> 25th %-ile: 425.00 KB, 50th %-ile: 760.00 KB, 75th %-ile: 1.47 MB, 90th 
> %-ile: 1.96 MB, 95th %-ile: 2.31 MB, 99.9th %-ile: 3.73 MB"
>   }]{noformat}
>  
> IMPALA-6984 introduced a Coordinator::CancelBackends() call to 
> Coordinator::HandleExecStateTransition() for the ExecState::RETURNED_RESULTS 
> case:
> {noformat}
>   if (new_state == ExecState::RETURNED_RESULTS) {
> // Cancel all backends, but wait for the final status reports to be 
> received so that
> // we have a complete profile for this successful query.
> CancelBackends(/*fire_and_forget=*/ false);
> WaitForBackends();
>   } else {
> CancelBackends(/*fire_and_forget=*/ true);
>   }{noformat}
> Removing this call eliminates the performance regression, so it will need 
> more investigation.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6984) Coordinator should cancel backends when returning EOS

2020-08-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172455#comment-17172455
 ] 

ASF subversion and git services commented on IMPALA-6984:
-

Commit c413f9b558d51de877f497590baf14139ad5cf99 in impala's branch 
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c413f9b ]

IMPALA-10047: Revert core piece of IMPALA-6984

Performance testing on TPC-DS found a performance regression
on short queries due to delayed exec status reports. Further
testing traced this back to IMPALA-6984's behavior of
cancelling backends on EOS. The coordinator logs show the
CancelBackends() call intermittently taking 10 seconds due
to timing out in the RPC layer.

As a temporary workaround, this reverts the core part of
IMPALA-6984 that added that CancelBackends() call for EOS.
It leaves the rest of IMPALA-6984 intact, as other code has built
on top of it.

Testing:
 - Core job
 - Performance tests

Change-Id: Ibf00a56e91f0376eaaa552e3bb4763501bfb49e8
(cherry picked from commit b91f3c0e064d592f3cdf2a2e089ca6546133ba55)
Reviewed-on: http://gerrit.cloudera.org:8080/16288
Reviewed-by: Joe McDonnell 
Tested-by: Impala Public Jenkins 


> Coordinator should cancel backends when returning EOS
> -
>
> Key: IMPALA-6984
> URL: https://issues.apache.org/jira/browse/IMPALA-6984
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Daniel Hecht
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: query-lifecycle
> Fix For: Impala 4.0
>
>
> Currently, the Coordinator waits for backends rather than proactively 
> cancelling them in the case of hitting EOS. There's a tangled mess that makes 
> it tricky to proactively cancel the backends related to how 
> {{Coordinator::ComputeQuerySummary()}} works – we can't update the summary 
> until the profiles are no longer changing (which also makes sense given that 
> we want the exec summary to be consistent with the final profile).  But we 
> currently tie together the FIS status and the profile, and cancellation of 
> backends causes the FIS to return CANCELLED, which then means that the 
> remaining FIS on that backend won't produce a final profile.
> With the rework of the protocol for IMPALA-2990 we should make it possible to 
> sort this out such that a final profile can be requested regardless of how a 
> FIS ends execution.
> This also relates to IMPALA-5783.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10054) test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly

2020-08-06 Thread Riza Suminto (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172433#comment-17172433
 ] 

Riza Suminto commented on IMPALA-10054:
---

Hi [~attilaj], the query estimate is probably off somehow, or the specified 
mem limit is not enough anymore. I'll look into this.
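One option while investigating, sketched below, is to parse the counter out of
the profile instead of matching an exact substring, so a drifting estimate
fails with a clearer message (the helper name and threshold handling are
assumptions):

{code:python}
# Hedged sketch: parse SpilledRuns out of the runtime profile rather than
# asserting an exact 'SpilledRuns: 3' substring. Helper name is an assumption.
import re

def assert_spilled_runs_at_least(runtime_profile, minimum):
    match = re.search(r'SpilledRuns: (\d+)', runtime_profile)
    assert match, 'no SpilledRuns counter in profile'
    actual = int(match.group(1))
    assert actual >= minimum, 'SpilledRuns %d < %d' % (actual, minimum)
{code}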

> test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly
> ---
>
> Key: IMPALA-10054
> URL: https://issues.apache.org/jira/browse/IMPALA-10054
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 4.0
>
>
> test_multiple_sort_run_bytes_limits, introduced in IMPALA-6692, seems to be 
> flaky.
> Jenkins job that triggered the error:
> https://jenkins.impala.io/job/parallel-all-tests-nightly/1173
> Failing job:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2899/testReport/
> {code}
> Stacktrace
> query_test/test_sort.py:89: in test_multiple_sort_run_bytes_limits
> assert "SpilledRuns: " + spilled_runs in query_result.runtime_profile
> E   assert ('SpilledRuns: ' + '3') in 'Query 
> (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil... 27.999ms\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> E+  where 'Query (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil... 27.999ms\n   
>   - WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - 
> WriteIoWaitTime: 0.000ns\n' = 
>  0x7f51da77fb50>.runtime_profile
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10054) test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly

2020-08-06 Thread Attila Jeges (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172429#comment-17172429
 ] 

Attila Jeges commented on IMPALA-10054:
---

[~rizaon] I'm assigning this to you as the failing test was introduced in 
IMPALA-6692, which you worked on.

> test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly
> ---
>
> Key: IMPALA-10054
> URL: https://issues.apache.org/jira/browse/IMPALA-10054
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 4.0
>
>
> test_multiple_sort_run_bytes_limits, introduced in IMPALA-6692, seems to be 
> flaky.
> Jenkins job that triggered the error:
> https://jenkins.impala.io/job/parallel-all-tests-nightly/1173
> Failing job:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2899/testReport/
> {code}
> Stacktrace
> query_test/test_sort.py:89: in test_multiple_sort_run_bytes_limits
> assert "SpilledRuns: " + spilled_runs in query_result.runtime_profile
> E   assert ('SpilledRuns: ' + '3') in 'Query 
> (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil... 27.999ms\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> E+  where 'Query (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil... 27.999ms\n   
>   - WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - 
> WriteIoWaitTime: 0.000ns\n' = 
>  0x7f51da77fb50>.runtime_profile
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10055) DCHECK was hit while executing e2e test TestQueries::test_subquery

2020-08-06 Thread Attila Jeges (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172422#comment-17172422
 ] 

Attila Jeges commented on IMPALA-10055:
---

[~boroknagyz] I'm assigning this to you as it looks like the bug was introduced 
in IMPALA-9515.

> DCHECK was hit while executing e2e test TestQueries::test_subquery
> --
>
> Key: IMPALA-10055
> URL: https://issues.apache.org/jira/browse/IMPALA-10055
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Zoltán Borók-Nagy
>Priority: Blocker
>  Labels: broken-build, crash, flaky
> Fix For: Impala 4.0
>
>
> A DCHECK was hit while executing an e2e test. The time frame suggests that it 
> possibly happened while executing TestQueries::test_subquery:
> {code}
> query_test/test_queries.py:149: in test_subquery
> self.run_test_case('QueryTest/subquery', vector)
> common/impala_test_suite.py:662: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:600: in __exec_in_impala
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:909: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:334: in execute
> r = self.__fetch_results(handle, profile_format=profile_format)
> common/impala_connection.py:436: in __fetch_results
> result_tuples = cursor.fetchall()
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:532:
>  in fetchall
> self._wait_to_finish()
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:405:
>  in _wait_to_finish
> resp = self._last_operation._rpc('GetOperationStatus', req)
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:992:
>  in _rpc
> response = self._execute(func_name, request)
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:1023:
>  in _execute
> .format(self.retries))
> E   HiveServer2Error: Failed after retrying 3 times
> {code}
> impalad log:
> {code}
> Log file created at: 2020/08/05 17:34:30
> Running on machine: 
> impala-ec2-centos74-m5-4xlarge-ondemand-18a5.vpc.cloudera.com
> Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0805 17:34:30.003247 10887 orc-column-readers.cc:423] 
> c34e87376f496a53:7ba6a2e40002] Check failed: (scanner_->row_batches_nee
> d_validation_ && scanner_->scan_node_->IsZeroSlotTableScan()) || 
> scanner_->acid_original_file
> {code}
> Stack trace:
> {code}
> CORE: ./fe/core.1596674070.14179.impalad
> BINARY: ./be/build/latest/service/impalad
> Core was generated by 
> `/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/build/lat'.
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
> To enable execution of this file add
>   add-auto-load-safe-path 
> /data0/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib64/libstdc++.so.6.0.24-gdb.py
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> To completely disable this security protection add
>   set auto-load safe-path /
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> For more information about this security protection see the
> "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
>   info "(gdb)Auto-loading safe path"
> #0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
> #1  0x7efd6ec6f8e8 in abort () from /lib64/libc.so.6
> #2  0x086b8ea4 in google::DumpStackTraceAndExit() ()
> #3  0x086ae25d in google::LogMessage::Fail() ()
> #4  0x086afb4d in google::LogMessage::SendToLog() ()
> #5  0x086adbbb in google::LogMessage::Flush() ()
> #6  0x086b17b9 in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x0388e10a in impala::OrcStructReader::TopLevelReadValueBatch 
> (this=0x61162630, scratch_batch=0x824831e0, pool=0x82483258) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/orc-column-readers.cc:421
> #8  0x03810c92 in impala::HdfsOrcScanner::TransferTuples 
> (this=0x27143c00, dst_batch=0x2e5ca820) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:808
> #9  0x03814e2a in 

[jira] [Updated] (IMPALA-10054) test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly

2020-08-06 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10054:
--
Issue Type: Bug  (was: Improvement)

> test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly
> ---
>
> Key: IMPALA-10054
> URL: https://issues.apache.org/jira/browse/IMPALA-10054
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 4.0
>
>
> test_multiple_sort_run_bytes_limits, introduced in IMPALA-6692, seems to be 
> flaky.
> Jenkins job that triggered the error:
> https://jenkins.impala.io/job/parallel-all-tests-nightly/1173
> Failing job:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2899/testReport/
> {code}
> Stacktrace
> query_test/test_sort.py:89: in test_multiple_sort_run_bytes_limits
> assert "SpilledRuns: " + spilled_runs in query_result.runtime_profile
> E   assert ('SpilledRuns: ' + '3') in 'Query 
> (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil... 27.999ms\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> E+  where 'Query (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil... 27.999ms\n   
>   - WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - 
> WriteIoWaitTime: 0.000ns\n' = 
>  0x7f51da77fb50>.runtime_profile
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10050) DCHECK was hit possibly while executing TestFailpoints::test_failpoints

2020-08-06 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10050:
--
Labels: broken-build crash flaky  (was: broken-build crash)

> DCHECK was hit possibly while executing TestFailpoints::test_failpoints
> ---
>
> Key: IMPALA-10050
> URL: https://issues.apache.org/jira/browse/IMPALA-10050
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, crash, flaky
> Fix For: Impala 4.0
>
>
> A DCHECK was hit during ASAN core e2e tests. The time frame suggests that it 
> happened while executing the TestFailpoints::test_failpoints e2e test.
> {code}
> 10:56:38  TestFailpoints.test_failpoints[protocol: beeswax | table_format: 
> avro/snap/block | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
> location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
> alltypessmall a join alltypessmall b on a.id = b.id] 
> 10:56:38 failure/test_failpoints.py:128: in test_failpoints
> 10:56:38 self.execute_query(query, vector.get_value('exec_option'))
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
>  in wrapper
> 10:56:38 return function(*args, **kwargs)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
>  in execute_query
> 10:56:38 return self.__execute_query(self.client, query, query_options)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
>  in __execute_query
> 10:56:38 return impalad_client.execute(query, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
>  in execute
> 10:56:38 return self.__beeswax_client.execute(sql_stmt, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
>  in execute
> 10:56:38 handle = self.__execute_query(query_string.strip(), user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
>  in __execute_query
> 10:56:38 self.wait_for_finished(handle)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
>  in wait_for_finished
> 10:56:38 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 10:56:38 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 10:56:38 EQuery aborted:RPC from 127.0.0.1:27000 to 127.0.0.1:27002 failed
> 10:56:38 E   TransmitData() to 127.0.0.1:27002 failed: Network error: Client 
> connection negotiation failed: client connection to 127.0.0.1:27002: connect: 
> Connection refused (error 111)
> {code}
> Impalad log:
> {code}
> Log file created at: 2020/08/05 01:52:56
> Running on machine: 
> impala-ec2-centos74-r5-4xlarge-ondemand-017c.vpc.cloudera.com
> Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0805 01:52:56.979769 17313 query-state.cc:803] 
> 3941a3d92a71e242:15c963f3] Check failed: is_cancelled_.Load() == 1 (0 
> vs. 1) 
> {code}
> Stack trace
> {code}
> Thread 368 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x0004
> rsi = 0x43a1   rdi = 0x37e4
> rbp = 0x7efcd4c53080   rsp = 0x7efcd4c52d08
>  r8 = 0xr9 = 0x7efcd4c52b80
> r10 = 0x0008   r11 = 0x0206
> r12 = 0x093de7c0   r13 = 0x0086
> r14 = 0x093de7c4   r15 = 0x093d6de0
> rip = 0x7f05c9d231f7
> Found by: given as instruction pointer in context
>  1  impalad!google::LogMessage::Flush() + 0x1eb
> rbp = 0x7efcd4c53250   rsp = 0x7efcd4c53090
> rip = 0x05727e3b
> Found by: previous frame's frame pointer
>  2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> rbx = 0x7efcd4c532a0   rbp = 0x7efcd4c53310
> rsp = 0x7efcd4c53130   r12 = 0x0fe01a982628
> r13 = 0x61d000da0a6c   r14 = 0x7efcd4c53250
> r15 = 0x7efcd4c53270   rip = 0x0572ba39
> Found by: call frame info
>  3  impalad!impala::QueryState::MonitorFInstances() [query-state.cc : 803 + 
> 0x45]
> rbx = 0x7efcd4c532a0   rbp = 0x7efcd4c53310
> rsp = 

[jira] [Updated] (IMPALA-10054) test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly

2020-08-06 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10054:
--
Labels: broken-build flaky  (was: broken-build)

> test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly
> ---
>
> Key: IMPALA-10054
> URL: https://issues.apache.org/jira/browse/IMPALA-10054
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, flaky
> Fix For: Impala 4.0
>
>
> test_multiple_sort_run_bytes_limits, introduced in IMPALA-6692, seems to be 
> flaky.
> Jenkins job that triggered the error:
> https://jenkins.impala.io/job/parallel-all-tests-nightly/1173
> Failing job:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2899/testReport/
> {code}
> Stacktrace
> query_test/test_sort.py:89: in test_multiple_sort_run_bytes_limits
> assert "SpilledRuns: " + spilled_runs in query_result.runtime_profile
> E   assert ('SpilledRuns: ' + '3') in 'Query 
> (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil... 27.999ms\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> E+  where 'Query (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil... 27.999ms\n   
>   - WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - 
> WriteIoWaitTime: 0.000ns\n' = 
>  0x7f51da77fb50>.runtime_profile
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10050) DCHECK was hit possibly while executing TestFailpoints::test_failpoints

2020-08-06 Thread Attila Jeges (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172412#comment-17172412
 ] 

Attila Jeges commented on IMPALA-10050:
---

[~rizaon] randomly assigning to you. Please feel free to reassign. 

> DCHECK was hit possibly while executing TestFailpoints::test_failpoints
> ---
>
> Key: IMPALA-10050
> URL: https://issues.apache.org/jira/browse/IMPALA-10050
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, crash
> Fix For: Impala 4.0
>
>
> A DCHECK was hit during ASAN core e2e tests. The time frame suggests that it 
> happened while executing the TestFailpoints::test_failpoints e2e test.
> {code}
> 10:56:38  TestFailpoints.test_failpoints[protocol: beeswax | table_format: 
> avro/snap/block | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
> location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
> alltypessmall a join alltypessmall b on a.id = b.id] 
> 10:56:38 failure/test_failpoints.py:128: in test_failpoints
> 10:56:38 self.execute_query(query, vector.get_value('exec_option'))
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
>  in wrapper
> 10:56:38 return function(*args, **kwargs)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
>  in execute_query
> 10:56:38 return self.__execute_query(self.client, query, query_options)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
>  in __execute_query
> 10:56:38 return impalad_client.execute(query, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
>  in execute
> 10:56:38 return self.__beeswax_client.execute(sql_stmt, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
>  in execute
> 10:56:38 handle = self.__execute_query(query_string.strip(), user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
>  in __execute_query
> 10:56:38 self.wait_for_finished(handle)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
>  in wait_for_finished
> 10:56:38 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 10:56:38 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 10:56:38 EQuery aborted:RPC from 127.0.0.1:27000 to 127.0.0.1:27002 failed
> 10:56:38 E   TransmitData() to 127.0.0.1:27002 failed: Network error: Client 
> connection negotiation failed: client connection to 127.0.0.1:27002: connect: 
> Connection refused (error 111)
> {code}
> Impalad log:
> {code}
> Log file created at: 2020/08/05 01:52:56
> Running on machine: 
> impala-ec2-centos74-r5-4xlarge-ondemand-017c.vpc.cloudera.com
> Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0805 01:52:56.979769 17313 query-state.cc:803] 
> 3941a3d92a71e242:15c963f3] Check failed: is_cancelled_.Load() == 1 (0 
> vs. 1) 
> {code}
> Stack trace
> {code}
> Thread 368 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x0004
> rsi = 0x43a1   rdi = 0x37e4
> rbp = 0x7efcd4c53080   rsp = 0x7efcd4c52d08
>  r8 = 0xr9 = 0x7efcd4c52b80
> r10 = 0x0008   r11 = 0x0206
> r12 = 0x093de7c0   r13 = 0x0086
> r14 = 0x093de7c4   r15 = 0x093d6de0
> rip = 0x7f05c9d231f7
> Found by: given as instruction pointer in context
>  1  impalad!google::LogMessage::Flush() + 0x1eb
> rbp = 0x7efcd4c53250   rsp = 0x7efcd4c53090
> rip = 0x05727e3b
> Found by: previous frame's frame pointer
>  2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> rbx = 0x7efcd4c532a0   rbp = 0x7efcd4c53310
> rsp = 0x7efcd4c53130   r12 = 0x0fe01a982628
> r13 = 0x61d000da0a6c   r14 = 0x7efcd4c53250
> r15 = 0x7efcd4c53270   rip = 0x0572ba39
> Found by: call frame info
>  3  impalad!impala::QueryState::MonitorFInstances() [query-state.cc : 803 + 
> 0x45]
> rbx = 0x7efcd4c532a0   rbp = 

[jira] [Assigned] (IMPALA-10050) DCHECK was hit possibly while executing TestFailpoints::test_failpoints

2020-08-06 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges reassigned IMPALA-10050:
-

Assignee: Riza Suminto

> DCHECK was hit possibly while executing TestFailpoints::test_failpoints
> ---
>
> Key: IMPALA-10050
> URL: https://issues.apache.org/jira/browse/IMPALA-10050
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build, crash
> Fix For: Impala 4.0
>
>
> A DCHECK was hit during ASAN core e2e tests. The time frame suggests that it 
> happened while executing the TestFailpoints::test_failpoints e2e test.
> {code}
> 10:56:38  TestFailpoints.test_failpoints[protocol: beeswax | table_format: 
> avro/snap/block | exec_option: {'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | 
> location: PREPARE | action: MEM_LIMIT_EXCEEDED | query: select 1 from 
> alltypessmall a join alltypessmall b on a.id = b.id] 
> 10:56:38 failure/test_failpoints.py:128: in test_failpoints
> 10:56:38 self.execute_query(query, vector.get_value('exec_option'))
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:811:
>  in wrapper
> 10:56:38 return function(*args, **kwargs)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:843:
>  in execute_query
> 10:56:38 return self.__execute_query(self.client, query, query_options)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_test_suite.py:909:
>  in __execute_query
> 10:56:38 return impalad_client.execute(query, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/common/impala_connection.py:205:
>  in execute
> 10:56:38 return self.__beeswax_client.execute(sql_stmt, user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:187:
>  in execute
> 10:56:38 handle = self.__execute_query(query_string.strip(), user=user)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:365:
>  in __execute_query
> 10:56:38 self.wait_for_finished(handle)
> 10:56:38 
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/beeswax/impala_beeswax.py:386:
>  in wait_for_finished
> 10:56:38 raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> 10:56:38 E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> 10:56:38 EQuery aborted:RPC from 127.0.0.1:27000 to 127.0.0.1:27002 failed
> 10:56:38 E   TransmitData() to 127.0.0.1:27002 failed: Network error: Client 
> connection negotiation failed: client connection to 127.0.0.1:27002: connect: 
> Connection refused (error 111)
> {code}
> Impalad log:
> {code}
> Log file created at: 2020/08/05 01:52:56
> Running on machine: 
> impala-ec2-centos74-r5-4xlarge-ondemand-017c.vpc.cloudera.com
> Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0805 01:52:56.979769 17313 query-state.cc:803] 
> 3941a3d92a71e242:15c963f3] Check failed: is_cancelled_.Load() == 1 (0 
> vs. 1) 
> {code}
> Stack trace
> {code}
> Thread 368 (crashed)
>  0  libc-2.17.so + 0x351f7
> rax = 0x   rdx = 0x0006
> rcx = 0x   rbx = 0x0004
> rsi = 0x43a1   rdi = 0x37e4
> rbp = 0x7efcd4c53080   rsp = 0x7efcd4c52d08
>  r8 = 0xr9 = 0x7efcd4c52b80
> r10 = 0x0008   r11 = 0x0206
> r12 = 0x093de7c0   r13 = 0x0086
> r14 = 0x093de7c4   r15 = 0x093d6de0
> rip = 0x7f05c9d231f7
> Found by: given as instruction pointer in context
>  1  impalad!google::LogMessage::Flush() + 0x1eb
> rbp = 0x7efcd4c53250   rsp = 0x7efcd4c53090
> rip = 0x05727e3b
> Found by: previous frame's frame pointer
>  2  impalad!google::LogMessageFatal::~LogMessageFatal() + 0x9
> rbx = 0x7efcd4c532a0   rbp = 0x7efcd4c53310
> rsp = 0x7efcd4c53130   r12 = 0x0fe01a982628
> r13 = 0x61d000da0a6c   r14 = 0x7efcd4c53250
> r15 = 0x7efcd4c53270   rip = 0x0572ba39
> Found by: call frame info
>  3  impalad!impala::QueryState::MonitorFInstances() [query-state.cc : 803 + 
> 0x45]
> rbx = 0x7efcd4c532a0   rbp = 0x7efcd4c53310
> rsp = 0x7efcd4c53140   r12 = 0x0fe01a982628
> 

[jira] [Updated] (IMPALA-10055) DCHECK was hit while executing e2e test TestQueries::test_subquery

2020-08-06 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10055:
--
Labels: broken-build crash  (was: )

> DCHECK was hit while executing e2e test TestQueries::test_subquery
> --
>
> Key: IMPALA-10055
> URL: https://issues.apache.org/jira/browse/IMPALA-10055
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Zoltán Borók-Nagy
>Priority: Blocker
>  Labels: broken-build, crash
> Fix For: Impala 4.0
>
>
> A DCHECK was hit while executing an e2e test. The time frame suggests that it 
> possibly happened while executing TestQueries::test_subquery:
> {code}
> query_test/test_queries.py:149: in test_subquery
> self.run_test_case('QueryTest/subquery', vector)
> common/impala_test_suite.py:662: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:600: in __exec_in_impala
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:909: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:334: in execute
> r = self.__fetch_results(handle, profile_format=profile_format)
> common/impala_connection.py:436: in __fetch_results
> result_tuples = cursor.fetchall()
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:532:
>  in fetchall
> self._wait_to_finish()
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:405:
>  in _wait_to_finish
> resp = self._last_operation._rpc('GetOperationStatus', req)
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:992:
>  in _rpc
> response = self._execute(func_name, request)
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:1023:
>  in _execute
> .format(self.retries))
> E   HiveServer2Error: Failed after retrying 3 times
> {code}
> impalad log:
> {code}
> Log file created at: 2020/08/05 17:34:30
> Running on machine: 
> impala-ec2-centos74-m5-4xlarge-ondemand-18a5.vpc.cloudera.com
> Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0805 17:34:30.003247 10887 orc-column-readers.cc:423] 
> c34e87376f496a53:7ba6a2e40002] Check failed: (scanner_->row_batches_nee
> d_validation_ && scanner_->scan_node_->IsZeroSlotTableScan()) || 
> scanner_->acid_original_file
> {code}
> Stack trace:
> {code}
> CORE: ./fe/core.1596674070.14179.impalad
> BINARY: ./be/build/latest/service/impalad
> Core was generated by 
> `/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/build/lat'.
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
> To enable execution of this file add
>   add-auto-load-safe-path 
> /data0/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib64/libstdc++.so.6.0.24-gdb.py
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> To completely disable this security protection add
>   set auto-load safe-path /
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> For more information about this security protection see the
> "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
>   info "(gdb)Auto-loading safe path"
> #0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
> #1  0x7efd6ec6f8e8 in abort () from /lib64/libc.so.6
> #2  0x086b8ea4 in google::DumpStackTraceAndExit() ()
> #3  0x086ae25d in google::LogMessage::Fail() ()
> #4  0x086afb4d in google::LogMessage::SendToLog() ()
> #5  0x086adbbb in google::LogMessage::Flush() ()
> #6  0x086b17b9 in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x0388e10a in impala::OrcStructReader::TopLevelReadValueBatch 
> (this=0x61162630, scratch_batch=0x824831e0, pool=0x82483258) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/orc-column-readers.cc:421
> #8  0x03810c92 in impala::HdfsOrcScanner::TransferTuples 
> (this=0x27143c00, dst_batch=0x2e5ca820) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:808
> #9  0x03814e2a in impala::HdfsOrcScanner::AssembleRows 
> (this=0x27143c00, row_batch=0x2e5ca820) at 
> 

[jira] [Updated] (IMPALA-10055) DCHECK was hit while executing e2e test TestQueries::test_subquery

2020-08-06 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10055:
--
Labels: broken-build crash flaky  (was: broken-build crash)

> DCHECK was hit while executing e2e test TestQueries::test_subquery
> --
>
> Key: IMPALA-10055
> URL: https://issues.apache.org/jira/browse/IMPALA-10055
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Zoltán Borók-Nagy
>Priority: Blocker
>  Labels: broken-build, crash, flaky
> Fix For: Impala 4.0
>
>
> A DCHECK was hit while executing an e2e test. The time frame suggests that it 
> possibly happened while executing TestQueries::test_subquery:
> {code}
> query_test/test_queries.py:149: in test_subquery
> self.run_test_case('QueryTest/subquery', vector)
> common/impala_test_suite.py:662: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:600: in __exec_in_impala
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:909: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:334: in execute
> r = self.__fetch_results(handle, profile_format=profile_format)
> common/impala_connection.py:436: in __fetch_results
> result_tuples = cursor.fetchall()
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:532:
>  in fetchall
> self._wait_to_finish()
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:405:
>  in _wait_to_finish
> resp = self._last_operation._rpc('GetOperationStatus', req)
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:992:
>  in _rpc
> response = self._execute(func_name, request)
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:1023:
>  in _execute
> .format(self.retries))
> E   HiveServer2Error: Failed after retrying 3 times
> {code}
> impalad log:
> {code}
> Log file created at: 2020/08/05 17:34:30
> Running on machine: 
> impala-ec2-centos74-m5-4xlarge-ondemand-18a5.vpc.cloudera.com
> Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
> F0805 17:34:30.003247 10887 orc-column-readers.cc:423] 
> c34e87376f496a53:7ba6a2e40002] Check failed: (scanner_->row_batches_nee
> d_validation_ && scanner_->scan_node_->IsZeroSlotTableScan()) || 
> scanner_->acid_original_file
> {code}
> Stack trace:
> {code}
> CORE: ./fe/core.1596674070.14179.impalad
> BINARY: ./be/build/latest/service/impalad
> Core was generated by 
> `/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/build/lat'.
> Program terminated with signal SIGABRT, Aborted.
> #0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
> To enable execution of this file add
>   add-auto-load-safe-path 
> /data0/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib64/libstdc++.so.6.0.24-gdb.py
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> To completely disable this security protection add
>   set auto-load safe-path /
> line to your configuration file "/var/lib/jenkins/.gdbinit".
> For more information about this security protection see the
> "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
>   info "(gdb)Auto-loading safe path"
> #0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
> #1  0x7efd6ec6f8e8 in abort () from /lib64/libc.so.6
> #2  0x086b8ea4 in google::DumpStackTraceAndExit() ()
> #3  0x086ae25d in google::LogMessage::Fail() ()
> #4  0x086afb4d in google::LogMessage::SendToLog() ()
> #5  0x086adbbb in google::LogMessage::Flush() ()
> #6  0x086b17b9 in google::LogMessageFatal::~LogMessageFatal() ()
> #7  0x0388e10a in impala::OrcStructReader::TopLevelReadValueBatch 
> (this=0x61162630, scratch_batch=0x824831e0, pool=0x82483258) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/orc-column-readers.cc:421
> #8  0x03810c92 in impala::HdfsOrcScanner::TransferTuples 
> (this=0x27143c00, dst_batch=0x2e5ca820) at 
> /data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:808
> #9  0x03814e2a in impala::HdfsOrcScanner::AssembleRows 
> (this=0x27143c00, 

[jira] [Created] (IMPALA-10055) DCHECK was hit while executing e2e test TestQueries::test_subquery

2020-08-06 Thread Attila Jeges (Jira)
Attila Jeges created IMPALA-10055:
-

 Summary: DCHECK was hit while executing e2e test 
TestQueries::test_subquery
 Key: IMPALA-10055
 URL: https://issues.apache.org/jira/browse/IMPALA-10055
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 4.0
Reporter: Attila Jeges
Assignee: Zoltán Borók-Nagy
 Fix For: Impala 4.0


A DCHECK was hit while executing an e2e test. The time frame suggests that it 
possibly happened while executing TestQueries::test_subquery:

{code}
query_test/test_queries.py:149: in test_subquery
self.run_test_case('QueryTest/subquery', vector)
common/impala_test_suite.py:662: in run_test_case
result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
common/impala_test_suite.py:600: in __exec_in_impala
result = self.__execute_query(target_impalad_client, query, user=user)
common/impala_test_suite.py:909: in __execute_query
return impalad_client.execute(query, user=user)
common/impala_connection.py:334: in execute
r = self.__fetch_results(handle, profile_format=profile_format)
common/impala_connection.py:436: in __fetch_results
result_tuples = cursor.fetchall()
/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:532:
 in fetchall
self._wait_to_finish()
/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:405:
 in _wait_to_finish
resp = self._last_operation._rpc('GetOperationStatus', req)
/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:992:
 in _rpc
response = self._execute(func_name, request)
/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/infra/python/env-gcc7.5.0/lib/python2.7/site-packages/impala/hiveserver2.py:1023:
 in _execute
.format(self.retries))
E   HiveServer2Error: Failed after retrying 3 times
{code}

impalad log:
{code}
Log file created at: 2020/08/05 17:34:30
Running on machine: 
impala-ec2-centos74-m5-4xlarge-ondemand-18a5.vpc.cloudera.com
Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
F0805 17:34:30.003247 10887 orc-column-readers.cc:423] 
c34e87376f496a53:7ba6a2e40002] Check failed: (scanner_->row_batches_nee
d_validation_ && scanner_->scan_node_->IsZeroSlotTableScan()) || 
scanner_->acid_original_file
{code}

Stack trace:
{code}
CORE: ./fe/core.1596674070.14179.impalad
BINARY: ./be/build/latest/service/impalad
Core was generated by 
`/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/build/lat'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
To enable execution of this file add
add-auto-load-safe-path 
/data0/jenkins/workspace/impala-cdpd-master-core-ubsan/Impala-Toolchain/toolchain-packages-gcc7.5.0/gcc-7.5.0/lib64/libstdc++.so.6.0.24-gdb.py
line to your configuration file "/var/lib/jenkins/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/var/lib/jenkins/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
info "(gdb)Auto-loading safe path"
#0  0x7efd6ec6e1f7 in raise () from /lib64/libc.so.6
#1  0x7efd6ec6f8e8 in abort () from /lib64/libc.so.6
#2  0x086b8ea4 in google::DumpStackTraceAndExit() ()
#3  0x086ae25d in google::LogMessage::Fail() ()
#4  0x086afb4d in google::LogMessage::SendToLog() ()
#5  0x086adbbb in google::LogMessage::Flush() ()
#6  0x086b17b9 in google::LogMessageFatal::~LogMessageFatal() ()
#7  0x0388e10a in impala::OrcStructReader::TopLevelReadValueBatch 
(this=0x61162630, scratch_batch=0x824831e0, pool=0x82483258) at 
/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/orc-column-readers.cc:421
#8  0x03810c92 in impala::HdfsOrcScanner::TransferTuples 
(this=0x27143c00, dst_batch=0x2e5ca820) at 
/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:808
#9  0x03814e2a in impala::HdfsOrcScanner::AssembleRows 
(this=0x27143c00, row_batch=0x2e5ca820) at 
/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:785
#10 0x0380f5fe in impala::HdfsOrcScanner::GetNextInternal 
(this=0x27143c00, row_batch=0x2e5ca820) at 
/data/jenkins/workspace/impala-cdpd-master-core-ubsan/repos/Impala/be/src/exec/hdfs-orc-scanner.cc:654
#11 0x0380c2bb in impala::HdfsOrcScanner::ProcessSplit 
(this=0x27143c00) at 

[jira] [Resolved] (IMPALA-9969) TestParquetStats.test_page_index seems flaky

2020-08-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-9969.
---
Resolution: Cannot Reproduce

Closing this since it was due to machine slowness, and it hasn't occurred in a 
while.

> TestParquetStats.test_page_index seems flaky
> 
>
> Key: IMPALA-9969
> URL: https://issues.apache.org/jira/browse/IMPALA-9969
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Fang-Yu Rao
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: flaky
>
> [TestParquetStats.test_page_index|https://github.com/apache/impala/blob/master/tests/query_test/test_parquet_stats.py#L77-L99]
>  timed out in a recent build. Since [~boroknagyz] authored this function 
> (IMPALA-5843), maybe [~boroknagyz] could offer some insight into it? Thanks!
> The error message is provided below.
> {code:java}
> query_test/test_parquet_stats.py:91: in test_page_index unique_database) 
> common/impala_test_suite.py:662: in run_test_case result = exec_fn(query, 
> user=test_section.get('USER', '').strip() or None) 
> common/impala_test_suite.py:600: in __exec_in_impala result = 
> self.__execute_query(target_impalad_client, query, user=user) 
> common/impala_test_suite.py:909: in __execute_query return 
> impalad_client.execute(query, user=user) common/impala_connection.py:205: in 
> execute return self.__beeswax_client.execute(sql_stmt, user=user) 
> beeswax/impala_beeswax.py:187: in execute handle = 
> self.__execute_query(query_string.strip(), user=user) 
> beeswax/impala_beeswax.py:365: in __execute_query 
> self.wait_for_finished(handle) beeswax/impala_beeswax.py:389: in 
> wait_for_finished time.sleep(0.05) E   Failed: Timeout >14400s
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-10054) test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly

2020-08-06 Thread Attila Jeges (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Jeges updated IMPALA-10054:
--
Labels: broken-build  (was: )

> test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly
> ---
>
> Key: IMPALA-10054
> URL: https://issues.apache.org/jira/browse/IMPALA-10054
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 4.0
>Reporter: Attila Jeges
>Assignee: Riza Suminto
>Priority: Blocker
>  Labels: broken-build
> Fix For: Impala 4.0
>
>
> test_multiple_sort_run_bytes_limits, introduced in IMPALA-6692, seems to be 
> flaky.
> Jenkins job that triggered the error:
> https://jenkins.impala.io/job/parallel-all-tests-nightly/1173
> Failing job:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2899/testReport/
> {code}
> Stacktrace
> query_test/test_sort.py:89: in test_multiple_sort_run_bytes_limits
> assert "SpilledRuns: " + spilled_runs in query_result.runtime_profile
> E   assert ('SpilledRuns: ' + '3') in 'Query 
> (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE WARNING: Query profile 
> created while running a DEBUG buil... 27.999ms\n - WriteIoBytes: 
> 0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 
> 0.000ns\n'
> E+  where 'Query (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE 
> WARNING: Query profile created while running a DEBUG buil... 27.999ms\n   
>   - WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - 
> WriteIoWaitTime: 0.000ns\n' = 
>  0x7f51da77fb50>.runtime_profile
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10054) test_multiple_sort_run_bytes_limits fails in parallel-all-tests-nightly

2020-08-06 Thread Attila Jeges (Jira)
Attila Jeges created IMPALA-10054:
-

 Summary: test_multiple_sort_run_bytes_limits fails in 
parallel-all-tests-nightly
 Key: IMPALA-10054
 URL: https://issues.apache.org/jira/browse/IMPALA-10054
 Project: IMPALA
  Issue Type: Improvement
Affects Versions: Impala 4.0
Reporter: Attila Jeges
Assignee: Riza Suminto
 Fix For: Impala 4.0


test_multiple_sort_run_bytes_limits, introduced in IMPALA-6692, seems to be 
flaky.

Jenkins job that triggered the error:
https://jenkins.impala.io/job/parallel-all-tests-nightly/1173

Failing job:
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/2899/testReport/

{code}
Stacktrace
query_test/test_sort.py:89: in test_multiple_sort_run_bytes_limits
assert "SpilledRuns: " + spilled_runs in query_result.runtime_profile
E   assert ('SpilledRuns: ' + '3') in 'Query 
(id=404da0b1e56e7248:120789cd):\n  DEBUG MODE WARNING: Query profile 
created while running a DEBUG buil... 27.999ms\n - WriteIoBytes: 
0\n - WriteIoOps: 0 (0)\n - WriteIoWaitTime: 0.000ns\n'
E+  where 'Query (id=404da0b1e56e7248:120789cd):\n  DEBUG MODE 
WARNING: Query profile created while running a DEBUG buil... 27.999ms\n 
- WriteIoBytes: 0\n - WriteIoOps: 0 (0)\n - 
WriteIoWaitTime: 0.000ns\n' = .runtime_profile
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-10051) impala-shell exits with ValueError with WITH clauses

2020-08-06 Thread Tamas Mate (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172327#comment-17172327
 ] 

Tamas Mate edited comment on IMPALA-10051 at 8/6/20, 12:44 PM:
---

In the ImpalaShell.do_with method shlex.split is used to tokenize the query so 
that it can be decided whether the query is DML or not.

However, tokens are bound by whitespace in shlex by default. Therefore, 
whitespace in a query string can close a token prematurely, before the final 
quote is evaluated, which triggers the 'No closing quotation' exception.

In this example the space after the + sign causes the issue.
{code:java}
impala-shell.sh -q 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");' 
{code}
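
For reference, the failure reproduces with plain shlex, independent of 
impala-shell; a minimal standard-library sketch:
{code:python}
import shlex

ok = 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'
bad = 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'

# shlex.split() sets whitespace_split=True, so in the second query the token
# regexp_replace(column_name,"[a-zA-Z]","+  ends at the space inside "+ ",
# and the leftover ");  starts with a quote that is never closed.
print(shlex.split(ok, posix=False))   # tokenizes fine
try:
    shlex.split(bad, posix=False)
except ValueError as e:
    print(e)                          # -> No closing quotation
{code}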


was (Author: tmate):
In the ImpalaShell.do_with method shlex.split is used to tokenize the query so 
that it can be decided whether the query is DML or not.

However, tokens are bound by whitespace in shlex by default. Therefore, 
whitespace in a query string can close a token prematurely, before the final 
quote is evaluated, which triggers the 'No closing quotation' exception.

In this example the space after the + sign causes the issue.
impala-shell.sh -q 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'
 

> impala-shell exits with ValueError with WITH clauses
> 
>
> Key: IMPALA-10051
> URL: https://issues.apache.org/jira/browse/IMPALA-10051
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: Tamas Mate
>Assignee: Tamas Mate
>Priority: Major
>
> Some strings can cause shlex to throw an exception in WITH clauses, for 
> example in a regexp_replace. This should be handled more gracefully and 
> correctly.
> Working query (impala-shell forwards the query for analysis):
> {code:java}
> impala-shell.sh -q 'with select regexp_replace(column_name, "[a-zA-Z]", "+ 
> ");'
> {code}
> While the same query fails with a ValueError when the spaces are removed from 
> the arguments of regexp_replace:
> {code:java}
> tmate@tmate-box:~/Projects/Impala$ impala-shell.sh -q 'with select 
> regexp_replace(column_name,"[a-zA-Z]","+ ");'
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is 
> being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 
> b29cb4ca82a4f05ea7dc0eadc330a64fbe685ef0)
> Traceback (most recent call last):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1973, in 
> 
> impala_shell_main()
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1927, in 
> impala_shell_main
> if execute_queries_non_interactive_mode(options, query_options):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1731, in 
> execute_queries_non_interactive_mode
> shell.execute_query_list(queries))
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1564, in 
> execute_query_list
> if self.onecmd(q) is CmdStatus.ERROR:
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 675, in 
> onecmd
> return func(arg)
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1276, in 
> do_with
> tokens = shlex.split(strip_comments(query.lstrip()), posix=False)
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 279, in split
> return list(lex)
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 269, in next
> token = self.get_token()
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 96, in get_token
> raw = self.read_token()
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 172, in read_token
> raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10051) impala-shell exits with ValueError with WITH clauses

2020-08-06 Thread Tamas Mate (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172327#comment-17172327
 ] 

Tamas Mate commented on IMPALA-10051:
-

In the ImpalaShell.do_with method shlex.split is used to tokenize the query so 
that it can be decided whether the query is DML or not.

However, tokens are bound by whitespace in shlex by default. Therefore, 
whitespace in a query string can close a token prematurely, before the final 
quote is evaluated, which triggers the 'No closing quotation' exception.

In this example the space after the + sign causes the issue.
impala-shell.sh -q 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'
 

> impala-shell exits with ValueError with WITH clauses
> 
>
> Key: IMPALA-10051
> URL: https://issues.apache.org/jira/browse/IMPALA-10051
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: Tamas Mate
>Assignee: Tamas Mate
>Priority: Major
>
> Some strings can cause shlex to throw an exception in WITH clauses, for 
> example in a regexp_replace. This should be handled more gracefully and 
> correctly.
> Working query (impala-shell forwards the query for analysis):
> {code:java}
> impala-shell.sh -q 'with select regexp_replace(column_name, "[a-zA-Z]", "+ 
> ");'
> {code}
> While the same query fails with a ValueError when the spaces are removed from 
> the arguments of regexp_replace:
> {code:java}
> tmate@tmate-box:~/Projects/Impala$ impala-shell.sh -q 'with select 
> regexp_replace(column_name,"[a-zA-Z]","+ ");'
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is 
> being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 
> b29cb4ca82a4f05ea7dc0eadc330a64fbe685ef0)
> Traceback (most recent call last):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1973, in 
> 
> impala_shell_main()
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1927, in 
> impala_shell_main
> if execute_queries_non_interactive_mode(options, query_options):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1731, in 
> execute_queries_non_interactive_mode
> shell.execute_query_list(queries))
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1564, in 
> execute_query_list
> if self.onecmd(q) is CmdStatus.ERROR:
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 675, in 
> onecmd
> return func(arg)
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1276, in 
> do_with
> tokens = shlex.split(strip_comments(query.lstrip()), posix=False)
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 279, in split
> return list(lex)
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 269, in next
> token = self.get_token()
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 96, in get_token
> raw = self.read_token()
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 172, in read_token
> raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org