[jira] [Commented] (IMPALA-8849) IllegalStateException in HashJoinNode because of missing memory estimate

2019-08-15 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908144#comment-16908144
 ] 

ASF subversion and git services commented on IMPALA-8849:
-

Commit c3a67b67faaebf18735fa36d35c80d8c11043f7f in impala's branch 
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c3a67b6 ]

IMPALA-8849: fix IllegalStateException with VARCHAR

The bug is that the serialized size wasn't populated
for VARCHAR in a case when it should have been.
It appears a condition was simply not updated when
VARCHAR was added.

Other code assumed that the serialized size was
populated when the other size field was populated,
which is a reasonable invariant. I documented the
invariant in the class and added validation that
the invariant held.

Defining and checking invariants led to discovering
various other minor issues where the sizes were
set incorrectly for fixed-length types or not set for
variable-length types:
* CHAR was not consistently treated as a fixed-length type.
* avgSerializedSize_ was not always updated with avgSize_.
Testing:
Added a regression test for this specific case. Adding
the assertions caused other test cases to expose
related bugs.

Change-Id: Ie45e386cb09e31f4b7cdc82b7734dbecb4464534
Reviewed-on: http://gerrit.cloudera.org:8080/14062
Tested-by: Impala Public Jenkins 
Reviewed-by: Csaba Ringhofer 
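
The invariant described in the commit message (the serialized size is populated whenever the other size field is) can be illustrated with a minimal Java sketch. The class ColumnSizeStats, its fields, and the slotBytes parameter are hypothetical names chosen for illustration, not Impala's actual code; the assumption that the serialized size is the slot size plus the average size is also labeled in the code. Only the -1 "unknown" sentinel and the idea of validating both fields together come from this thread.

```java
/** Hypothetical stand-in for a column-stats holder; not Impala's actual class. */
public class ColumnSizeStats {
    /** Sentinel meaning "unknown", as used in the thread. */
    public static final double UNKNOWN = -1;

    private double avgSize = UNKNOWN;
    private double avgSerializedSize = UNKNOWN;

    /**
     * Updates both sizes in one place so they cannot drift apart.
     * Assumption: serialized size = slot size + average size.
     */
    public void setAvgSize(double newAvgSize, int slotBytes) {
        if (newAvgSize < 0) {
            avgSize = UNKNOWN;
            avgSerializedSize = UNKNOWN;
        } else {
            avgSize = newAvgSize;
            avgSerializedSize = slotBytes + newAvgSize;
        }
        validate();
    }

    /** True when both sizes are known or both are unknown. */
    public static boolean invariantHolds(double avgSize, double avgSerializedSize) {
        return (avgSize >= 0) == (avgSerializedSize >= 0);
    }

    /** Fails fast if one size is populated without the other. */
    public void validate() {
        if (!invariantHolds(avgSize, avgSerializedSize)) {
            throw new IllegalStateException(
                "avgSize and avgSerializedSize must be set together");
        }
    }

    public double getAvgSize() { return avgSize; }

    public double getAvgSerializedSize() { return avgSerializedSize; }
}
```

Routing every update through one setter is one way to make the "updated together" rule hard to violate; the explicit validate() call mirrors the added validation mentioned above.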


> IllegalStateException in HashJoinNode because of missing memory estimate
> 
>
>                 Key: IMPALA-8849
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8849
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.3.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Blocker
>             Fix For: Impala 3.3.0
>
>         Attachments: query4.sql
>
>
> [~grahn] reports the below error when running the attached TPC-DS query on 
> uncompressed text with stats.
> {noformat}
> I0808 03:27:16.621085 150090 jni-util.cc:288] 8945e9f76f33dca0:8bf31c47] java.lang.IllegalStateException: Mem estimate must be set
>     at com.google.common.base.Preconditions.checkState(Preconditions.java:149)
>     at org.apache.impala.planner.ResourceProfileBuilder.build(ResourceProfileBuilder.java:73)
>     at org.apache.impala.planner.HashJoinNode.computeNodeResourceProfile(HashJoinNode.java:252)
>     at org.apache.impala.planner.PlanFragment.computeResourceProfile(PlanFragment.java:238)
>     at org.apache.impala.planner.Planner.computeResourceReqs(Planner.java:416)
>     at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1162)
>     at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:1477)
>     at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1341)
>     at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1236)
>     at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1206)
>     at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:154)
> I0808 03:27:16.621861 150090 status.cc:124] 8945e9f76f33dca0:8bf31c47] IllegalStateException: Mem estimate must be set
>     @       0xb9af99
>     @      0x11f785e
>     @      0x10ab9a3
>     @      0x10d0f54
>     @      0x10e2a2c
>     @      0x1125f8d
>     @      0x1458c54
>     @      0x145810c
>     @       0xb68769
>     @       0xf91ca0
>     @       0xf8791e
>     @       0xf887b1
>     @      0x127cd2f
>     @      0x127d8d9
>     @      0x1ac0709
>     @ 0x7f11ea4d7dd4
>     @ 0x7f11e6f1e02c
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8849) IllegalStateException in HashJoinNode because of missing memory estimate

2019-08-08 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903356#comment-16903356
 ] 

Tim Armstrong commented on IMPALA-8849:
---

Reverting this commit fixed the issue for me:

commit c2516d220da8e532b6ebdb6f3a12e7ad97c4f597
Author: Csaba Ringhofer
Date:   Wed Apr 17 15:01:07 2019 +0200

    IMPALA-8409: Fix row-size for STRING columns with unknown stats

    Explain returned row-size=11B for STRING columns without statistics.
    The issue was caused by adding -1 (meaning unknown) to the 12 byte
    slot size (sizeof(StringValue)). The code in TupleDescriptor.java
    tried to handle this by checking if the size is -1, but it was
    already 11 at this point.

    There is more potential for cleanup, but I wanted to keep this
    change minimal.

    Testing:
    - revived some tests in CatalogTest.java that were removed
      in 2013 due to flakiness
    - added an EE test that checks row size with and without stats
    - fixed a similar test, test_explain_validate_cardinality_estimates
      (the format of the line it looks for has changed, which led to
      skipping the actual verification and accepting everything)
    - ran core FE and EE tests

    Change-Id: I866acf10b2c011a735dee019f4bc29358f2ec4e5
    Reviewed-on: http://gerrit.cloudera.org:8080/13190
    Reviewed-by: Impala Public Jenkins
    Tested-by: Impala Public Jenkins
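
The arithmetic behind the 11B figure in the IMPALA-8409 message can be reproduced in a few lines. This is only a sketch, not the code in TupleDescriptor.java: the class and method names are hypothetical, and the guarded variant shows one possible way to keep the sentinel from leaking into a sum, not necessarily the change that was committed.

```java
/** Hypothetical illustration of a -1 "unknown" sentinel leaking into arithmetic. */
public class RowSizeSentinel {
    /** Sentinel meaning "unknown", per the commit message. */
    static final int UNKNOWN = -1;
    /** 12-byte slot size (sizeof(StringValue)), per the commit message. */
    static final int STRING_SLOT_BYTES = 12;

    /** Naive version: adds the sentinel to the slot size, so -1 becomes 11. */
    static int naiveSize(int avgSize) {
        return STRING_SLOT_BYTES + avgSize;
    }

    /** Guarded version: propagates "unknown" instead of adding it. */
    static int guardedSize(int avgSize) {
        return avgSize == UNKNOWN ? UNKNOWN : STRING_SLOT_BYTES + avgSize;
    }

    /** A later "is it unknown?" check can only succeed if the sentinel survived. */
    static boolean isUnknown(int size) {
        return size == UNKNOWN;
    }

    public static void main(String[] args) {
        int naive = naiveSize(UNKNOWN);
        int guarded = guardedSize(UNKNOWN);
        System.out.println("naive=" + naive + " unknown=" + isUnknown(naive));
        System.out.println("guarded=" + guarded + " unknown=" + isUnknown(guarded));
    }
}
```

With the naive version the later check for -1 never fires because the value is already 11, which matches the behaviour described in the commit message.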





[jira] [Commented] (IMPALA-8849) IllegalStateException in HashJoinNode because of missing memory estimate

2019-08-08 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903351#comment-16903351
 ] 

Tim Armstrong commented on IMPALA-8849:
---

So it looks like it only happens when stats are computed and the column is a 
VARCHAR.
{noformat}
[localhost:21000] tpcds> use tpcds_1_text;
Query: use tpcds_1_text
[localhost:21000] tpcds_1_text> explain select c_customer_id from tpcds_1_text.customer;
Query: explain select c_customer_id from tpcds_1_text.customer
+------------------------------------------------------------+
| Explain String                                             |
+------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=8.00MB Threads=3 |
| Per-Host Resource Estimates: Memory=48MB                   |
|                                                            |
| PLAN-ROOT SINK                                             |
| |                                                          |
| 01:EXCHANGE [UNPARTITIONED]                                |
| |                                                          |
| 00:SCAN HDFS [tpcds_1_text.customer]                       |
|    HDFS partitions=1/1 files=1 size=12.50MB                |
|    row-size=-1B cardinality=100.00K                        |
+------------------------------------------------------------+
{noformat}

Here's a repro in my dev env:
{noformat}
[localhost:21000] tpcds> create external table repro_cust(c_customer_sk INT, c_customer_id VARCHAR(16)) row format delimited fields terminated by '|' WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')  STORED AS TEXTFILE LOCATION 'hdfs://localhost:20500/test-warehouse/tpcds.customer';
Query: create external table repro_cust(c_customer_sk INT, c_customer_id VARCHAR(16)) row format delimited fields terminated by '|' WITH SERDEPROPERTIES ('field.delim'='|', 'serialization.format'='|')  STORED AS TEXTFILE LOCATION 'hdfs://localhost:20500/test-warehouse/tpcds.customer'
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.39s
[localhost:21000] tpcds> compute stats repro_cust;
Query: compute stats repro_cust
+-----------------------------------------+
| summary                                 |
+-----------------------------------------+
| Updated 1 partition(s) and 2 column(s). |
+-----------------------------------------+
Fetched 1 row(s) in 3.62s
[localhost:21000] tpcds> explain select c_customer_id from repro_cust;
Query: explain select c_customer_id from repro_cust
+------------------------------------------------------------+
| Explain String                                             |
+------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=8.00MB Threads=3 |
| Per-Host Resource Estimates: Memory=48MB                   |
|                                                            |
| PLAN-ROOT SINK                                             |
| |                                                          |
| 01:EXCHANGE [UNPARTITIONED]                                |
| |                                                          |
| 00:SCAN HDFS [tpcds.repro_cust]                            |
|    HDFS partitions=1/1 files=1 size=12.60MB                |
|    row-size=-1B cardinality=100.00K                        |
+------------------------------------------------------------+
Fetched 10 row(s) in 0.01s
{noformat}


[jira] [Commented] (IMPALA-8849) IllegalStateException in HashJoinNode because of missing memory estimate

2019-08-08 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903342#comment-16903342
 ] 

Tim Armstrong commented on IMPALA-8849:
---

{noformat}
[nightly6x-unsecure-1.nightly6x-unsecure.root.hwx.site:21000] default> describe tpcds.customer;
Query: describe tpcds.customer
+------------------------+-------------+---------+
| name                   | type        | comment |
+------------------------+-------------+---------+
| c_customer_sk          | int         |         |
| c_customer_id          | varchar(16) |         |
| c_current_cdemo_sk     | int         |         |
| c_current_hdemo_sk     | int         |         |
| c_current_addr_sk      | int         |         |
| c_first_shipto_date_sk | int         |         |
| c_first_sales_date_sk  | int         |         |
| c_salutation           | varchar(10) |         |
| c_first_name           | varchar(20) |         |
| c_last_name            | varchar(30) |         |
| c_preferred_cust_flag  | varchar(1)  |         |
| c_birth_day            | int         |         |
| c_birth_month          | int         |         |
| c_birth_year           | int         |         |
| c_birth_country        | varchar(20) |         |
| c_login                | varchar(13) |         |
| c_email_address        | varchar(50) |         |
| c_last_review_date_sk  | int         |         |
+------------------------+-------------+---------+
Fetched 18 row(s) in 0.01s
[nightly6x-unsecure-1.nightly6x-unsecure.root.hwx.site:21000] default> explain select c_customer_id from tpcds.customer;
Query: explain select c_customer_id from tpcds.customer
+------------------------------------------------------------+
| Explain String                                             |
+------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=8.00MB Threads=3 |
| Per-Host Resource Estimates: Memory=48MB                   |
|                                                            |
| PLAN-ROOT SINK                                             |
| |                                                          |
| 01:EXCHANGE [UNPARTITIONED]                                |
| |                                                          |
| 00:SCAN HDFS [tpcds.customer]                              |
|    HDFS partitions=1/1 files=1 size=12.50MB                |
|    row-size=-1B cardinality=100.00K                        |
+------------------------------------------------------------+
Fetched 10 row(s) in 0.02s
[nightly6x-unsecure-1.nightly6x-unsecure.root.hwx.site:21000] default> show column stats tpcds.customer;
Query: show column stats tpcds.customer
+------------------------+-------------+------------------+--------+----------+-------------------+
| Column                 | Type        | #Distinct Values | #Nulls | Max Size | Avg Size          |
+------------------------+-------------+------------------+--------+----------+-------------------+
| c_customer_sk          | INT         | 10               | 0      | 4        | 4                 |
| c_customer_id          | VARCHAR(16) | 10               | 0      | 16       | 16                |
| c_current_cdemo_sk     | INT         | 91558            | 3438   | 4        | 4                 |
| c_current_hdemo_sk     | INT         | 7376             | 3431   | 4        | 4                 |
| c_current_addr_sk      | INT         | 42003            | 0      | 4        | 4                 |
| c_first_shipto_date_sk | INT         | 3754             | 3443   | 4        | 4                 |
| c_first_sales_date_sk  | INT         | 3734             | 3518   | 4        | 4                 |
| c_salutation           | VARCHAR(10) | 6                | 3410   | 4        | 3.24143433228     |
| c_first_name           | VARCHAR(20) | 4013             | 3492   | 11       | 5.839499950408936 |
| c_last_name            | VARCHAR(30) | 4951             | 3497   | 13       | 6.124800205230713 |
| c_preferred_cust_flag  | VARCHAR(1)  | 2                | 3426   | 1        | 1                 |
| c_birth_day            | INT         | 31               | 3461   | 4        | 4                 |
| c_birth_month          | INT         | 12               | 3449   | 4        | 4                 |
| c_birth_year           | INT         | 67               | 3453   | 4        | 4                 |
| c_birth_country        | VARCHAR(20) | 207              | 3439   | 20       | 8.708800315856934 |
| c_login                | VARCHAR(13) | 0                | 10     | 0        | 0                 |
| c_email_address        | VARCHAR(50) | 10               | 3521   | 46       | 27.45219993591309 |
| c_last_review_date_sk  | INT         | 350              | 3484   | 4        | 4                 |
+------------------------+-------------+------------------+--------+----------+-------------------+
{noformat}


[jira] [Commented] (IMPALA-8849) IllegalStateException in HashJoinNode because of missing memory estimate

2019-08-08 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16903290#comment-16903290
 ] 

Tim Armstrong commented on IMPALA-8849:
---

Looks like HdfsScanNode can produce a negative row size estimate, which would 
explain the negative memory estimate.
{noformat}
I0808 20:25:37.342036 110162 DistributedPlanner.java:498] 114bef3456667eb2:458377fe] 36:SCAN HDFS [tpcds.customer, RANDOM]
   HDFS partitions=1/1 files=1 size=12.50MB
   stored statistics:
     table: rows=100.00K size=12.50MB
     columns: all
   extrapolated-rows=disabled max-scan-range-rows=100.00K
   mem-estimate=invalid mem-reservation=invalid thread-reservation=invalid
   tuple-ids=78 row-size=-3B cardinality=100.00K
   in pipelines: 
{noformat}
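
A minimal sketch of how a negative row size could surface as the "Mem estimate must be set" failure from the original report. The class, the method names, and the cardinality-times-row-size formula are assumptions made for illustration; this thread does not show how HashJoinNode or ResourceProfileBuilder actually compute or check the estimate, only that the row size was negative and the check failed.

```java
/** Hypothetical sketch; names and formula are assumptions, not Impala's code. */
public class MemEstimateSketch {
    /**
     * Assumed estimate: rows times bytes per row. A negative row size
     * therefore yields a negative estimate.
     */
    static long estimateBytes(long cardinality, long rowSizeBytes) {
        return Math.multiplyExact(cardinality, rowSizeBytes);
    }

    /**
     * Assumed precondition in the style of the failing check: a negative
     * estimate is treated as "not set".
     */
    static long build(long memEstimateBytes) {
        if (memEstimateBytes < 0) {
            throw new IllegalStateException("Mem estimate must be set");
        }
        return memEstimateBytes;
    }

    public static void main(String[] args) {
        // Values taken from the scan node above: cardinality=100.00K, row-size=-3B.
        long estimate = estimateBytes(100_000L, -3L);
        try {
            build(estimate);
        } catch (IllegalStateException e) {
            System.out.println("estimate=" + estimate + " -> " + e.getMessage());
        }
    }
}
```

Under these assumptions the check rejects the estimate long after the bad value was produced, which is why validating the sizes where they are set is more useful than relying on the downstream precondition.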
