[jira] [Resolved] (IMPALA-9351) AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to non-existing path

2020-09-23 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-9351.

 Fix Version/s: Impala 4.0
Target Version: Impala 3.4.0, Impala 4.0  (was: Impala 3.4.0)
Resolution: Fixed

Closing for now. We can re-open if the issue occurs again.

> AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to non-existing path
> -
>
> Key: IMPALA-9351
> URL: https://issues.apache.org/jira/browse/IMPALA-9351
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Fang-Yu Rao
>Assignee: Quanlong Huang
>Priority: Blocker
>  Labels: broken-build, flaky-test
> Fix For: Impala 4.0, Impala 3.4.0
>
>
> AnalyzeDDLTest.TestCreateTableLikeFileOrc failed due to a non-existing path. 
> Specifically, we see the following error message.
> {code:java}
> Error Message
> Error during analysis:
> org.apache.impala.common.AnalysisException: Cannot infer schema, path does 
> not exist: 
> hdfs://localhost:20500/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0
> sql:
> create table if not exists newtbl_DNE like orc 
> '/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0'
> {code}
> The stack trace is provided in the following.
> {code:java}
> Stacktrace
> java.lang.AssertionError: 
> Error during analysis:
> org.apache.impala.common.AnalysisException: Cannot infer schema, path does 
> not exist: 
> hdfs://localhost:20500/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0
> sql:
> create table if not exists newtbl_DNE like orc 
> '/test-warehouse/functional_orc_def.db/complextypes_fileformat/00_0'
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.impala.common.FrontendFixture.analyzeStmt(FrontendFixture.java:397)
>   at 
> org.apache.impala.common.FrontendTestBase.AnalyzesOk(FrontendTestBase.java:244)
>   at 
> org.apache.impala.common.FrontendTestBase.AnalyzesOk(FrontendTestBase.java:185)
>   at 
> org.apache.impala.analysis.AnalyzeDDLTest.TestCreateTableLikeFileOrc(AnalyzeDDLTest.java:2045)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:236)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:386)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:323)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:143)
> {code}
> This test was recently added by [~norbertluksa], and [~boroknagyz] gave it a +2. 
> Maybe [~boroknagyz] could provide some insight into this? Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IMPALA-10186) Write invalid PageLocations into sort by parquet table

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Summary: Write invalid PageLocations into sort by parquet table  (was: 
Writing invalid PageLocations into sort by parquet table)

> Write invalid PageLocations into sort by parquet table
> --
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
>





-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-10186) Writing invalid PageLocations on table

2020-09-23 Thread guojingfeng (Jira)
guojingfeng created IMPALA-10186:


 Summary: Writing invalid PageLocations on table
 Key: IMPALA-10186
 URL: https://issues.apache.org/jira/browse/IMPALA-10186
 Project: IMPALA
  Issue Type: Bug
Reporter: guojingfeng









[jira] [Updated] (IMPALA-10186) Writing invalid PageLocations into sort by parquet table

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Summary: Writing invalid PageLocations into sort by parquet table  (was: 
Writing invalid PageLocations on table)

> Writing invalid PageLocations into sort by parquet table
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
>







[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Summary: Write invalid parquet PageLocations which table sort by some 
columns  (was: Write invalid PageLocations into sort by parquet table)

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
>







[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Description: 
The current Parquet writer writes -1 for PageLocation.offset and 
PageLocation.first_row_index when it encounters an empty page.

!image-2020-09-23-17-53-23-696.png|width=1475,height=416!

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
> Attachments: image-2020-09-23-17-53-23-696.png
>
>
> The current Parquet writer writes -1 for PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page.
> !image-2020-09-23-17-53-23-696.png|width=1475,height=416!






[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Attachment: image-2020-09-23-17-53-23-696.png

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
> Attachments: image-2020-09-23-17-53-23-696.png
>
>







[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Description: 
The current Parquet writer writes -1 for PageLocation.offset and 
PageLocation.first_row_index when it encounters an empty page. 

 

!image-2020-09-23-17-53-23-696.png|width=1475,height=416!

  was:
The current Parquet writer writes -1 for PageLocation.offset and 
PageLocation.first_row_index when it encounters an empty page.

!image-2020-09-23-17-53-23-696.png|width=1475,height=416!


> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
> Attachments: image-2020-09-23-17-53-23-696.png
>
>
> The current Parquet writer writes -1 for PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page. 
>  
> !image-2020-09-23-17-53-23-696.png|width=1475,height=416!






[jira] [Created] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Jira
Zoltán Borók-Nagy created IMPALA-10187:
--

 Summary: Event processing fails on multiple events + DROP TABLE
 Key: IMPALA-10187
 URL: https://issues.apache.org/jira/browse/IMPALA-10187
 Project: IMPALA
  Issue Type: Bug
Reporter: Zoltán Borók-Nagy


I've seen the following during interop testing:

Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
table.

Then CatalogD's event processor tried to refresh the table in its cache:
{noformat}
I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block metadata 
for 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
taken: 55.145ms
I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 (303ms)
I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
events. Start event id : 39948
I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 EventType: 
ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.022769 10065 MetastoreEvents.java:380] EventId: 39950 EventType: 
ALTER_PARTITION Creating event 39950 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.023175 10065 MetastoreEvents.java:380] EventId: 39951 EventType: 
ALTER_PARTITION Creating event 39951 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.023567 10065 MetastoreEvents.java:380] EventId: 39952 EventType: 
ALTER_PARTITION Creating event 39952 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.024046 10065 MetastoreEvents.java:380] EventId: 39959 EventType: 
DROP_TABLE Creating event 39959 of type DROP_TABLE on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e_imp_a273e31c_f6bb_4433_9164_44762d599f8a
{noformat}
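
A minimal sketch of the situation in the log above, written in Python for brevity and not taken from Impala's Java code: a batch of events is processed in order, and the earlier events refer to a table that no longer exists in the metastore by the time the batch is handled. The names `process_batch` and `table_exists` are hypothetical, the single placeholder table name assumes all events target the same table as the report describes, and skipping such events is only one possible way to tolerate the situation, not necessarily the fix chosen for this issue.

```python
from typing import Callable, List, Tuple

Event = Tuple[int, str, str]  # (event id, event type, table name)


def process_batch(events: List[Event],
                  table_exists: Callable[[str], bool]) -> Tuple[List[int], List[int]]:
    """Return (processed_ids, skipped_ids) for one batch of events.

    Hypothetical policy: an event that needs a table refresh is skipped when
    the table no longer exists, instead of aborting the whole batch.
    """
    processed, skipped = [], []
    for event_id, event_type, table in events:
        needs_refresh = event_type != "DROP_TABLE"
        if needs_refresh and not table_exists(table):
            skipped.append(event_id)  # table already dropped: nothing to refresh
            continue
        processed.append(event_id)
    return processed, skipped


# Event ids and types mirror the log above; "default.t" is a placeholder name.
batch = [
    (39949, "ALTER_PARTITION", "default.t"),
    (39950, "ALTER_PARTITION", "default.t"),
    (39951, "ALTER_PARTITION", "default.t"),
    (39952, "ALTER_PARTITION", "default.t"),
    (39959, "DROP_TABLE", "default.t"),
]
# By the time the batch is processed, the table is already gone.
processed, skipped = process_batch(batch, table_exists=lambda name: False)
```

With this policy the four ALTER_PARTITION events are skipped and only the DROP_TABLE event is applied, so processing of the batch does not stop at event 39949.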
Impala tried to refresh the table on the first ALTER TABLE event, but since 
it had already been dropped, we got a TableLoadingException (caused by a 
NoSuchObjectException from HMS):
{noformat}
I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
received: 41 Total number of events filtered out: 0
I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
since the given version is -1 and service id is
I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
metadata: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
exception received while processing event
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
metadata for table: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTableIfExists(CatalogServiceCatalog.java:2365)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadTableFromCatalog(MetastoreEvents.java:563)
at 
org.apache.impala.catalog.events.MetastoreEvents$AlterPartitionEvent.process(MetastoreEvents.java:1454)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:314)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:615

[jira] [Updated] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-10187:
---
Description: 
I've seen the following during interop testing:

Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
table.

Then CatalogD's event processor tried to refresh the table in its cache:
{noformat}
I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block metadata 
for 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
taken: 55.145ms
I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 (303ms)
I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
events. Start event id : 39948
I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 EventType: 
ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.022769 10065 MetastoreEvents.java:380] EventId: 39950 EventType: 
ALTER_PARTITION Creating event 39950 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.023175 10065 MetastoreEvents.java:380] EventId: 39951 EventType: 
ALTER_PARTITION Creating event 39951 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.023567 10065 MetastoreEvents.java:380] EventId: 39952 EventType: 
ALTER_PARTITION Creating event 39952 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.024046 10065 MetastoreEvents.java:380] EventId: 39959 EventType: 
DROP_TABLE Creating event 39959 of type DROP_TABLE on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e_imp_a273e31c_f6bb_4433_9164_44762d599f8a
{noformat}
 

Impala tried to refresh the table on the first ALTER TABLE event, but since 
it had already been dropped, we got a TableLoadingException (caused by a 
NoSuchObjectException from HMS):

 
{noformat}
I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
received: 41 Total number of events filtered out: 0
I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
since the given version is -1 and service id is
I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
metadata: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
exception received while processing event
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
metadata for table: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTableIfExists(CatalogServiceCatalog.java:2365)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadTableFromCatalog(MetastoreEvents.java:563)
at 
org.apache.impala.catalog.events.MetastoreEvents$AlterPartitionEvent.process(MetastoreEvents.java:1454)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:314)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:615)
... 8 more
Caused by: 
NoSu

[jira] [Closed] (IMPALA-10051) impala-shell exits with ValueError with WITH clauses

2020-09-23 Thread Tamas Mate (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Mate closed IMPALA-10051.
---
Resolution: Fixed

> impala-shell exits with ValueError with WITH clauses
> 
>
> Key: IMPALA-10051
> URL: https://issues.apache.org/jira/browse/IMPALA-10051
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: Tamas Mate
>Assignee: Tamas Mate
>Priority: Major
>
> Some strings can cause shlex to throw an exception in WITH clauses, for 
> example in a regexp_replace. This should be handled more gracefully and 
> correctly.
> Working query (impala-shell forwards the query for analysis):
> {code:java}
> impala-shell.sh -q 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'
> {code}
> The same query fails with a ValueError when the spaces between the arguments 
> of regexp_replace are removed:
> {code:java}
> tmate@tmate-box:~/Projects/Impala$ impala-shell.sh -q 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is 
> being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build 
> b29cb4ca82a4f05ea7dc0eadc330a64fbe685ef0)
> Traceback (most recent call last):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1973, in 
> <module>
> impala_shell_main()
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1927, in 
> impala_shell_main
> if execute_queries_non_interactive_mode(options, query_options):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1731, in 
> execute_queries_non_interactive_mode
> shell.execute_query_list(queries))
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1564, in 
> execute_query_list
> if self.onecmd(q) is CmdStatus.ERROR:
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 675, in 
> onecmd
> return func(arg)
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1276, in 
> do_with
> tokens = shlex.split(strip_comments(query.lstrip()), posix=False)
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 279, in split
> return list(lex)
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 269, in next
> token = self.get_token()
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 96, in get_token
> raw = self.read_token()
>   File 
> "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py",
>  line 172, in read_token
> raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> {code}
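
The difference between the two queries can be reproduced with the standard library alone. The sketch below uses the same `shlex.split(..., posix=False)` call that appears in the traceback, but leaves out Impala's `strip_comments` helper; `tokenize_or_none` is a hypothetical name, and returning `None` is only an illustration of handling the error without crashing, not the fix applied to impala-shell. In non-POSIX mode a quote character opens a quoted token only at the start of a token, so in the second query `"+` stays glued to the preceding word and the lone `"` after the space opens a quoted token that is never closed.

```python
import shlex
from typing import List, Optional


def tokenize_or_none(query: str) -> Optional[List[str]]:
    """Tokenize like the call in the traceback; return None instead of
    raising when shlex reports an unbalanced quote."""
    try:
        return shlex.split(query.lstrip(), posix=False)
    except ValueError:
        return None


working = 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'
failing = 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'

working_tokens = tokenize_or_none(working)  # a list of tokens
failing_tokens = tokenize_or_none(failing)  # None: shlex raised ValueError
```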






[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Description: 
The current Parquet writer writes -1 for PageLocation.offset and 
PageLocation.first_row_index when it encounters an empty page. 

{code:java}
  // Write data pages
  for (const DataPage& page : pages_) {
    parquet::PageLocation location;
    if (page.header.data_page_header.num_values == 0) {
      // Skip empty pages
      location.offset = -1;
      location.compressed_page_size = 0;
      location.first_row_index = -1;
      AddLocationToOffsetIndex(location);
      continue;
    }
{code}
 

 

  was:
The current Parquet writer writes -1 for PageLocation.offset and 
PageLocation.first_row_index when it encounters an empty page. 

 

!image-2020-09-23-17-53-23-696.png|width=1475,height=416!


> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
> Attachments: image-2020-09-23-17-53-23-696.png
>
>
> The current Parquet writer writes -1 for PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page. 
>  
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     parquet::PageLocation location;
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
>  
>  






[jira] [Updated] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-10187:
---
Description: 
I've seen the following during interop testing:

Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
table.

Then CatalogD's event processor tried to process the new events:
{noformat}
I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block metadata 
for 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
taken: 55.145ms
I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 (303ms)
I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
events. Start event id : 39948
I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 EventType: 
ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.022769 10065 MetastoreEvents.java:380] EventId: 39950 EventType: 
ALTER_PARTITION Creating event 39950 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.023175 10065 MetastoreEvents.java:380] EventId: 39951 EventType: 
ALTER_PARTITION Creating event 39951 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.023567 10065 MetastoreEvents.java:380] EventId: 39952 EventType: 
ALTER_PARTITION Creating event 39952 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.024046 10065 MetastoreEvents.java:380] EventId: 39959 EventType: 
DROP_TABLE Creating event 39959 of type DROP_TABLE on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e_imp_a273e31c_f6bb_4433_9164_44762d599f8a
{noformat}
 

Impala tried to refresh the table on the first ALTER TABLE event, but since 
it had already been dropped, we got a TableLoadingException (caused by a 
NoSuchObjectException from HMS):

 
{noformat}
I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
received: 41 Total number of events filtered out: 0
I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
since the given version is -1 and service id is
I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
metadata: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
exception received while processing event
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
metadata for table: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTableIfExists(CatalogServiceCatalog.java:2365)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadTableFromCatalog(MetastoreEvents.java:563)
at 
org.apache.impala.catalog.events.MetastoreEvents$AlterPartitionEvent.process(MetastoreEvents.java:1454)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:314)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:615)
... 8 more
Caused by: 
NoSuchObject

[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Description: 
The current Parquet writer writes -1 for PageLocation.offset and 
PageLocation.first_row_index when it encounters an empty page. 

hdfs-parquet-file-writer.cc, lines 808-819:
{code:java}
  // Write data pages
  for (const DataPage& page : pages_) {
    parquet::PageLocation location;
    if (page.header.data_page_header.num_values == 0) {
      // Skip empty pages
      location.offset = -1;
      location.compressed_page_size = 0;
      location.first_row_index = -1;
      AddLocationToOffsetIndex(location);
      continue;
    }
{code}
However, the -1 values may cause the ComputeCandidatePages function to run into 
an unexpected state.

 

 

  was:
Current parquet writer write -1 of PageLocation.offset and 
PageLocation.first_row_index when meet a empty page. 

 

 
{code:java}
// code placeholder
  // Write data pages
  for (const DataPage& page : pages_) {
parquet::PageLocation location;if 
(page.header.data_page_header.num_values == 0) {
  // Skip empty pages
  location.offset = -1;
  location.compressed_page_size = 0;
  location.first_row_index = -1;
  AddLocationToOffsetIndex(location);
  continue;
}
{code}
 

 


> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
> Attachments: image-2020-09-23-17-53-23-696.png
>
>
> The current Parquet writer writes -1 to PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page. 
> hdfs-parquet-file-writer.cc, lines 808-819:
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     parquet::PageLocation location;
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Jira


[ 
https://issues.apache.org/jira/browse/IMPALA-10187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200706#comment-17200706
 ] 

Zoltán Borók-Nagy commented on IMPALA-10187:


 [~vihangk1], could you please take a look at this bug?

> Event processing fails on multiple events + DROP TABLE
> --
>
> Key: IMPALA-10187
> URL: https://issues.apache.org/jira/browse/IMPALA-10187
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Priority: Major
>
> I've seen the following during interop testing:
> Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
> table.
> Then CatalogD's event processor tried to process the new events:
> {noformat}
> I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block 
> metadata for 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
>  partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
> taken: 55.145ms
> I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
>  (303ms)
> I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
> events. Start event id : 39948
> I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 
> EventType: ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on 
> table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> I0922 14:32:58.022769 10065 MetastoreEvents.java:380] EventId: 39950 
> EventType: ALTER_PARTITION Creating event 39950 of type ALTER_PARTITION on 
> table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> I0922 14:32:58.023175 10065 MetastoreEvents.java:380] EventId: 39951 
> EventType: ALTER_PARTITION Creating event 39951 of type ALTER_PARTITION on 
> table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> I0922 14:32:58.023567 10065 MetastoreEvents.java:380] EventId: 39952 
> EventType: ALTER_PARTITION Creating event 39952 of type ALTER_PARTITION on 
> table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> I0922 14:32:58.024046 10065 MetastoreEvents.java:380] EventId: 39959 
> EventType: DROP_TABLE Creating event 39959 of type DROP_TABLE on table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e_imp_a273e31c_f6bb_4433_9164_44762d599f8a
> {noformat}
>  
> Impala tried to refresh the table on the first ALTER TABLE event, but since 
> it has already been dropped, we get a TableLoadingException (caused by a 
> NoSuchObjectException from HMS):
>  
> {noformat}
> I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
> received: 41 Total number of events filtered out: 0
> I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
> since the given version is -1 and service id is
> I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
> metadata: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
> exception received while processing event
> Java exception follows:
> org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
> process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
> at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
> at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
> metadata for table: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
> at 
> org.apache.impala.catalo

[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Description: 
The current Parquet writer writes -1 to PageLocation.offset and 
PageLocation.first_row_index when it encounters an empty page. 

hdfs-parquet-file-writer.cc, lines 808-819:
{code:java}
  // Write data pages
  for (const DataPage& page : pages_) {
    parquet::PageLocation location;
    if (page.header.data_page_header.num_values == 0) {
      // Skip empty pages
      location.offset = -1;
      location.compressed_page_size = 0;
      location.first_row_index = -1;
      AddLocationToOffsetIndex(location);
      continue;
    }
{code}
But the -1 values may cause the ComputeCandidatePages function to run into an 
unexpected state.
{code:java}
bool ComputeCandidatePages(
    const vector& page_locations,
    const vector& candidate_ranges,
    const int64_t num_rows, vector* candidate_pages) {
  if (!ValidatePageLocations(page_locations, num_rows)) return false;
{code}
so Issue 

 

 

  was:
Current parquet writer write -1 of PageLocation.offset and 
PageLocation.first_row_index when meet a empty page. 

 hdfs-parquet-file-writer.cc  Line: 808 ~ 819
{code:java}
  // Write data pages
  for (const DataPage& page : pages_) {
parquet::PageLocation location;if 
(page.header.data_page_header.num_values == 0) {
  // Skip empty pages
  location.offset = -1;
  location.compressed_page_size = 0;
  location.first_row_index = -1;
  AddLocationToOffsetIndex(location);
  continue;
}
{code}
But -1 values may cause   ComputeCandidatePages function run into unexpected 
status.

 

 


> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
> Attachments: image-2020-09-23-17-53-23-696.png
>
>
> The current Parquet writer writes -1 to PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page. 
> hdfs-parquet-file-writer.cc, lines 808-819:
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     parquet::PageLocation location;
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
> {code:java}
> bool ComputeCandidatePages(
>     const vector& page_locations,
>     const vector& candidate_ranges,
>     const int64_t num_rows, vector* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> so Issue 
>  
>  






[jira] [Updated] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-10187:
---
Description: 
I've seen the following during interop testing:

Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
table.

Then CatalogD's event processor tried to process the new events:
{noformat}
I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block metadata 
for 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
taken: 55.145ms
I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 (303ms)
I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
events. Start event id : 39948
I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 EventType: 
ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.022769 10065 MetastoreEvents.java:380] EventId: 39950 EventType: 
ALTER_PARTITION Creating event 39950 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.023175 10065 MetastoreEvents.java:380] EventId: 39951 EventType: 
ALTER_PARTITION Creating event 39951 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.023567 10065 MetastoreEvents.java:380] EventId: 39952 EventType: 
ALTER_PARTITION Creating event 39952 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.024046 10065 MetastoreEvents.java:380] EventId: 39959 EventType: 
DROP_TABLE Creating event 39959 of type DROP_TABLE on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
{noformat}
 

Impala tried to refresh the table on the first ALTER TABLE event, but since 
it has already been dropped, we get a TableLoadingException (caused by a 
NoSuchObjectException from HMS):

 
{noformat}
I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
received: 41 Total number of events filtered out: 0
I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
since the given version is -1 and service id is
I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
metadata: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
exception received while processing event
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
metadata for table: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTableIfExists(CatalogServiceCatalog.java:2365)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadTableFromCatalog(MetastoreEvents.java:563)
at 
org.apache.impala.catalog.events.MetastoreEvents$AlterPartitionEvent.process(MetastoreEvents.java:1454)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:314)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:615)
... 8 more
Caused by: 
NoSuchObjectException(message:hive.default.insertonly

[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Attachment: (was: image-2020-09-23-17-53-23-696.png)

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
>
> The current Parquet writer writes -1 to PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page. 
> hdfs-parquet-file-writer.cc, lines 808-819:
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     parquet::PageLocation location;
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
> {code:java}
> bool ComputeCandidatePages(
>     const vector& page_locations,
>     const vector& candidate_ranges,
>     const int64_t num_rows, vector* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> which in turn causes IMPALA-9952.
>  






[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Description: 
The current Parquet writer writes -1 to PageLocation.offset and 
PageLocation.first_row_index when it encounters an empty page. 

hdfs-parquet-file-writer.cc, lines 808-819:
{code:java}
  // Write data pages
  for (const DataPage& page : pages_) {
    parquet::PageLocation location;
    if (page.header.data_page_header.num_values == 0) {
      // Skip empty pages
      location.offset = -1;
      location.compressed_page_size = 0;
      location.first_row_index = -1;
      AddLocationToOffsetIndex(location);
      continue;
    }
{code}
But the -1 values may cause the ComputeCandidatePages function to run into an 
unexpected state.
{code:java}
bool ComputeCandidatePages(
    const vector& page_locations,
    const vector& candidate_ranges,
    const int64_t num_rows, vector* candidate_pages) {
  if (!ValidatePageLocations(page_locations, num_rows)) return false;
{code}
which in turn causes IMPALA-9952.

 

  was:
Current parquet writer write -1 of PageLocation.offset and 
PageLocation.first_row_index when meet a empty page. 

 hdfs-parquet-file-writer.cc  Line: 808 ~ 819
{code:java}
  // Write data pages
  for (const DataPage& page : pages_) {
parquet::PageLocation location;if 
(page.header.data_page_header.num_values == 0) {
  // Skip empty pages
  location.offset = -1;
  location.compressed_page_size = 0;
  location.first_row_index = -1;
  AddLocationToOffsetIndex(location);
  continue;
}
{code}
But -1 values may cause   ComputeCandidatePages function run into unexpected 
status.
{code:java}
bool ComputeCandidatePages(
const vector& page_locations,
const vector& candidate_ranges,
const int64_t num_rows, vector* candidate_pages) {
  if (!ValidatePageLocations(page_locations, num_rows)) return false
{code}
so Issue 

 

 


> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
>
> The current Parquet writer writes -1 to PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page. 
> hdfs-parquet-file-writer.cc, lines 808-819:
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     parquet::PageLocation location;
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
> {code:java}
> bool ComputeCandidatePages(
>     const vector& page_locations,
>     const vector& candidate_ranges,
>     const int64_t num_rows, vector* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> which in turn causes IMPALA-9952.
>  
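
To illustrate why the -1 placeholders described in this issue are a problem, here is a minimal sketch of the kind of check a reader might run over an offset index. It is not Impala's actual ValidatePageLocations implementation: the PageLocation struct and the ValidatePageLocationsSketch function are hypothetical stand-ins, and the rules it enforces (non-negative offsets, first_row_index values that are strictly increasing and below num_rows) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for parquet::PageLocation, with the three fields
// that the writer snippet in this issue assigns.
struct PageLocation {
  int64_t offset;
  int32_t compressed_page_size;
  int64_t first_row_index;
};

// Assumed reader-side validation (not Impala's real code): every page must
// have a non-negative offset and a first_row_index that lies in
// [0, num_rows) and is strictly greater than the previous page's.
bool ValidatePageLocationsSketch(const std::vector<PageLocation>& locations,
                                 int64_t num_rows) {
  assert(num_rows >= 0);
  int64_t prev_first_row = -1;
  for (const PageLocation& loc : locations) {
    if (loc.offset < 0) return false;
    if (loc.first_row_index < 0 || loc.first_row_index >= num_rows) return false;
    if (loc.first_row_index <= prev_first_row) return false;
    prev_first_row = loc.first_row_index;
  }
  return true;
}
```

Under these assumed rules, a single entry written with offset = -1 and first_row_index = -1 for an empty page is enough to make the whole index fail validation, so a caller such as ComputeCandidatePages would return false for that column chunk.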






[jira] [Updated] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-10187:
---
Description: 
I've seen the following during interop testing:

Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
table.

Then CatalogD's event processor tried to process the new events:
{noformat}
I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block metadata 
for 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
taken: 55.145ms
I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 (303ms)
I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
events. Start event id : 39948
I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 EventType: 
ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
I0922 14:32:58.024389 10065 MetastoreEvents.java:380] EventId: 39962 EventType: 
DROP_TABLE Creating event 39962 of type DROP_TABLE on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
{noformat}
 

Impala tried to refresh the table on the first ALTER TABLE event, but since 
it has already been dropped, we get a TableLoadingException (caused by a 
NoSuchObjectException from HMS):

 
{noformat}
I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
received: 41 Total number of events filtered out: 0
I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
since the given version is -1 and service id is
I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
metadata: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
exception received while processing event
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
metadata for table: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTableIfExists(CatalogServiceCatalog.java:2365)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadTableFromCatalog(MetastoreEvents.java:563)
at 
org.apache.impala.catalog.events.MetastoreEvents$AlterPartitionEvent.process(MetastoreEvents.java:1454)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:314)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:615)
... 8 more
Caused by: 
NoSuchObjectException(message:hive.default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 table not found)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:2350)
 

[jira] [Updated] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy updated IMPALA-10187:
---
Description: 
I've seen the following during interop testing:

Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
table.

Then CatalogD's event processor tried to process the new events:
{noformat}
I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block metadata 
for 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
taken: 55.145ms
I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 (303ms)
I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
events. Start event id : 39948
I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 EventType: 
ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
...
I0922 14:32:58.024389 10065 MetastoreEvents.java:380] EventId: 39962 EventType: 
DROP_TABLE Creating event 39962 of type DROP_TABLE on table 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
{noformat}
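
The batch logged above contains ALTER_PARTITION events followed by a DROP_TABLE event on the same table. As a rough illustration of that pattern, the sketch below scans a batch and reports which events target a table that is dropped later in the same batch; a refresh triggered by such an event can find that the table no longer exists in HMS. The Event struct, the EventType enum, and the EventsOnTablesDroppedLater function are all hypothetical names invented for this sketch and do not reflect how Impala's MetastoreEventsProcessor is written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified model of a metastore event.
enum class EventType { ALTER_PARTITION, DROP_TABLE };

struct Event {
  int64_t id;
  EventType type;
  std::string table;  // fully qualified table name
};

// Returns the ids of non-DROP events whose table is dropped by a later event
// in the same batch. Assumes the batch is ordered by ascending event id.
std::vector<int64_t> EventsOnTablesDroppedLater(const std::vector<Event>& batch) {
  std::vector<int64_t> result;
  for (std::size_t i = 0; i < batch.size(); ++i) {
    assert(i == 0 || batch[i - 1].id < batch[i].id);
    if (batch[i].type == EventType::DROP_TABLE) continue;
    for (std::size_t j = i + 1; j < batch.size(); ++j) {
      if (batch[j].type == EventType::DROP_TABLE &&
          batch[j].table == batch[i].table) {
        result.push_back(batch[i].id);
        break;
      }
    }
  }
  return result;
}
```

For a batch shaped like the one in the log (event 39949 of type ALTER_PARTITION and event 39962 of type DROP_TABLE on the same table), the function would report 39949, which is the event that fails in the trace below.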
 

Impala tried to refresh the table on the first ALTER TABLE event, but since 
it has already been dropped, we get a TableLoadingException (caused by a 
NoSuchObjectException from HMS):

 
{noformat}
I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
received: 41 Total number of events filtered out: 0
I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
since the given version is -1 and service id is
I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
metadata: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
exception received while processing event
Java exception follows:
org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
metadata for table: 
default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
at 
org.apache.impala.catalog.CatalogServiceCatalog.reloadTableIfExists(CatalogServiceCatalog.java:2365)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadTableFromCatalog(MetastoreEvents.java:563)
at 
org.apache.impala.catalog.events.MetastoreEvents$AlterPartitionEvent.process(MetastoreEvents.java:1454)
at 
org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:314)
at 
org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:615)
... 8 more
Caused by: 
NoSuchObjectException(message:hive.default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
 table not found)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result$get_table_req_resultStandardScheme.read(ThriftHiveMetastore.java)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_req_result.read(ThriftHiveMetastore.java)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_req(ThriftHiveMetastore.java:235

[jira] [Commented] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200708#comment-17200708
 ] 

guojingfeng commented on IMPALA-10186:
--

[~boroknagyz] In which cases do we encounter an empty page when writing Parquet files?

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
>
> The current Parquet writer writes -1 to PageLocation.offset and 
> PageLocation.first_row_index when it encounters an empty page. 
> hdfs-parquet-file-writer.cc, lines 808-819:
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     parquet::PageLocation location;
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
> {code:java}
> bool ComputeCandidatePages(
>     const vector& page_locations,
>     const vector& candidate_ranges,
>     const int64_t num_rows, vector* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> which in turn causes IMPALA-9952.
>  






[jira] [Work started] (IMPALA-10184) Iceberg PARTITION SPEC missing from SHOW CREATE TABLE

2020-09-23 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-10184 started by Gabor Kaszab.
-
> Iceberg PARTITION SPEC missing from SHOW CREATE TABLE
> -
>
> Key: IMPALA-10184
> URL: https://issues.apache.org/jira/browse/IMPALA-10184
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: impala-iceberg
>
> The PARTITION SPEC is missing from the SHOW CREATE TABLE output for Iceberg 
> tables.
> This is how I created a table:
> {code:java}
> create table iceberg_tmp2 (
>   i int, 
>   s string, 
>   p1 string,
>   p2 timestamp
> ) 
> partition by spec (
>   p1 identity, 
>   p2 Day
> ) 
> stored as iceberg;
> {code}
> And this is the output of SHOW CREATE TABLE for the same table:
> {code:java}
> +---+
> | CREATE EXTERNAL TABLE default.iceberg_tmp2 (
> |   i INT,
> |   s STRING, 
> |   p1 STRING,
> |   p2 TIMESTAMP
> | )
> | STORED AS ICEBERG 
> | LOCATION 'hdfs://localhost:20500/test-warehouse/iceberg_tmp2'
> | TBLPROPERTIES ('OBJCAPABILITIES'='EXTREAD,EXTWRITE', 'external.table.purge'='TRUE', 'iceberg_file_format'='parquet')
> +--+
> {code}






[jira] [Assigned] (IMPALA-10184) Iceberg PARTITION SPEC missing from SHOW CREATE TABLE

2020-09-23 Thread Gabor Kaszab (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Kaszab reassigned IMPALA-10184:
-

Assignee: Gabor Kaszab

> Iceberg PARTITION SPEC missing from SHOW CREATE TABLE
> -
>
> Key: IMPALA-10184
> URL: https://issues.apache.org/jira/browse/IMPALA-10184
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Gabor Kaszab
>Assignee: Gabor Kaszab
>Priority: Major
>  Labels: impala-iceberg
>
> The PARTITION SPEC is missing from the SHOW CREATE TABLE output for Iceberg 
> tables.
> This is how I created a table:
> {code:java}
> create table iceberg_tmp2 (
>   i int, 
>   s string, 
>   p1 string,
>   p2 timestamp
> ) 
> partition by spec (
>   p1 identity, 
>   p2 Day
> ) 
> stored as iceberg;
> {code}
> And this is the output of SHOW CREATE TABLE for the same table:
> {code:java}
> +---+
> | CREATE EXTERNAL TABLE default.iceberg_tmp2 (
> |   i INT,
> |   s STRING, 
> |   p1 STRING,
> |   p2 TIMESTAMP
> | )
> | STORED AS ICEBERG 
> | LOCATION 'hdfs://localhost:20500/test-warehouse/iceberg_tmp2'
> | TBLPROPERTIES ('OBJCAPABILITIES'='EXTREAD,EXTWRITE', 'external.table.purge'='TRUE', 'iceberg_file_format'='parquet')
> +--+
> {code}






[jira] [Updated] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

guojingfeng updated IMPALA-10186:
-
Description: 
The current Parquet writer writes -1 to PageLocation.offset and 
PageLocation.first_row_index when it meets an empty page. 

 hdfs-parquet-file-writer.cc  Line: 808 ~ 819
{code:java}
  // Write data pages
  for (const DataPage& page : pages_) {
    if (page.header.data_page_header.num_values == 0) {
      // Skip empty pages
      location.offset = -1;
      location.compressed_page_size = 0;
      location.first_row_index = -1;
      AddLocationToOffsetIndex(location);
      continue;
    }
{code}
But the -1 values may cause the ComputeCandidatePages function to run into an 
unexpected state.
{code:java}
bool ComputeCandidatePages(
    const vector& page_locations,
    const vector& candidate_ranges,
    const int64_t num_rows, vector* candidate_pages) {
  if (!ValidatePageLocations(page_locations, num_rows)) return false;
{code}
which in turn causes IMPALA-9952.

 

  was:
The current Parquet writer writes -1 to PageLocation.offset and 
PageLocation.first_row_index when it meets an empty page. 

 hdfs-parquet-file-writer.cc  Line: 808 ~ 819
{code:java}
  // Write data pages
  for (const DataPage& page : pages_) {
    parquet::PageLocation location;
    if (page.header.data_page_header.num_values == 0) {
      // Skip empty pages
      location.offset = -1;
      location.compressed_page_size = 0;
      location.first_row_index = -1;
      AddLocationToOffsetIndex(location);
      continue;
    }
{code}
But the -1 values may cause the ComputeCandidatePages function to run into an 
unexpected state.
{code:java}
bool ComputeCandidatePages(
    const vector& page_locations,
    const vector& candidate_ranges,
    const int64_t num_rows, vector* candidate_pages) {
  if (!ValidatePageLocations(page_locations, num_rows)) return false;
{code}
which in turn causes IMPALA-9952.

 


> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Priority: Major
>
> The current Parquet writer writes -1 to PageLocation.offset and 
> PageLocation.first_row_index when it meets an empty page. 
>  hdfs-parquet-file-writer.cc  Line: 808 ~ 819
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
> {code:java}
> bool ComputeCandidatePages(
>     const vector& page_locations,
>     const vector& candidate_ranges,
>     const int64_t num_rows, vector* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> which in turn causes IMPALA-9952.
>  






[jira] [Resolved] (IMPALA-10051) impala-shell exits with ValueError with WITH clauses

2020-09-23 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-10051.

Fix Version/s: Impala 4.0
   Resolution: Fixed

> impala-shell exits with ValueError with WITH clauses
> 
>
> Key: IMPALA-10051
> URL: https://issues.apache.org/jira/browse/IMPALA-10051
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: Tamas Mate
>Assignee: Tamas Mate
>Priority: Major
> Fix For: Impala 4.0
>
>
> Some strings can cause shlex to throw an exception in WITH clauses, for 
> example in a regexp_replace. This should be handled more gracefully and 
> correctly.
> Working query (impala-shell forwards the query for analysis):
> {code:java}
> impala-shell.sh -q 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'
> {code}
> The same query fails with a ValueError when the spaces between the arguments 
> of regexp_replace are removed:
> {code:java}
> tmate@tmate-box:~/Projects/Impala$ impala-shell.sh -q 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build b29cb4ca82a4f05ea7dc0eadc330a64fbe685ef0)
> Traceback (most recent call last):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1973, in <module>
>     impala_shell_main()
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1927, in impala_shell_main
>     if execute_queries_non_interactive_mode(options, query_options):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1731, in execute_queries_non_interactive_mode
>     shell.execute_query_list(queries))
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1564, in execute_query_list
>     if self.onecmd(q) is CmdStatus.ERROR:
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 675, in onecmd
>     return func(arg)
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1276, in do_with
>     tokens = shlex.split(strip_comments(query.lstrip()), posix=False)
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 279, in split
>     return list(lex)
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 269, in next
>     token = self.get_token()
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 96, in get_token
>     raw = self.read_token()
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 172, in read_token
>     raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> {code}
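
The difference between the two commands above can be reproduced with shlex alone. The following is a minimal sketch, not the fix that was committed for this issue: split_with_clause is a hypothetical helper, and returning None on failure is an assumption chosen only to keep the example short. In non-POSIX mode a quote in the middle of a word is kept as part of that word, so once the spaces after the commas are removed, the first token runs up to the space inside "+ " and the next token starts with a quote that is never closed.

```python
import shlex


def split_with_clause(query):
    """Tokenize a query with the same shlex call shown in the traceback.

    Hypothetical helper: returns None instead of raising when shlex
    cannot find a closing quotation mark.
    """
    try:
        return shlex.split(query, posix=False)
    except ValueError:
        # shlex opened a quoted token and reached the end of the input
        # before finding the matching quote.
        return None


# Query with spaces between the arguments: every quoted string starts
# its own token and is closed before the next whitespace.
working = 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'

# Same query without those spaces: the only whitespace after "select"
# is the one inside "+ ", so the last token starts with an unmatched quote.
failing = 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'

print(split_with_clause(working))
print(split_with_clause(failing))
```

A caller could treat a None result as "could not tokenize locally" and still forward the raw query for analysis; whether impala-shell does this is not stated in this thread.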






[jira] [Reopened] (IMPALA-10051) impala-shell exits with ValueError with WITH clauses

2020-09-23 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reopened IMPALA-10051:


Reopening to set a fix version

> impala-shell exits with ValueError with WITH clauses
> 
>
> Key: IMPALA-10051
> URL: https://issues.apache.org/jira/browse/IMPALA-10051
> Project: IMPALA
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: Impala 4.0
>Reporter: Tamas Mate
>Assignee: Tamas Mate
>Priority: Major
>
> Some strings can cause shlex to throw an exception in WITH clauses, for 
> example in a regexp_replace. This should be handled more gracefully and 
> correctly.
> Working query (impala-shell forwards the query for analysis):
> {code:java}
> impala-shell.sh -q 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'
> {code}
> The same query fails with a ValueError when the spaces between the arguments 
> of regexp_replace are removed:
> {code:java}
> tmate@tmate-box:~/Projects/Impala$ impala-shell.sh -q 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build b29cb4ca82a4f05ea7dc0eadc330a64fbe685ef0)
> Traceback (most recent call last):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1973, in <module>
>     impala_shell_main()
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1927, in impala_shell_main
>     if execute_queries_non_interactive_mode(options, query_options):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1731, in execute_queries_non_interactive_mode
>     shell.execute_query_list(queries))
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1564, in execute_query_list
>     if self.onecmd(q) is CmdStatus.ERROR:
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 675, in onecmd
>     return func(arg)
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1276, in do_with
>     tokens = shlex.split(strip_comments(query.lstrip()), posix=False)
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 279, in split
>     return list(lex)
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 269, in next
>     token = self.get_token()
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 96, in get_token
>     raw = self.read_token()
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 172, in read_token
>     raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> {code}






[jira] [Assigned] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Vihang Karajgaonkar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned IMPALA-10187:


Assignee: Vihang Karajgaonkar

> Event processing fails on multiple events + DROP TABLE
> --
>
> Key: IMPALA-10187
> URL: https://issues.apache.org/jira/browse/IMPALA-10187
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> I've seen the following during interop testing:
> Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
> table.
> Then CatalogD's event processor tried to process the new events:
> {noformat}
> I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block 
> metadata for 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
>  partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
> taken: 55.145ms
> I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
>  (303ms)
> I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
> events. Start event id : 39948
> I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 
> EventType: ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on 
> table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> ...
> I0922 14:32:58.024389 10065 MetastoreEvents.java:380] EventId: 39962 
> EventType: DROP_TABLE Creating event 39962 of type DROP_TABLE on table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> {noformat}
>  
> Impala tried to refresh the table on the first ALTER TABLE event, but since 
> the table had already been dropped, we got a TableLoadingException (caused by 
> a NoSuchObjectException from HMS):
>  
> {noformat}
> I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
> received: 41 Total number of events filtered out: 0
> I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
> since the given version is -1 and service id is
> I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
> metadata: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
> exception received while processing event
> Java exception follows:
> org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
> process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
> at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
> at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
> metadata for table: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.reloadTableIfExists(CatalogServiceCatalog.java:2365)
> at 
> org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadTableFromCatalog(MetastoreEvents.java:563)
> at 
> org.apache.impala.catalog.events.MetastoreEvents$AlterPartitionEvent.process(MetastoreEvents.java:1454)
> at 
> org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:314)
> at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:615)
> ... 8 more
> Caused by: 
> NoSuchObjectException(message:hive.default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
>  table not found)
> at 
> org.apache.hadoop.hive.metastore.api.Thrift

[jira] [Commented] (IMPALA-10187) Event processing fails on multiple events + DROP TABLE

2020-09-23 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200990#comment-17200990
 ] 

Vihang Karajgaonkar commented on IMPALA-10187:
--

Ideally we should be ignoring the NoSuchObjectException. I will take a look. 
Thanks!
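
To illustrate what ignoring the exception could look like, here is a rough sketch in Python rather than in the catalog's Java code. Everything in it is hypothetical: NoSuchObjectError, FakeMetastore and process_events are stand-ins invented for this example, and the table name is a placeholder. It only shows the idea that a refresh triggered by an older event is skipped when the table no longer exists, so the later DROP_TABLE event in the same batch can still be applied.

```python
class NoSuchObjectError(Exception):
    """Stand-in for the metastore's NoSuchObjectException (hypothetical)."""


class FakeMetastore:
    """Tiny in-memory metastore used only for this sketch."""

    def __init__(self, tables):
        self.tables = set(tables)

    def get_table(self, name):
        if name not in self.tables:
            raise NoSuchObjectError(name + " table not found")
        return name


def process_events(events, metastore, catalog):
    """Apply events in order and return the ids of the skipped events.

    A refresh that fails because the table is gone is skipped instead of
    stopping event processing.
    """
    skipped = []
    for event_id, event_type, table in events:
        if event_type == "DROP_TABLE":
            catalog.discard(table)
            continue
        try:
            metastore.get_table(table)  # refresh from the metastore
            catalog.add(table)
        except NoSuchObjectError:
            # The table was dropped after this event was generated.
            skipped.append(event_id)
    return skipped


# Event ids mirror the log above; the table name is a placeholder.
events = [
    (39949, "ALTER_PARTITION", "default.tbl"),
    (39962, "DROP_TABLE", "default.tbl"),
]
metastore = FakeMetastore([])   # the table is already gone in HMS
catalog = {"default.tbl"}       # but still cached in the catalog
skipped = process_events(events, metastore, catalog)
print(skipped, catalog)
```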

> Event processing fails on multiple events + DROP TABLE
> --
>
> Key: IMPALA-10187
> URL: https://issues.apache.org/jira/browse/IMPALA-10187
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> I've seen the following during interop testing:
> Some DDL statements (ALTER TABLE + DROP) were executed via Hive on the same 
> table.
> Then CatalogD's event processor tried to process the new events:
> {noformat}
> I0922 14:32:56.590229 13611 HdfsTable.java:709] Loaded file and block 
> metadata for 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
>  partitions: category=cat1, category=cat2, category=cat3, and 1 others. Time 
> taken: 55.145ms
> I0922 14:32:56.591078 13611 TableLoader.java:103] Loaded metadata for: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
>  (303ms)
> I0922 14:32:58.022068 10065 MetastoreEventsProcessor.java:482] Received 41 
> events. Start event id : 39948
> I0922 14:32:58.022266 10065 MetastoreEvents.java:380] EventId: 39949 
> EventType: ALTER_PARTITION Creating event 39949 of type ALTER_PARTITION on 
> table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> ...
> I0922 14:32:58.024389 10065 MetastoreEvents.java:380] EventId: 39962 
> EventType: DROP_TABLE Creating event 39962 of type DROP_TABLE on table 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> {noformat}
>  
> Impala tried to refresh the table on the first ALTER TABLE event, but since 
> the table had already been dropped, we got a TableLoadingException (caused by 
> a NoSuchObjectException from HMS):
>  
> {noformat}
> I0922 14:32:58.028852 10065 MetastoreEvents.java:234] Total number of events 
> received: 41 Total number of events filtered out: 0
> I0922 14:32:58.028962 10065 CatalogServiceCatalog.java:862] Not a self-event 
> since the given version is -1 and service id is
> I0922 14:32:58.029369 10065 CatalogServiceCatalog.java:2142] Refreshing table 
> metadata: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> E0922 14:32:58.038627 10065 MetastoreEventsProcessor.java:527] Unexpected 
> exception received while processing event
> Java exception follows:
> org.apache.impala.catalog.events.MetastoreNotificationException: Unable to 
> process event 39949 of type ALTER_PARTITION. Event processing will be stopped.
> at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:620)
> at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:513)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.impala.catalog.TableLoadingException: Error loading 
> metadata for table: 
> default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8_4c7a_b1c7_0b8f4c42c61e
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:2160)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.reloadTableIfExists(CatalogServiceCatalog.java:2365)
> at 
> org.apache.impala.catalog.events.MetastoreEvents$MetastoreTableEvent.reloadTableFromCatalog(MetastoreEvents.java:563)
> at 
> org.apache.impala.catalog.events.MetastoreEvents$AlterPartitionEvent.process(MetastoreEvents.java:1454)
> at 
> org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:314)
> at 
> org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:615)
> ... 8 more
> Caused by: 
> NoSuchObjectException(message:hive.default.insertonly_hiveclient_impalaclient_partitioned_8ff3a1ef_b8a8

[jira] [Assigned] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy reassigned IMPALA-10186:
--

Assignee: Zoltán Borók-Nagy

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>
> The current Parquet writer writes -1 to PageLocation.offset and 
> PageLocation.first_row_index when it meets an empty page. 
>  hdfs-parquet-file-writer.cc  Line: 808 ~ 819
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
> {code:java}
> bool ComputeCandidatePages(
>     const vector& page_locations,
>     const vector& candidate_ranges,
>     const int64_t num_rows, vector* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> which in turn causes IMPALA-9952.
>  






[jira] [Assigned] (IMPALA-9952) Invalid offset index in Parquet file

2020-09-23 Thread Jira


 [ 
https://issues.apache.org/jira/browse/IMPALA-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy reassigned IMPALA-9952:
-

Assignee: Zoltán Borók-Nagy

>  Invalid offset index in Parquet file
> -
>
> Key: IMPALA-9952
> URL: https://issues.apache.org/jira/browse/IMPALA-9952
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.4.0
>Reporter: guojingfeng
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>  Labels: Parquet
>
> When reading a Parquet file in Impala 3.4, I encountered the following error:
> {code:java}
> I0714 16:11:48.307806 1075820 runtime-state.cc:207] 
> 8c43203adb2d4fc8:0478df9b018b] Error from query 
> 8c43203adb2d4fc8:0478df9b: Invalid offset index in Parquet file 
> hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq.
> I0714 16:11:48.834901 1075838 status.cc:126] 
> 8c43203adb2d4fc8:0478df9b02c0] Invalid offset index in Parquet file 
> hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq.
> @   0xbf4ef9
> @  0x1748c41
> @  0x174e170
> @  0x1750e58
> @  0x17519f0
> @  0x1748559
> @  0x1510b41
> @  0x1512c8f
> @  0x137488a
> @  0x1375759
> @  0x1b48a19
> @ 0x7f34509f5e24
> @ 0x7f344d5ed35c
> I0714 16:11:48.835763 1075838 runtime-state.cc:207] 
> 8c43203adb2d4fc8:0478df9b02c0] Error from query 
> 8c43203adb2d4fc8:0478df9b: Invalid offset index in Parquet file 
> hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq.
> I0714 16:11:48.893784 1075820 status.cc:126] 
> 8c43203adb2d4fc8:0478df9b018b] Top level rows aren't in sync during page 
> filtering in file 
> hdfs://path/4844de7af4545a39-e8ebc7da005f_2015704758_data.0.parq.
> @   0xbf4ef9
> @  0x1749104
> @  0x17494cc
> @  0x1751aee
> @  0x1748559
> @  0x1510b41
> @  0x1512c8f
> @  0x137488a
> @  0x1375759
> @  0x1b48a19
> @ 0x7f34509f5e24
> @ 0x7f344d5ed35c
> {code}
>  Corresponding source code:
> {code:java}
> Status HdfsParquetScanner::CheckPageFiltering() {
>   if (candidate_ranges_.empty() || scalar_readers_.empty()) return Status::OK();
>   int64_t current_row = scalar_readers_[0]->LastProcessedRow();
>   for (int i = 1; i < scalar_readers_.size(); ++i) {
>     if (current_row != scalar_readers_[i]->LastProcessedRow()) {
>       DCHECK(false);
>       return Status(Substitute(
>           "Top level rows aren't in sync during page filtering in file $0.", filename()));
>     }
>   }
>   return Status::OK();
> }
> {code}






[jira] [Commented] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread Jira


[ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201001#comment-17201001
 ] 

Zoltán Borók-Nagy commented on IMPALA-10186:


Thanks for your analysis! I was able to reproduce the error on a Parquet file 
written by a modified Impala parquet writer.

IMPALA-4371 explains how empty pages can occur.

I'm going to upload a fix tomorrow.

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>
> The current Parquet writer writes -1 to PageLocation.offset and 
> PageLocation.first_row_index when it meets an empty page. 
>  hdfs-parquet-file-writer.cc  Line: 808 ~ 819
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
> {code:java}
> bool ComputeCandidatePages(
>     const vector& page_locations,
>     const vector& candidate_ranges,
>     const int64_t num_rows, vector* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> which in turn causes IMPALA-9952.
>  
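
The effect of the -1 placeholders can be sketched with a small model of the offset index. This is only an illustration in Python: the body of ValidatePageLocations is not quoted in this thread, so the rules in validate_page_locations below (non-negative offsets, strictly increasing first_row_index, all indexes below num_rows) are assumptions about what such a check might enforce, and the PageLocation class is a simplified stand-in for the Parquet struct.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageLocation:
    """Simplified stand-in for parquet::PageLocation."""
    offset: int
    compressed_page_size: int
    first_row_index: int


def validate_page_locations(locations: List[PageLocation], num_rows: int) -> bool:
    """Assumed sanity check for an offset index (not Impala's actual code)."""
    previous_first_row = -1
    for loc in locations:
        # File offsets and sizes cannot be negative.
        if loc.offset < 0 or loc.compressed_page_size < 0:
            return False
        # Each page must start at a later row than the previous one
        # and inside the row group.
        if loc.first_row_index <= previous_first_row:
            return False
        if loc.first_row_index >= num_rows:
            return False
        previous_first_row = loc.first_row_index
    return True


# Two regular pages covering 100 rows.
good = [PageLocation(4, 100, 0), PageLocation(104, 80, 50)]

# The same index with the placeholder entry the writer adds for an empty page.
bad = good + [PageLocation(-1, 0, -1)]

print(validate_page_locations(good, 100))
print(validate_page_locations(bad, 100))
```

Under these assumed rules the index with the placeholder entry is rejected as a whole, which is consistent with the "Invalid offset index in Parquet file" error reported in IMPALA-9952.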






[jira] [Commented] (IMPALA-10183) Hit promise DCHECK while looping result spooling tests

2020-09-23 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201044#comment-17201044
 ] 

Sahil Takiar commented on IMPALA-10183:
---

Thanks for reporting and fixing this!

> Hit promise DCHECK while looping result spooling tests
> --
>
> Key: IMPALA-10183
> URL: https://issues.apache.org/jira/browse/IMPALA-10183
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Quanlong Huang
>Priority: Major
> Attachments: impalad.ERROR.gz, impalad.FATAL.gz, impalad.INFO.gz
>
>
> {noformat}
> while impala-py.test tests/query_test/test_result_spooling.py -n4 ; do date; 
> done
> {noformat}
> {noformat}
> F0921 10:14:35.355281  5842 promise.h:61] Check failed: mode == 
> PromiseMode::MULTIPLE_PRODUCER [ mode = 0 , PromiseM
> ode::MULTIPLE_PRODUCER = 1 ]Called Set(..) twice on the same Promise in 
> SINGLE_PRODUCER mode
> *** Check failure stack trace: ***
> @  0x52087fc  google::LogMessage::Fail()
> @  0x520a0ec  google::LogMessage::SendToLog()
> @  0x520815a  google::LogMessage::Flush()
> @  0x520bd58  google::LogMessageFatal::~LogMessageFatal()
> @  0x223cc50  impala::Promise<>::Set()
> @  0x293f21d  impala::BufferedPlanRootSink::Cancel()
> @  0x2317856  impala::FragmentInstanceState::Cancel()
> @  0x2284c62  impala::QueryState::Cancel()
> @  0x2464728  impala::ControlService::CancelQueryFInstances()
> @  0x253df37  
> _ZZN6impala16ControlServiceIfC4ERK13scoped_refptrIN4kudu12MetricEntityEERKS1_INS2_3rpc13Re
> sultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE4_clESG_SH_SJ_
> @  0x253fb65  
> _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZN6imp
> ala16ControlServiceIfC4ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E4_E9_M_invokeE
> RKSt9_Any_dataOS4_OS5_OS9_
> @  0x2c9612f  std::function<>::operator()()
> @  0x2c95ade  kudu::rpc::GeneratedServiceIf::Handle()
> @  0x21d8c55  impala::ImpalaServicePool::RunThread()
> @  0x21de836  boost::_mfi::mf0<>::operator()()
> @  0x21de468  boost::_bi::list1<>::operator()<>()
> @  0x21de02e  boost::_bi::bind_t<>::operator()()
> @  0x21ddaa5  
> boost::detail::function::void_function_obj_invoker0<>::invoke()
> @  0x2140b55  boost::function0<>::operator()()
> @  0x271e1a9  impala::Thread::SuperviseThread()
> @  0x2726146  boost::_bi::list5<>::operator()<>()
> @  0x272606a  boost::_bi::bind_t<>::operator()()
> @  0x272602b  boost::detail::thread_data<>::run()
> @  0x3f0f621  thread_proxy
> @ 0x7f4db3f356da  start_thread
> @ 0x7f4db092ca3e  clone
> Wrote minidump to 
> /home/tarmstrong/Impala/impala/logs/cluster/minidumps/impalad/3204ffe5-6905-4842-d702c395-21c4eca5
> .dmp
> (END)
> {noformat}






[jira] [Assigned] (IMPALA-10102) Impalad crashses when writting a parquet file with large rows

2020-09-23 Thread Abhishek Rawat (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rawat reassigned IMPALA-10102:
---

Assignee: Yida Wu

> Impalad crashses when writting a parquet file with large rows
> -
>
> Key: IMPALA-10102
> URL: https://issues.apache.org/jira/browse/IMPALA-10102
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Yida Wu
>Priority: Critical
>  Labels: crash
>
> I encountered a crash when testing the following queries on my local branch:
> {code:sql}
> create table bigstrs3 stored as parquet as
> select *, repeat(uuid(), cast(random() * 200000 as int)) as bigstr
> from functional.alltypes
> limit 1000;
> -- Length of uuid() is 36, so the max row size is about 36 * 200000 = 7,200,000.
> set MAX_ROW_SIZE=8m;
> create table my_str_group stored as parquet as
>   select group_concat(string_col) as ss, bigstr
>   from bigstrs3 group by bigstr;
> create table my_cnt stored as parquet as
>   select count(*) as cnt, bigstr
>   from bigstrs3 group by bigstr;
> {code}
> The crash stacktrace:
> {code}
> Crash reason:  SIGSEGV
> Crash address: 0x0
> Process uptime: not available
> Thread 336 (crashed)
>  0  libc-2.23.so + 0x14e10b
>  1  impalad!snappy::UncheckedByteArraySink::Append(char const*, unsigned long) [clone .localalias.0] + 0x1a
>  2  impalad!snappy::Compress(snappy::Source*, snappy::Sink*) + 0xb1
>  3  impalad!snappy::RawCompress(char const*, unsigned long, char*, unsigned long*) + 0x51
>  4  impalad!impala::SnappyCompressor::ProcessBlock(bool, long, unsigned char const*, long*, unsigned char**) [compress.cc : 295 + 0x24]
>  5  impalad!impala::Codec::ProcessBlock32(bool, int, unsigned char const*, int*, unsigned char**) [codec.cc : 211 + 0x41]
>  6  impalad!impala::HdfsParquetTableWriter::BaseColumnWriter::Flush(long*, long*, long*) [hdfs-parquet-table-writer.cc : 775 + 0x56]
>  7  impalad!impala::HdfsParquetTableWriter::FlushCurrentRowGroup() [hdfs-parquet-table-writer.cc : 1330 + 0x60]
>  8  impalad!impala::HdfsParquetTableWriter::Finalize() [hdfs-parquet-table-writer.cc : 1297 + 0x19]
>  9  impalad!impala::HdfsTableSink::FinalizePartitionFile(impala::RuntimeState*, impala::OutputPartition*) [hdfs-table-sink.cc : 652 + 0x2e]
> 10  impalad!impala::HdfsTableSink::WriteRowsToPartition(impala::RuntimeState*, impala::RowBatch*, std::pair std::default_delete >, std::vector std::allocator > >*) [hdfs-table-sink.cc : 282 + 0x21]
> 11  impalad!impala::HdfsTableSink::Send(impala::RuntimeState*, impala::RowBatch*) [hdfs-table-sink.cc : 621 + 0x2e]
> 12  impalad!impala::FragmentInstanceState::ExecInternal() [fragment-instance-state.cc : 422 + 0x58]
> 13  impalad!impala::FragmentInstanceState::Exec() [fragment-instance-state.cc : 106 + 0x16]
> 14  impalad!impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) [query-state.cc : 836 + 0x19]
> 15  impalad!impala::QueryState::StartFInstances()::{lambda()#1}::operator()() const + 0x26
> 16  impalad!boost::detail::function::void_function_obj_invoker0, void>::invoke [function_template.hpp : 159 + 0xc]
> 17  impalad!boost::function0::operator()() const [function_template.hpp : 770 + 0x1d]
> 18  impalad!impala::Thread::SuperviseThread(std::__cxx11::basic_string std::char_traits, std::allocator > const&, 
> std::__cxx11::basic_string, std::allocator 
> > const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) [thread.cc : 360 + 0xf]
> 19  impalad!void 
> boost::_bi::list5 std::char_traits, std::allocator > >, 
> boost::_bi::value, 
> std::allocator > >, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> 
> >::operator() std::char_traits, std::allocator > const&, 
> std::__cxx11::basic_string, std::allocator 
> > const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*), 
> boost::_bi::list0>(boost::_bi::type, void 
> (*&)(std::__cxx11::basic_string, 
> std::allocator > const&, std::__cxx11::basic_string std::char_traits, std::allocator > const&, boost::function ()>, impala::ThreadDebugInfo const*, impala::Promise (impala::PromiseMode)0>*), boost::_bi::list0&, int) [bind.hpp : 531 + 0x15]
> 20  impalad!boost::_bi::bind_t (*)(std::__cxx11::basic_string, 
> std::allocator > const&, std::__cxx11::basic_string std::char_traits, std::allocator > const&, boost::function ()>, impala::ThreadDebugInfo const*, impala::Promise (impala::PromiseMode)0>*), 
> boost::_bi::list5 std::char_traits, std::allocator > >, 
> boost::_bi::value, 
> std::allocator > >, boost::_bi::value >, 
> boost::_bi::value, 
> boost::_bi::value*> > 
> >::operator()() [bind.hpp : 1222 + 0x22]
> 21  impalad!boost::detail::thread_data (*)(std::__cxx11::basic_string, 
> std::allocator > const&, std::__cxx11::basic_string std

[jira] [Created] (IMPALA-10188) Remove unused WebDAV functions

2020-09-23 Thread Abhishek Rawat (Jira)
Abhishek Rawat created IMPALA-10188:
---

 Summary: Remove unused WebDAV functions
 Key: IMPALA-10188
 URL: https://issues.apache.org/jira/browse/IMPALA-10188
 Project: IMPALA
  Issue Type: Improvement
Reporter: Abhishek Rawat
Assignee: Abhishek Rawat


"PROPFIND" and "MKCOL" seem unnecessary and can be removed from Impala's web 
server.

https://github.com/apache/impala/blob/master/be/src/thirdparty/squeasel/squeasel.c#L212






[jira] [Commented] (IMPALA-10112) Consider skipping FpRateTooHigh() check for bloom filters

2020-09-23 Thread Riza Suminto (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201151#comment-17201151
 ] 

Riza Suminto commented on IMPALA-10112:
---

CR is here: [https://gerrit.cloudera.org/c/16499/]

> Consider skipping FpRateTooHigh() check for bloom filters
> -
>
> Key: IMPALA-10112
> URL: https://issues.apache.org/jira/browse/IMPALA-10112
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Riza Suminto
>Priority: Major
>  Labels: performance
>
> This check disables bloom filters on the sender side.
> It is inaccurate in cases where there are duplicate values of the filter key 
> on the build side. E.g. many-to-many join or a join with multiple keys. This 
> could be fixed with some effort, but is probably not worth it, because:
> * Partition filters are probably still worth evaluating even if there are 
> false positives, because it's cheap and eliminating a partition is still 
> beneficial.
> * Runtime filters are dynamically disabled on the scan side if they are 
> ineffective. I think we still also "evaluate" the always true filter, which 
> is cheaper than doing the hashing and bloom evaluation, but still not 
> entirely free.
> * The disabling is fairly unlikely to kick in for partitioned joins because 
> it's only applied to a small subset of the filter, before the Or() operation.
> So it's potentially harmful and only likely beneficial for broadcast join 
> filters, in which case it saves a small amount of scan CPU and, for global 
> filters, coordinator RPCs and broadcasting. It's unclear that the complexity 
> is worth it for this relatively small and uncertain benefit.
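To see why duplicate build-side values can make such a check fire spuriously, here is a rough sketch based on the textbook Bloom filter estimate (1 - e^(-k*n/m))^k. It is not Impala's FpRateTooHigh() implementation, which is not shown in this thread. The function names, the 1 MiB filter size, the 8 hash functions and the 0.75 threshold are all assumptions picked for illustration, and the sketch assumes the check is driven by the number of inserted rows rather than the number of distinct keys.

```cpp
#include <cassert>
#include <cmath>

// Textbook false-positive estimate for a Bloom filter with num_bits bits,
// num_hashes hash functions and num_keys inserted keys:
//   (1 - e^(-k * n / m))^k
// Assumption: a classic Bloom filter, not Impala's block Bloom filter.
double EstimateFpRate(double num_bits, double num_hashes, double num_keys) {
  return std::pow(1.0 - std::exp(-num_hashes * num_keys / num_bits), num_hashes);
}

// Hypothetical sender-side check: report "too high" when the estimate
// exceeds max_fp_rate. num_inserted is whatever count the caller tracks.
bool FpRateTooHighSketch(double num_bits, double num_hashes,
                         double num_inserted, double max_fp_rate) {
  return EstimateFpRate(num_bits, num_hashes, num_inserted) > max_fp_rate;
}

// Usage example with illustrative numbers only.
void DemoFpRateCheck() {
  const double kBits = 8388608.0;  // 1 MiB filter (assumed)
  const double kHashes = 8.0;      // assumed
  const double kMaxFpRate = 0.75;  // assumed threshold

  // 100,000 distinct keys: the estimate is tiny, so the filter is kept.
  assert(!FpRateTooHighSketch(kBits, kHashes, 100000.0, kMaxFpRate));

  // Same 100,000 distinct keys, each repeated 100 times. If the check is
  // fed the row count (10,000,000), the estimate is close to 1 and the
  // filter is disabled although its actual contents did not change.
  assert(FpRateTooHighSketch(kBits, kHashes, 10000000.0, kMaxFpRate));
}
```

Feeding the row count instead of the distinct-key count is the whole effect here: the bits set in the filter are identical in both calls, only the estimate differs.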






[jira] [Commented] (IMPALA-9923) Data loading of TPC-DS ORC fails with "Fail to get checksum"

2020-09-23 Thread David Rorke (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201183#comment-17201183
 ] 

David Rorke commented on IMPALA-9923:
-

We're still seeing multiple test failures from this issue on an almost 
continuous basis.  I'm wondering whether it would be better to simply remove 
the ORC format from the tests by default until this issue is resolved.  At this 
point the cost of including ORC seems greater than the benefit.

> Data loading of TPC-DS ORC fails with "Fail to get checksum"
> 
>
> Key: IMPALA-9923
> URL: https://issues.apache.org/jira/browse/IMPALA-9923
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Zoltán Borók-Nagy
>Priority: Critical
>  Labels: broken-build, flaky
> Attachments: load-tpcds-core-hive-generated-orc-def-block.sql, 
> load-tpcds-core-hive-generated-orc-def-block.sql.log
>
>
> {noformat}
> INFO  : Loading data to table tpcds_orc_def.store_sales partition 
> (ss_sold_date_sk=null) from 
> hdfs://localhost:20500/test-warehouse/managed/tpcds.store_sales_orc_def
> INFO  : 
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask. java.io.IOException: Fail to get 
> checksum, since file 
> /test-warehouse/managed/tpcds.store_sales_orc_def/ss_sold_date_sk=2451646/base_003/_orc_acid_version
>  is under construction.
> INFO  : Completed executing 
> command(queryId=ubuntu_20200707055650_a1958916-1e85-4db5-b1bc-cc63d80b3537); 
> Time taken: 14.512 seconds
> INFO  : OK
> Error: Error while compiling statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.MoveTask. java.io.IOException: Fail to 
> get checksum, since file 
> /test-warehouse/managed/tpcds.store_sales_orc_def/ss_sold_date_sk=2451646/base_003/_orc_acid_version
>  is under construction. (state=08S01,code=1)
> java.sql.SQLException: Error while compiling statement: FAILED: Execution 
> Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. 
> java.io.IOException: Fail to get checksum, since file 
> /test-warehouse/managed/tpcds.store_sales_orc_def/ss_sold_date_sk=2451646/base_003/_orc_acid_version
>  is under construction.
>   at 
> org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:401)
>   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:266)
>   at org.apache.hive.beeline.Commands.executeInternal(Commands.java:1007)
>   at org.apache.hive.beeline.Commands.execute(Commands.java:1217)
>   at org.apache.hive.beeline.Commands.sql(Commands.java:1146)
>   at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1497)
>   at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1355)
>   at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:1329)
>   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1127)
>   at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1082)
>   at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:546)
>   at org.apache.hive.beeline.BeeLine.main(BeeLine.java:528)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
> Closing: 0: jdbc:hive2://localhost:11050/default;auth=none
> {noformat}
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11223/






[jira] [Comment Edited] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201255#comment-17201255
 ] 

guojingfeng edited comment on IMPALA-10186 at 9/24/20, 4:08 AM:


Yeah, I inspected the parquet file, which contains an empty page: 
{code:java}
offset               first row index      compressed size
4201495              0                    65553
4267048              34900                64711
-1                   -1                   0
4331759              67200                43746
{code}


was (Author: guojingfeng):
Yeah, I inspected the parquet file, which contains an empty page: 

> Write invalid parquet PageLocations which table sort by some columns
> 
>
> Key: IMPALA-10186
> URL: https://issues.apache.org/jira/browse/IMPALA-10186
> Project: IMPALA
>  Issue Type: Bug
>Reporter: guojingfeng
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>
> The current parquet writer writes -1 for PageLocation.offset and 
> PageLocation.first_row_index when it meets an empty page. 
>  hdfs-parquet-table-writer.cc  Line: 808 ~ 819
> {code:java}
>   // Write data pages
>   for (const DataPage& page : pages_) {
>     if (page.header.data_page_header.num_values == 0) {
>       // Skip empty pages
>       location.offset = -1;
>       location.compressed_page_size = 0;
>       location.first_row_index = -1;
>       AddLocationToOffsetIndex(location);
>       continue;
>     }
> {code}
> But the -1 values may cause the ComputeCandidatePages function to run into an 
> unexpected state.
> {code:java}
> bool ComputeCandidatePages(
>     const vector<parquet::PageLocation>& page_locations,
>     const vector<RowRange>& candidate_ranges,
>     const int64_t num_rows, vector<int>* candidate_pages) {
>   if (!ValidatePageLocations(page_locations, num_rows)) return false;
> {code}
> and this in turn causes IMPALA-9952.
>  
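The failure mode described above can be sketched in a few lines. The body of ValidatePageLocations is not quoted in this thread, so the check below is only an assumption about what such a validation might enforce (non-negative offsets, first row indexes that are in range and strictly increasing). PageLocation is a simplified stand-in for the Thrift-generated struct, the *Sketch and DropEmptyPages names are hypothetical, and num_rows = 100000 is an illustrative value. The four entries are the ones from the inspected file.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for the Thrift-generated parquet::PageLocation.
struct PageLocation {
  int64_t offset;
  int32_t compressed_page_size;
  int64_t first_row_index;
};

// Hypothetical validation (an assumption, not Impala's real code):
// every offset must be non-negative and every first_row_index must be
// within [0, num_rows) and strictly larger than the previous one.
bool ValidatePageLocationsSketch(const std::vector<PageLocation>& locations,
                                 int64_t num_rows) {
  int64_t prev_first_row = -1;
  for (const PageLocation& loc : locations) {
    if (loc.offset < 0) return false;
    if (loc.first_row_index < 0 || loc.first_row_index >= num_rows) return false;
    if (loc.first_row_index <= prev_first_row) return false;
    prev_first_row = loc.first_row_index;
  }
  return true;
}

// The page locations reported in the comment, including the empty page.
std::vector<PageLocation> SampleLocations() {
  return {{4201495, 65553, 0},
          {4267048, 64711, 34900},
          {-1, 0, -1},  // empty page written with -1 placeholders
          {4331759, 43746, 67200}};
}

// One possible mitigation, shown only for contrast: leave entries for
// empty pages out of the index instead of writing -1 placeholders.
std::vector<PageLocation> DropEmptyPages(const std::vector<PageLocation>& in) {
  std::vector<PageLocation> out;
  for (const PageLocation& loc : in) {
    if (loc.compressed_page_size > 0) out.push_back(loc);
  }
  return out;
}

// Usage example: the index as written fails validation, while the same
// index without the empty-page entry passes.
void DemoPageLocations() {
  const int64_t kNumRows = 100000;  // illustrative row count
  assert(!ValidatePageLocationsSketch(SampleLocations(), kNumRows));
  assert(ValidatePageLocationsSketch(DropEmptyPages(SampleLocations()), kNumRows));
}
```

Whether the writer should omit the entry or the reader should tolerate it is a design question for the issue itself; the sketch only shows that a single -1 entry is enough to make an otherwise consistent index fail a check of this kind.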






[jira] [Commented] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201255#comment-17201255
 ] 

guojingfeng commented on IMPALA-10186:
--

Yeah, I inspected the parquet file, which contains an empty page: 







[jira] [Comment Edited] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201255#comment-17201255
 ] 

guojingfeng edited comment on IMPALA-10186 at 9/24/20, 4:09 AM:


Yeah, I inspected the parquet file, which contains an empty page: 
{code:java}
offset               first row index      compressed size
4201495              0                    65553
4267048              34900                64711
-1                   -1                   0
4331759              67200                43746
{code}


was (Author: guojingfeng):
Yeah, I inspected the parquet file, which contains an empty page: 
{code:java}
offset               first row index      compressed size
4201495              0                    65553
4267048              34900                64711
-1                   -1                   0
4331759              67200                43746
{code}







[jira] [Comment Edited] (IMPALA-10186) Write invalid parquet PageLocations which table sort by some columns

2020-09-23 Thread guojingfeng (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201255#comment-17201255
 ] 

guojingfeng edited comment on IMPALA-10186 at 9/24/20, 4:10 AM:


Yeah, I inspected the parquet file, which contains an empty page: 
{code:java}
offset               first row index      compressed size
4201495              0                    65553
4267048              34900                64711
-1                   -1                   0       // this page is empty
4331759              67200                43746
{code}


was (Author: guojingfeng):
Yeah, I inspected the parquet file, which contains an empty page: 
{code:java}
offset               first row index      compressed size
4201495              0                    65553
4267048              34900                64711
-1                   -1                   0
4331759              67200                43746
{code}



