[jira] [Commented] (IMPALA-9782) KuduPartitionExpr is not thread-safe

2020-06-02 Thread Todd Lipcon (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124304#comment-17124304
 ] 

Todd Lipcon commented on IMPALA-9782:
-

I seem to recall that Impala code has some facilities for exprs to get an 
execution-thread-local context, no? I had a patch last year that used this for 
re2 context objects; it seems like we could use the same for KuduPartitioner 
objects. In that case, something like Clone would be sufficient. I'm also not 
convinced that it would be problematic to instantiate multiple of them -- it 
looks like an RPC, but the LookupTabletByKey call goes through a caching code 
path that's heavily optimized.
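
To illustrate that shape (the real code is C++ in the Impala backend and the Kudu 
client; every name below is hypothetical, just to show keeping the non-thread-safe 
partitioner on the per-thread evaluator instead of on the shared expr):

{code:java}
import java.util.function.Supplier;

// Hypothetical sketch: the shared, immutable expr holds only a factory; each
// evaluator (one per execution thread) gets its own partitioner instance, so
// the internal row buffer is never shared across threads.
interface Partitioner {
  int partitionFor(byte[] rowKey);  // uses an internal buffer -> not thread-safe
}

final class KuduPartitionExprSketch {
  private final Supplier<Partitioner> partitionerFactory;  // expensive init hidden behind clone/create

  KuduPartitionExprSketch(Supplier<Partitioner> partitionerFactory) {
    this.partitionerFactory = partitionerFactory;
  }

  /** Called once per execution thread (cf. a ScalarExprEvaluator-style context). */
  Evaluator newEvaluator() {
    return new Evaluator(partitionerFactory.get());
  }

  static final class Evaluator {
    private final Partitioner partitioner;  // thread-confined copy

    Evaluator(Partitioner partitioner) { this.partitioner = partitioner; }

    int eval(byte[] rowKey) { return partitioner.partitionFor(rowKey); }
  }
}
{code}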

> KuduPartitionExpr is not thread-safe
> 
>
> Key: IMPALA-9782
> URL: https://issues.apache.org/jira/browse/IMPALA-9782
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Reporter: Tim Armstrong
>Priority: Major
>
> This is a blocker for supporting Kudu DML with mt_dop. The expression has 
> some mutable objects in the ScalarExpr object that needs to be moved to the 
> evaluator.
> KuduPartitioner is not thread-safe, because it uses an internal buffer. We 
> should consider making it thread-safe or clonable, because the initialisation 
> is expensive and does RPCs and things - 
> https://github.com/cloudera/kudu/blob/master/src/kudu/client/partitioner-internal.cc#L40
> KuduPartialRow is inherently not thread-safe because it contains thread-local 
> data.






[jira] [Commented] (IMPALA-3189) Address scalability issue with N^2 KDC requests on cluster startup

2019-12-06 Thread Todd Lipcon (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990091#comment-16990091
 ] 

Todd Lipcon commented on IMPALA-3189:
-

This should be largely better with KRPC since we maintain long-running 
connections between nodes. Do people still see this issue on the first query 
after startup?

> Address scalability issue with N^2 KDC requests on cluster startup
> --
>
> Key: IMPALA-3189
> URL: https://issues.apache.org/jira/browse/IMPALA-3189
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Distributed Exec, Security
>Affects Versions: Impala 2.5.0
>Reporter: Henry Robinson
>Priority: Critical
>  Labels: kerberos, scalability
>
> When Impala runs a query that shuffles data amongst all nodes in a 
> Kerberos-secured cluster, every node will need to acquire a TGS for every 
> other node. In a cluster of 100 nodes or more, this can overwhelm the KDC, 
> and queries can exit with an error ("Could not contact KDC for realm").
> A simple workaround is to run a warm-up query until it succeeds (which can 
> take a few minutes after cluster startup). The KDC can also be scaled (e.g. 
> with secondary KDC nodes). 
> Impala can also consider either forcing a TGS request on start-up in a 
> staggered fashion, or we can move to recommending SSL + client certificates 
> for server<->server communication.






[jira] [Commented] (IMPALA-8739) FileMetadataLoader skips empty directories

2019-07-08 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880769#comment-16880769
 ] 

Todd Lipcon commented on IMPALA-8739:
-

[~gopalv] how does Hive handle this case? The presence of an empty base 
directory is semantically relevant, but it seems like it isn't returned by the 
recursive listFiles() call. Do we need a new recursive listFiles() variant that 
also returns directory entries?
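
A rough sketch of what such a listing could look like, using only the standard 
Hadoop FileSystem API (this is not the FileMetadataLoader code; it just shows a 
recursive listStatus walk that keeps directory entries):

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RecursiveListingSketch {
  /**
   * Recursive listing that returns directory entries too, so an empty ACID
   * base/delta directory remains visible to the caller; listFiles(path, true)
   * would silently skip it.
   */
  public static List<FileStatus> listFilesAndDirs(FileSystem fs, Path dir)
      throws IOException {
    List<FileStatus> result = new ArrayList<>();
    for (FileStatus stat : fs.listStatus(dir)) {
      result.add(stat);  // keep the entry even if it is an empty directory
      if (stat.isDirectory()) {
        result.addAll(listFilesAndDirs(fs, stat.getPath()));
      }
    }
    return result;
  }
}
{code}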

> FileMetadataLoader skips empty directories
> --
>
> Key: IMPALA-8739
> URL: https://issues.apache.org/jira/browse/IMPALA-8739
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Priority: Major
>  Labels: impala-acid
>
> {{FileMetadataLoader}} has certain code paths, like the one below, which use 
> the {{listFiles}} API on the filesystem. This API ignores empty directories, 
> which is okay for non-transactional tables. However, in the case of a 
> transactional table, an empty base directory provides writeId information 
> which is used to skip loading files that are not relevant for a given 
> writeId. See {{AcidUtils#filterFilesForAcidState}} usage for details.






[jira] [Resolved] (IMPALA-8681) null ValidWriteIdLists written into profile

2019-07-08 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8681.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> null ValidWriteIdLists written into profile
> ---
>
> Key: IMPALA-8681
> URL: https://issues.apache.org/jira/browse/IMPALA-8681
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Todd Lipcon
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: impala-acid
> Fix For: Impala 3.3.0
>
>
> I see the following in the profile of a query on a non-ACID table:
> {code}  Loaded ValidWriteIdLists:
>null
>null
>null
>null
>null
>  : 1.0m (61744034316)
> {code} which is not useful and may confuse users






[jira] [Commented] (IMPALA-7539) Support HDFS permissions checks with LocalCatalog

2019-07-01 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876527#comment-16876527
 ] 

Todd Lipcon commented on IMPALA-7539:
-

One consideration: these checks are actually a big performance hit when 
loading tables in the legacy catalog v1 mode.

I think we should consider doing this only for external tables that we see are 
outside of the warehouse directory. If a table's in the warehouse, we should 
just assume it's writable.
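
A minimal sketch of that heuristic, assuming the warehouse root is available from 
configuration (all names here are made up for illustration):

{code:java}
import org.apache.hadoop.fs.Path;

public class WritabilityCheckSketch {
  private final Path warehouseRoot;  // e.g. the value of hive.metastore.warehouse.dir

  public WritabilityCheckSketch(Path warehouseRoot) {
    this.warehouseRoot = warehouseRoot;
  }

  /**
   * Only external locations outside the warehouse need the expensive HDFS
   * permission probe; anything under the warehouse is assumed writable.
   */
  public boolean needsPermissionCheck(Path tableLocation) {
    for (Path p = tableLocation; p != null; p = p.getParent()) {
      if (p.equals(warehouseRoot)) return false;  // inside the warehouse: skip
    }
    return true;  // outside the warehouse: actually check permissions
  }
}
{code}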

> Support HDFS permissions checks with LocalCatalog
> -
>
> Key: IMPALA-7539
> URL: https://issues.apache.org/jira/browse/IMPALA-7539
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: catalog-v2
>
> LocalTable currently stubs out checks for whether tables and partitions are 
> writable. This means that inserts into non-writable partitions will pass 
> planning and fail during execution. We should likely implement these checks, 
> or decide that this feature is unnecessary (it's quite expensive)






[jira] [Resolved] (IMPALA-7438) Automated test for concurrent DDL and metadata queries

2019-07-01 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-7438.
-
   Resolution: Fixed
Fix Version/s: Impala 3.1.0

> Automated test for concurrent DDL and metadata queries
> --
>
> Key: IMPALA-7438
> URL: https://issues.apache.org/jira/browse/IMPALA-7438
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: catalog-v2
> Fix For: Impala 3.1.0
>
>
> The "localcatalog" implementation has some special provisions to ensure that 
> queries are planned on consistent snapshots, and any inconsistencies are 
> detected and resolved by retries. In IMPALA-7436 I tested this manually, but 
> we should write an automated regression test for this functionality.






[jira] [Commented] (IMPALA-7438) Automated test for concurrent DDL and metadata queries

2019-07-01 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876523#comment-16876523
 ] 

Todd Lipcon commented on IMPALA-7438:
-

I think this is more or less covered by TestLocalCatalogRetries that Vuk and I 
worked on in 5cc49c343f8 and 94cfdac664c

> Automated test for concurrent DDL and metadata queries
> --
>
> Key: IMPALA-7438
> URL: https://issues.apache.org/jira/browse/IMPALA-7438
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Priority: Major
>  Labels: catalog-v2
>
> The "localcatalog" implementation has some special provisions to ensure that 
> queries are planned on consistent snapshots, and any inconsistencies are 
> detected and resolved by retries. In IMPALA-7436 I tested this manually, but 
> we should write an automated regression test for this functionality.






[jira] [Commented] (IMPALA-8685) Evaluate default configuration of NUM_REMOTE_EXECUTOR_CANDIDATES

2019-06-21 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869735#comment-16869735
 ] 

Todd Lipcon commented on IMPALA-8685:
-

If we set it to 1 and it does cause scheduling skew, is there an easy metric 
that shows up in the profile that would allow us to detect the skew? I guess we 
have the per-scan-node range counts and bytes read -- those should be sufficient, 
right? If so, we can document somewhere in our perf-tuning docs, etc., that if 
you see scanner skew on a remote data store, setting 
NUM_REMOTE_EXECUTOR_CANDIDATES to a higher value can reduce skew, at the 
expense of decreasing effective cache capacity.

I wonder if a rename of NUM_REMOTE_EXECUTOR_CANDIDATES would also be useful, if 
it's not too late (has this been released already?)
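
As a rough illustration of the kind of check such docs or tooling could describe 
(the threshold and the way per-instance counters are obtained are assumptions, 
not Impala code):

{code:java}
public class ScanSkewSketch {
  /** Flags skew when the busiest scan instance read much more than the mean. */
  public static boolean looksSkewed(long[] bytesReadPerInstance, double maxOverMeanRatio) {
    long total = 0, max = 0;
    for (long b : bytesReadPerInstance) {
      total += b;
      max = Math.max(max, b);
    }
    double mean = (double) total / bytesReadPerInstance.length;
    return mean > 0 && max / mean > maxOverMeanRatio;
  }

  public static void main(String[] args) {
    // Hypothetical per-instance bytes-read counters pulled from a profile.
    long[] bytesRead = {900_000_000L, 100_000_000L, 120_000_000L, 80_000_000L};
    System.out.println(looksSkewed(bytesRead, 2.0));  // true -> consider raising the option
  }
}
{code}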

> Evaluate default configuration of NUM_REMOTE_EXECUTOR_CANDIDATES
> 
>
> Key: IMPALA-8685
> URL: https://issues.apache.org/jira/browse/IMPALA-8685
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Michael Ho
>Assignee: Joe McDonnell
>Priority: Critical
>
> The query option {{NUM_REMOTE_EXECUTOR_CANDIDATES}} is set to 3 by default. 
> This means that there are potentially 3 different executors which can process 
> a remote scan range. Over time, the data of a given remote scan range will be 
> spread across these 3 executors. My understanding of why this is not set to 1 
> is to avoid hot spots in pathological cases. On the other hand, this may mean 
> that we may not maximize the utilization of the file handle cache and data 
> cache. Also, for small clusters (e.g. a 3 node cluster), the default value 
> may render deterministic remote scan range scheduling ineffective. We may 
> want to re-evaluate the default value of {{NUM_REMOTE_EXECUTOR_CANDIDATES}}. 
> One possible idea is to set it to min(3, half of cluster size) so it works 
> okay with small clusters, which may be rather common for demo purposes. 
> However, it doesn't address the problem of cache effectiveness in larger 
> clusters as the footprint of the cache is still amplified by 
> {{NUM_REMOTE_EXECUTOR_CANDIDATES}}. There may also be other criteria for 
> evaluating the default value.
> cc'ing [~joemcdonnell], [~tlipcon] and [~drorke]






[jira] [Assigned] (IMPALA-8681) null ValidWriteIdLists written into profile

2019-06-21 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-8681:
---

Assignee: Yongzhi Chen

Hey Yongzhi, mind taking a look?

> null ValidWriteIdLists written into profile
> ---
>
> Key: IMPALA-8681
> URL: https://issues.apache.org/jira/browse/IMPALA-8681
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Todd Lipcon
>Assignee: Yongzhi Chen
>Priority: Major
>  Labels: impala-acid
>
> I see the following in the profile of a query on a non-ACID table:
> {code}  Loaded ValidWriteIdLists:
>null
>null
>null
>null
>null
>  : 1.0m (61744034316)
> {code} which is not useful and may confuse users






[jira] [Resolved] (IMPALA-8667) Remove --pull_incremental_statistics flag (on by default)

2019-06-21 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8667.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Remove --pull_incremental_statistics flag (on by default)
> -
>
> Key: IMPALA-8667
> URL: https://issues.apache.org/jira/browse/IMPALA-8667
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
>  Labels: ramp-up
> Fix For: Impala 3.3.0
>
>
> This flag was introduced as a "chicken bit" so we could disable it if 
> something goes wrong. So far we haven't seen any issues. Let's remove the 
> flag and the old code paths so we don't need to maintain them.






[jira] [Assigned] (IMPALA-8606) GET_TABLES performance in local catalog mode

2019-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-8606:
---

Assignee: (was: Todd Lipcon)

> GET_TABLES performance in local catalog mode
> 
>
> Key: IMPALA-8606
> URL: https://issues.apache.org/jira/browse/IMPALA-8606
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Balazs Jeszenszky
>Priority: Critical
>  Labels: catalog-v2
>
> With local catalog mode enabled, GET_TABLES JDBC requests will return more 
> than the always available table information. Any request for more metadata 
> about a table will trigger a full load of that table on the catalogd side, 
> meaning that GET_TABLES triggers the load of the entire catalog. Also, as far 
> as I can see, the requests for more metadata are made one table at a time. 
> Once the tables are loaded on the catalogd-side, a coordinator needs 3 
> roundtrips to the catalog to fetch all the details about a single table. My 
> test case had around 57k tables, 1700 DBs, and ~120k partitions. 
> GET_TABLES on a cold catalog takes 18 minutes. With a warm catalog, but cold 
> impalad, it still takes ~70 seconds.
> Many tools use GET_TABLES to populate dropdowns, etc. so this is bad for both 
> end user experience and catalog memory usage.






[jira] [Updated] (IMPALA-8606) GET_TABLES performance in local catalog mode

2019-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-8606:

Priority: Blocker  (was: Critical)

> GET_TABLES performance in local catalog mode
> 
>
> Key: IMPALA-8606
> URL: https://issues.apache.org/jira/browse/IMPALA-8606
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Balazs Jeszenszky
>Priority: Blocker
>  Labels: catalog-v2
>
> With local catalog mode enabled, GET_TABLES JDBC requests will return more 
> than the always available table information. Any request for more metadata 
> about a table will trigger a full load of that table on the catalogd side, 
> meaning that GET_TABLES triggers the load of the entire catalog. Also, as far 
> as I can see, the requests for more metadata are made one table at a time. 
> Once the tables are loaded on the catalogd-side, a coordinator needs 3 
> roundtrips to the catalog to fetch all the details about a single table. My 
> test case had around 57k tables, 1700 DBs, and ~120k partitions. 
> GET_TABLES on a cold catalog takes 18 minutes. With a warm catalog, but cold 
> impalad, it still takes ~70 seconds.
> Many tools use GET_TABLES to populate dropdowns, etc. so this is bad for both 
> end user experience and catalog memory usage.






[jira] [Updated] (IMPALA-7935) /catalog_object end point broken in LocalCatalog mode

2019-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-7935:

Priority: Minor  (was: Major)

> /catalog_object end point broken in LocalCatalog mode
> -
>
> Key: IMPALA-7935
> URL: https://issues.apache.org/jira/browse/IMPALA-7935
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Assignee: Anurag Mantripragada
>Priority: Minor
>  Labels: catalog-v2, observability
>
> Start Impala coordinator in LocalCatalog mode (v2)
> {code}
> start-impala-cluster.py -s 1 --impalad_args="--use_local_catalog=true" 
> --catalogd_args="--catalog_topic_mode=minimal"
> {code}
> Check the following URL.
> http://:25000/catalog_object?object_type=TABLE&object_name=functional_text_lzo.alltypesaggmultifiles
> It says,
> {noformat}
> Error: UnsupportedOperationException: LocalCatalog.getTCatalogObject
> {noformat}






[jira] [Resolved] (IMPALA-8567) Many random catalog consistency issues with catalog v2/event processor

2019-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8567.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

Think this should be fixed now

> Many random catalog consistency issues with catalog v2/event processor
> --
>
> Key: IMPALA-8567
> URL: https://issues.apache.org/jira/browse/IMPALA-8567
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: broken-build, catalog, catalog-v2, flaky
> Fix For: Impala 3.3.0
>
>
> [~tlipcon] [~vihangk1] FYI. I'm not sure whether the local catalog or the 
> event processor is likely to blame here so I'll let you look. The general 
> theme is tables and databases not existing when they should.
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/289/testReport/junit/metadata.test_refresh_partition/TestRefreshPartition/test_drop_hive_partition_and_refresh_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___5000___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/267/testReport/junit/query_test.test_kudu/TestKuduOperations/test_kudu_insert_protocol__beeswax___exec_optionkudu_read_modeREAD_AT_SNAPSHOTbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/286/testReport/junit/metadata.test_metadata_query_statements/TestMetadataQueryStatements/test_describe_db_protocol__beeswax___exec_optionsync_ddl___0___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/286/testReport/junit/metadata.test_hms_integration/TestHmsIntegrationSanity/test_sanity_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___5000___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/288/testReport/junit/query_test.test_insert_parquet/TestHdfsParquetTableStatsWriter/test_write_statistics_multiple_row_groups_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> I'll include the output of each job in a follow-on comment.






[jira] [Resolved] (IMPALA-7534) Handle invalidation races in CatalogdMetaProvider cache

2019-06-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-7534.
-
Resolution: Fixed

> Handle invalidation races in CatalogdMetaProvider cache
> ---
>
> Key: IMPALA-7534
> URL: https://issues.apache.org/jira/browse/IMPALA-7534
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Not Applicable
>
>
> There is a well-known race in Guava's LoadingCache that we are using for 
> CatalogdMetaProvider which we are not currently handling:
> - thread 1 gets a cache miss and makes a request to fetch some data from the 
> catalogd. It fetches the catalog object with version 1 and then gets context 
> switched out or otherwise slow
> - thread 2 receives an invalidation for the same object, because it has 
> changed to v2. It calls 'invalidate' on the cache, but nothing is yet cached.
> - thread 1 puts back v1 of the object into the cache
> In essence we've "missed" an invalidation. This is also described in this 
> nice post: https://softwaremill.com/race-condition-cache-guava-caffeine/
> The race is quite unlikely but could cause some unexpected results that are 
> hard to reason about, so we should look into a fix.






[jira] [Created] (IMPALA-8681) null ValidWriteIdLists written into profile

2019-06-19 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8681:
---

 Summary: null ValidWriteIdLists written into profile
 Key: IMPALA-8681
 URL: https://issues.apache.org/jira/browse/IMPALA-8681
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.3.0
Reporter: Todd Lipcon


I see the following in the profile of a query on a non-ACID table:
{code}  Loaded ValidWriteIdLists:
   null
   null
   null
   null
   null
 : 1.0m (61744034316)
{code} which is not useful and may confuse users






[jira] [Commented] (IMPALA-7615) Partition metadata mismatch should be handled gracefully in local catalog mode.

2019-06-18 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867118#comment-16867118
 ] 

Todd Lipcon commented on IMPALA-7615:
-

How can this happen? It seems like the request to load the partitions should be 
tied to a particular version number of the table, and if a partition has been 
removed, then the table's version number should have changed. So, it sounds 
like we've got some catalogd code which is being sloppy about locking or isn't 
bumping the table version number when it should be?
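
For reference, a minimal, self-contained sketch of the gentler failure mode the 
report below asks for -- surfacing a missing partition id as retryable 
inconsistent metadata rather than an IllegalArgumentException (the exception 
name is taken from the description; everything else here is hypothetical):

{code:java}
import java.util.Map;

/** Hypothetical retryable exception, standing in for the one the report suggests. */
class InconsistentMetadataException extends RuntimeException {
  InconsistentMetadataException(String msg) { super(msg); }
}

class PartitionLookupSketch {
  /**
   * A missing partition id means the caller's view of the table is stale, so
   * raise a retryable error instead of aborting the query.
   */
  static <P> P getPartitionOrThrow(Map<Long, P> partitionMap, long partId) {
    P part = partitionMap.get(partId);
    if (part == null) {
      throw new InconsistentMetadataException(
          "Partition id " + partId + " does not exist; table metadata has changed");
    }
    return part;
  }
}
{code}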

> Partition metadata mismatch should be handled gracefully in local catalog 
> mode.
> ---
>
> Key: IMPALA-7615
> URL: https://issues.apache.org/jira/browse/IMPALA-7615
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 3.1.0
>Reporter: bharath v
>Priority: Major
>
> *This is a Catalog v2 only improvement*
> An RPC to fetch partition metadata for a partition ID that does not exist on 
> the Catalog server currently throws IAE.
> {noformat}
> @Override
>   public TGetPartialCatalogObjectResponse getPartialInfo(
>   TGetPartialCatalogObjectRequest req) throws TableLoadingException {
>   for (long partId : partIds) {
> HdfsPartition part = partitionMap_.get(partId);
> Preconditions.checkArgument(part != null, "Partition id %s does not 
> exist",  <--
> partId);
> TPartialPartitionInfo partInfo = new TPartialPartitionInfo(partId);
> if (req.table_info_selector.want_partition_names) {
>   partInfo.setName(part.getPartitionName());
> }
> if (req.table_info_selector.want_partition_metadata) {
>   partInfo.hms_partition = part.toHmsPartition();
> {noformat}
> This is undesirable since such exceptions are not transparently retried in 
> the frontend. Instead we should fix this code path to throw 
> InconsistentMetadataException, similar to what we do for other code paths 
> that handle such inconsistent metadata like version changes.
> An example stack trace that hits this issue looks like follows,
> {noformat}
> org.apache.impala.catalog.local.LocalCatalogException: Could not load 
> partitions for table partition_level_tests.store_sales
> at 
> org.apache.impala.catalog.local.LocalFsTable.loadPartitions(LocalFsTable.java:399)
> at 
> org.apache.impala.catalog.FeCatalogUtils.loadAllPartitions(FeCatalogUtils.java:207)
> at 
> org.apache.impala.catalog.local.LocalFsTable.getMajorityFormat(LocalFsTable.java:244)
> at 
> org.apache.impala.planner.HdfsTableSink.computeResourceProfile(HdfsTableSink.java:75)
> at 
> org.apache.impala.planner.PlanFragment.computeResourceProfile(PlanFragment.java:233)
> at org.apache.impala.planner.Planner.computeResourceReqs(Planner.java:365)
> at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1020)
> at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1162)
> at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1077)
> at 
> org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156)
> Caused by: org.apache.thrift.TException: 
> TGetPartialCatalogObjectResponse(status:TStatus(status_code:GENERAL, 
> error_msgs:[IllegalArgumentException: Partition id 10084 does not exist]), 
> lookup_status:OK)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.sendRequest(CatalogdMetaProvider.java:322)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.loadPartitionsFromCatalogd(CatalogdMetaProvider.java:644)
> at 
> org.apache.impala.catalog.local.CatalogdMetaProvider.loadPartitionsByRefs(CatalogdMetaProvider.java:610)
> at 
> org.apache.impala.catalog.local.LocalFsTable.loadPartitions(LocalFsTable.java:395)
> ... 9 more{noformat}






[jira] [Created] (IMPALA-8675) Remove db/table count metrics from impalad in LocalCatalog mode

2019-06-18 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8675:
---

 Summary: Remove db/table count metrics from impalad in 
LocalCatalog mode
 Key: IMPALA-8675
 URL: https://issues.apache.org/jira/browse/IMPALA-8675
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog
Reporter: Todd Lipcon


In LocalCatalog there is no need for every coordinator to have the full list of 
tables of every database. But, getCatalogMetrics ends up iterating over every 
DB and fetching these lists in order to provide a count. The count isn't 
particularly relevant -- if someone wants to keep track of the size of their 
catalog they are better off looking at that metric from catalogd. We should 
remove these catalog metrics.






[jira] [Work stopped] (IMPALA-8631) Ensure that cached data is always up to date to avoid reads based on stale metadata for transactional read only tables

2019-06-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8631 stopped by Todd Lipcon.
---
> Ensure that cached data is always up to date to avoid reads based on stale 
> metadata for transactional read only tables 
> ---
>
> Key: IMPALA-8631
> URL: https://issues.apache.org/jira/browse/IMPALA-8631
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Dinesh Garg
>Assignee: Todd Lipcon
>Priority: Major
>  Labels: impala-acid
>
> Acquire latest validWriteIdList in the coordinator and validate that the 
> cached data is up to date. Automatically force refresh with query if it’s not.






[jira] [Work started] (IMPALA-8667) Remove --pull_incremental_statistics flag (on by default)

2019-06-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8667 started by Todd Lipcon.
---
> Remove --pull_incremental_statistics flag (on by default)
> -
>
> Key: IMPALA-8667
> URL: https://issues.apache.org/jira/browse/IMPALA-8667
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
>  Labels: ramp-up
>
> This flag was introduced as a "chicken bit" so we could disable it if 
> something goes wrong. So far we haven't seen any issues. Let's remove the 
> flag and the old code paths so we don't need to maintain them.






[jira] [Work started] (IMPALA-8567) Many random catalog consistency issues with catalog v2/event processor

2019-06-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8567 started by Todd Lipcon.
---
> Many random catalog consistency issues with catalog v2/event processor
> --
>
> Key: IMPALA-8567
> URL: https://issues.apache.org/jira/browse/IMPALA-8567
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: broken-build, catalog, flaky
>
> [~tlipcon] [~vihangk1] FYI. I'm not sure whether the local catalog or the 
> event processor is likely to blame here so I'll let you look. The general 
> theme is tables and databases not existing when they should.
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/289/testReport/junit/metadata.test_refresh_partition/TestRefreshPartition/test_drop_hive_partition_and_refresh_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___5000___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/267/testReport/junit/query_test.test_kudu/TestKuduOperations/test_kudu_insert_protocol__beeswax___exec_optionkudu_read_modeREAD_AT_SNAPSHOTbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/286/testReport/junit/metadata.test_metadata_query_statements/TestMetadataQueryStatements/test_describe_db_protocol__beeswax___exec_optionsync_ddl___0___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/286/testReport/junit/metadata.test_hms_integration/TestHmsIntegrationSanity/test_sanity_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___5000___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/288/testReport/junit/query_test.test_insert_parquet/TestHdfsParquetTableStatsWriter/test_write_statistics_multiple_row_groups_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> I'll include the output of each job in a follow-on comment.






[jira] [Work started] (IMPALA-7534) Handle invalidation races in CatalogdMetaProvider cache

2019-06-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7534 started by Todd Lipcon.
---
> Handle invalidation races in CatalogdMetaProvider cache
> ---
>
> Key: IMPALA-7534
> URL: https://issues.apache.org/jira/browse/IMPALA-7534
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Not Applicable
>
>
> There is a well-known race in Guava's LoadingCache that we are using for 
> CatalogdMetaProvider which we are not currently handling:
> - thread 1 gets a cache miss and makes a request to fetch some data from the 
> catalogd. It fetches the catalog object with version 1 and then gets context 
> switched out or otherwise slow
> - thread 2 receives an invalidation for the same object, because it has 
> changed to v2. It calls 'invalidate' on the cache, but nothing is yet cached.
> - thread 1 puts back v1 of the object into the cache
> In essence we've "missed" an invalidation. This is also described in this 
> nice post: https://softwaremill.com/race-condition-cache-guava-caffeine/
> The race is quite unlikely but could cause some unexpected results that are 
> hard to reason about, so we should look into a fix.






[jira] [Assigned] (IMPALA-8667) Remove --pull_incremental_statistics flag (on by default)

2019-06-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-8667:
---

Assignee: Todd Lipcon

> Remove --pull_incremental_statistics flag (on by default)
> -
>
> Key: IMPALA-8667
> URL: https://issues.apache.org/jira/browse/IMPALA-8667
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
>  Labels: ramp-up
>
> This flag was introduced as a "chicken bit" so we could disable it if 
> something goes wrong. So far we haven't seen any issues. Let's remove the 
> flag and the old code paths so we don't need to maintain them.






[jira] [Assigned] (IMPALA-7506) Support global INVALIDATE METADATA on fetch-on-demand impalad

2019-06-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-7506:
---

Assignee: Todd Lipcon  (was: Vuk Ercegovac)

> Support global INVALIDATE METADATA on fetch-on-demand impalad
> -
>
> Key: IMPALA-7506
> URL: https://issues.apache.org/jira/browse/IMPALA-7506
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> There is some complexity with how this is implemented in the original code: 
> it depends on maintaining the minimum version of any object in the impalad's 
> local cache. We can't determine that in an on-demand impalad, so INVALIDATE 
> METADATA is not supported currently on "fetch-on-demand".






[jira] [Commented] (IMPALA-6876) Entries in CatalogUsageMonitor are not cleared after invalidation

2019-06-18 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866898#comment-16866898
 ] 

Todd Lipcon commented on IMPALA-6876:
-

Looked at this for a few minutes today. There are a few more sketchy things 
about this class:
- the comparators for PQ entries reach into Table classes and get mutable 
fields like access count, which means that the PriorityQueue's heap invariant 
might be transiently broken. For example, while inserting table A, it will get 
compared against some existing heap entry B. Entry B might be concurrently 
mutated, and not yet updated in the heap, which means that it could be 
misplaced in the priority queue. Depending on the implementation of the PQ, 
breaking the invariant could cause either wrong results or an exception to be 
thrown (see the sketch after this comment).
- the "alwaysEvict" policy doesn't make much sense -- the original review says 
"I just wanted a cheap way to make sure that frequently accessed tables from 
the past didn't prevent newly accessed tables from ever being inserted in the 
cache. A more elaborate schemes could be time-based rank reduction or something 
like that." However, what it really accomplishes is that you get 24 MFU tables 
plus one MRU table tacked onto the end, i.e. it doesn't allow multiple 
recently-accessed tables to get into the cache, it just gets _one_ tacked onto 
the end.
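
A simplified illustration of the first point (not the real CatalogUsageMonitor 
code): ordering on a live mutable counter versus snapshotting the value when the 
entry is inserted.

{code:java}
import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicLong;

public class HeapInvariantSketch {
  static class Table {
    final String name;
    final AtomicLong accessCount = new AtomicLong();
    Table(String name) { this.name = name; }
  }

  // Risky: the comparator reads a field that another thread may bump at any
  // time, so entries already in the heap can become mis-ordered.
  static PriorityQueue<Table> liveOrdering() {
    return new PriorityQueue<>(
        (a, b) -> Long.compare(a.accessCount.get(), b.accessCount.get()));
  }

  // Safer: freeze the ranking value at insertion time; later mutations of the
  // Table no longer affect the heap's ordering.
  record Entry(Table table, long accessCountSnapshot) {}

  static PriorityQueue<Entry> snapshotOrdering() {
    return new PriorityQueue<>(
        (a, b) -> Long.compare(a.accessCountSnapshot(), b.accessCountSnapshot()));
  }

  public static void main(String[] args) {
    PriorityQueue<Entry> pq = snapshotOrdering();
    Table t = new Table("foo");
    t.accessCount.set(5);
    pq.add(new Entry(t, t.accessCount.get()));
    t.accessCount.incrementAndGet();  // does not disturb the queue
    System.out.println(pq.peek().table().name);
  }
}
{code}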

> Entries in CatalogUsageMonitor are not cleared after invalidation
> -
>
> Key: IMPALA-6876
> URL: https://issues.apache.org/jira/browse/IMPALA-6876
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Dimitris Tsirogiannis
>Priority: Major
>  Labels: memory-leak, ramp-up
>
> The CatalogUsageMonitor in the catalog maintains a small cache of references 
> to tables that: a) are accessed frequently in the catalog and b) have the 
> highest memory requirements. These entries are not cleared upon server or 
> table invalidation, thus preventing the GC from collecting the memory of 
> these tables. We should make sure that the CatalogUsageMonitor does not 
> maintain entries of tables that have been invalidated or deleted. 






[jira] [Resolved] (IMPALA-8459) Cannot delete impala/kudu table if backing kudu table dropped with local catalog

2019-06-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8459.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Cannot delete impala/kudu table if backing kudu table dropped with local 
> catalog
> 
>
> Key: IMPALA-8459
> URL: https://issues.apache.org/jira/browse/IMPALA-8459
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: kudu
> Fix For: Impala 3.3.0
>
>
> test_delete_external_kudu_table and test_delete_managed_kudu_table fail with 
> local catalog, e.g. with:
> {noformat}
> E   HiveServer2Error: LocalCatalogException: Error opening Kudu table 
> 'testimpalakuduintegration_1715_p3r46w.ogslbjblgv', Kudu error: the table 
> does not exist: table_name: "testimpalakuduintegration_1715_p3r46w.ogslbjblgv"
> {noformat}






[jira] [Resolved] (IMPALA-8542) Access trace collection for data cache

2019-06-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8542.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Access trace collection for data cache
> --
>
> Key: IMPALA-8542
> URL: https://issues.apache.org/jira/browse/IMPALA-8542
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> Now that we have a remote-read data cache, it would be useful to log an 
> access trace. The trace can be then fed back into various cache policy 
> simulators to compare the relative performance, and do "what if" analysis 
> (how would hit rate react with larger/smaller capacities)






[jira] [Assigned] (IMPALA-8567) Many random catalog consistency issues with catalog v2/event processor

2019-06-17 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-8567:
---

Assignee: Todd Lipcon  (was: Vihang Karajgaonkar)

> Many random catalog consistency issues with catalog v2/event processor
> --
>
> Key: IMPALA-8567
> URL: https://issues.apache.org/jira/browse/IMPALA-8567
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: broken-build, catalog, flaky
>
> [~tlipcon] [~vihangk1] FYI. I'm not sure whether the local catalog or the 
> event processor is likely to blame here so I'll let you look. The general 
> theme is tables and databases not existing when they should.
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/289/testReport/junit/metadata.test_refresh_partition/TestRefreshPartition/test_drop_hive_partition_and_refresh_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___5000___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/267/testReport/junit/query_test.test_kudu/TestKuduOperations/test_kudu_insert_protocol__beeswax___exec_optionkudu_read_modeREAD_AT_SNAPSHOTbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/286/testReport/junit/metadata.test_metadata_query_statements/TestMetadataQueryStatements/test_describe_db_protocol__beeswax___exec_optionsync_ddl___0___batch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/286/testReport/junit/metadata.test_hms_integration/TestHmsIntegrationSanity/test_sanity_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___5000___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/288/testReport/junit/query_test.test_insert_parquet/TestHdfsParquetTableStatsWriter/test_write_statistics_multiple_row_groups_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___0___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__parquet_none_/
> I'll include the output of each job in a follow-on comment.






[jira] [Commented] (IMPALA-7534) Handle invalidation races in CatalogdMetaProvider cache

2019-06-17 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866066#comment-16866066
 ] 

Todd Lipcon commented on IMPALA-7534:
-

Reading back over Paul's analysis here, I think the missing link is that the 
version-numbered cache keys are used for individual objects, but not the higher 
levels in the hierarchy (like table name list and the top-level table object). 
So, this can cause issues like IMPALA-8567 as described above. Assuming a 
starting state where the table name list is not cached:

- Impalad: some select query, which calls loadTableNames(), and sends a request 
to the catalog
- Catalog: returns a list of tables ['foo'], but the response is still in-flight
- Catalog: someone issues a DDL which creates a table 'bar'. Issues an 
invalidate to all impalads
- Impalad: the loadTableNames() call is still in flight, but receives the 
invalidation via a different thread. The invalidation sees nothing is in the 
cache, so it is ignored.
- Impalad: the loadTableNames() query completes, and the table list ['foo'] is 
cached

This leaves the impalad cache in a persistent incorrect state. New calls to 
loadTableNames() get a cache hit with the incorrect value.

In order to fix this, as discussed in the linked articles, we have a few 
choices:
(1) invalidate can block on any outstanding "loadWithCaching" for the same key, 
and invalidate it after it gets stored in the cache
(2) invalidate can prevent any outstanding "loadWithCaching" from writing back 
its result

Choice 2 is better, to avoid blocking between potentially-unrelated operations; 
a rough sketch of that approach follows.
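
A sketch of choice (2), assuming a per-key generation counter (this is the 
general pattern, not the actual CatalogdMetaProvider change):

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class InvalidationAwareCacheSketch<K, V> {
  private final ConcurrentHashMap<K, Long> generations = new ConcurrentHashMap<>();
  private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

  public void invalidate(K key) {
    generations.merge(key, 1L, Long::sum);  // bump even if nothing is cached yet
    cache.remove(key);
  }

  public V get(K key, Function<K, V> loader) {
    V cached = cache.get(key);
    if (cached != null) return cached;
    long genBefore = generations.getOrDefault(key, 0L);
    V loaded = loader.apply(key);  // e.g. the RPC to catalogd
    // Only write back if no invalidate() raced with the fetch.
    if (generations.getOrDefault(key, 0L) == genBefore) {
      cache.putIfAbsent(key, loaded);
    }
    return loaded;
  }
}
{code}

There is still a small window between the generation check and the write-back; 
a real implementation would need to make that check-and-put atomic (e.g. under 
the cache's per-key lock).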

> Handle invalidation races in CatalogdMetaProvider cache
> ---
>
> Key: IMPALA-7534
> URL: https://issues.apache.org/jira/browse/IMPALA-7534
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Not Applicable
>
>
> There is a well-known race in Guava's LoadingCache that we are using for 
> CatalogdMetaProvider which we are not currently handling:
> - thread 1 gets a cache miss and makes a request to fetch some data from the 
> catalogd. It fetches the catalog object with version 1 and then gets context 
> switched out or otherwise slow
> - thread 2 receives an invalidation for the same object, because it has 
> changed to v2. It calls 'invalidate' on the cache, but nothing is yet cached.
> - thread 1 puts back v1 of the object into the cache
> In essence we've "missed" an invalidation. This is also described in this 
> nice post: https://softwaremill.com/race-condition-cache-guava-caffeine/
> The race is quite unlikely but could cause some unexpected results that are 
> hard to reason about, so we should look into a fix.






[jira] [Reopened] (IMPALA-7534) Handle invalidation races in CatalogdMetaProvider cache

2019-06-17 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reopened IMPALA-7534:
-
  Assignee: Todd Lipcon  (was: Paul Rogers)

I think the analysis here wasn't quite right and this is responsible for 
IMPALA-8567. If I inject a small sleep after sendRequest() for loadTableNames() 
and run a few threads doing create/describe/drop in a loop I can repro the 
issue described there.

> Handle invalidation races in CatalogdMetaProvider cache
> ---
>
> Key: IMPALA-7534
> URL: https://issues.apache.org/jira/browse/IMPALA-7534
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Not Applicable
>
>
> There is a well-known race in Guava's LoadingCache that we are using for 
> CatalogdMetaProvider which we are not currently handling:
> - thread 1 gets a cache miss and makes a request to fetch some data from the 
> catalogd. It fetches the catalog object with version 1 and then gets context 
> switched out or otherwise slow
> - thread 2 receives an invalidation for the same object, because it has 
> changed to v2. It calls 'invalidate' on the cache, but nothing is yet cached.
> - thread 1 puts back v1 of the object into the cache
> In essence we've "missed" an invalidation. This is also described in this 
> nice post: https://softwaremill.com/race-condition-cache-guava-caffeine/
> The race is quite unlikely but could cause some unexpected results that are 
> hard to reason about, so we should look into a fix.






[jira] [Assigned] (IMPALA-8489) TestRecoverPartitions.test_post_invalidate fails with IllegalStateException when HMS polling is enabled

2019-06-14 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-8489:
---

Assignee: Vihang Karajgaonkar  (was: Todd Lipcon)

I think Vihang was going to look at this? If not, feel free to bounce back to me

> TestRecoverPartitions.test_post_invalidate fails with IllegalStateException 
> when HMS polling is enabled
> ---
>
> Key: IMPALA-8489
> URL: https://issues.apache.org/jira/browse/IMPALA-8489
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> {noformat}
> metadata/test_recover_partitions.py:279: in test_post_invalidate
> "INSERT INTO TABLE %s PARTITION(i=002, p='p2') VALUES(4)" % FQ_TBL_NAME)
> common/impala_test_suite.py:620: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:628: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:722: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:180: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:364: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:385: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EQuery aborted:IllegalArgumentException: no such partition id 6244
> {noformat}
> The failure is reproducible for me locally with catalog v2.






[jira] [Resolved] (IMPALA-2649) improve incremental stats scalability

2019-06-14 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-2649.
-
Resolution: Fixed

I'm inclined to call this one fixed after a number of changes last year:
- incremental stats data isn't published as part of the metadata objects to 
impalads, but instead fetched on demand for COMPUTE INCREMENTAL STATS
- incremental stats data are now stored compressed and in byte[] instead of 
base64-encoded Strings
- generally more work went into catalogd memory management, such as automatic 
invalidation when running low on heap, so this won't take down catalogd anymore

> improve incremental stats scalability
> -
>
> Key: IMPALA-2649
> URL: https://issues.apache.org/jira/browse/IMPALA-2649
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.3.0
>Reporter: Silvius Rus
>Priority: Critical
>  Labels: compute-stats
>







[jira] [Resolved] (IMPALA-8514) CatalogException: Error initializing Catalog

2019-06-14 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8514.
-
Resolution: Not A Bug

Please direct questions to the Impala user mailing list

> CatalogException: Error initializing Catalog
> 
>
> Key: IMPALA-8514
> URL: https://issues.apache.org/jira/browse/IMPALA-8514
> Project: IMPALA
>  Issue Type: Question
>  Components: Catalog
>Affects Versions: Impala 3.2.0
> Environment: impala3.2 
> centos7
> hadoop2.6
> hive2.1
>Reporter: Lycan
>Priority: Blocker
>
> When I start catalogd, I get the following problem:
> {code:java}
> E0507 17:29:47.566875 1932 MetaStoreUtils.java:1464] Converting exception to 
> MetaException
> E0507 17:29:47.567422 1932 CatalogServiceCatalog.java:1351] Error 
> initializing Catalog
> Java exception follows:
> MetaException(message:Got exception: 
> org.apache.hadoop.hive.metastore.api.MetaException Failed to load Hive 
> binding null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:1465)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1366)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:150)
> at com.sun.proxy.$Proxy4.getAllDatabases(Unknown Source)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.reset(CatalogServiceCatalog.java:1319)
> at 
> org.apache.impala.service.CatalogOpExecutor.execResetMetadata(CatalogOpExecutor.java:3636)
> at org.apache.impala.service.JniCatalog.resetMetadata(JniCatalog.java:185)
> E0507 17:29:47.567806 1932 catalog-server.cc:122] CatalogException: Error 
> initializing Catalog. Catalog may be empty.
> CAUSED BY: MetaException: Got exception: 
> org.apache.hadoop.hive.metastore.api.MetaException Failed to load Hive 
> binding null
> {code}
> My Hive starts up normally.
> Any solution is appreciated.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-6536) CREATE TABLE on S3 takes a very long time

2019-06-14 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-6536:
---

Assignee: (was: Todd Lipcon)

> CREATE TABLE on S3 takes a very long time
> -
>
> Key: IMPALA-6536
> URL: https://issues.apache.org/jira/browse/IMPALA-6536
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog, Frontend
>Affects Versions: Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 3.0, 
> Impala 2.12.0
>Reporter: Alexander Behm
>Priority: Critical
>  Labels: catalog, perfomance, s3
>
> *Summary*
> Creating a table that points to existing data in S3 can take an excessive 
> amount of time.
> *Reason*
> If the Hive Metastore is configured with "hive.stats.autogather=true" then 
> Hive lists the files of newly created tables to populate basic statistics 
> like file count and file byte sizes. Unfortunately, this listing operation 
> can take an excessive amount of time particularly on S3.
> *Workaround*
> * Reconfigure the Hive Metastore with "hive.stats.autogather=false"
> * Note that TBLPROPERTIES("DO_NOT_UPDATE_STATS"="true") does not address the 
> issue due to a bug in Hive
> Related:
> https://issues.apache.org/jira/browse/HIVE-18743
> *Example*
> {code}
> CREATE EXTERNAL TABLE tpch_lineitem_s3 (
>   l_orderkey BIGINT,
>   l_partkey BIGINT,
>   l_suppkey BIGINT,
>   l_linenumber BIGINT,
>   l_quantity DECIMAL(12,2),
>   l_extendedprice DECIMAL(12,2),
>   l_discount DECIMAL(12,2),
>   l_tax DECIMAL(12,2),
>   l_returnflag STRING,
>   l_linestatus STRING,
>   l_shipdate STRING,
>   l_commitdate STRING,
>   l_receiptdate STRING,
>   l_shipinstruct STRING,
>   l_shipmode STRING,
>   l_comment STRING
> )
> STORED AS PARQUET
> LOCATION "s3a://some_location/my_existing_data"
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8282) Impala Catalog 'Failed to load metadata for table' and 'GC overhead limit exceeded'

2019-06-14 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8282.
-
Resolution: Invalid

Please use the mailing list for questions like this.

> Impala Catalog 'Failed to load metadata for table' and 'GC overhead limit 
> exceeded'
> ---
>
> Key: IMPALA-8282
> URL: https://issues.apache.org/jira/browse/IMPALA-8282
> Project: IMPALA
>  Issue Type: Question
>  Components: Catalog
>Affects Versions: Impala 2.5.0
> Environment: Centos6.9
>Reporter: Ken
>Priority: Blocker
>
> Hi all
>  Our Hive has inner tables test1 and test2; now *we cannot use test2 through 
> Impala at all*. Can you help us locate the real cause, and do you have any 
> suggestions?
>  Details as follows:
> *Cannot execute the 'desc test2', 'refresh test2', 'invalidate metadata 
> test2', or 'select * from test2' commands through impala-shell or a JDBC 
> connection.*
>  *But*
>  *1. we can show table test2 in Hive.*
>  *2. we can use other tables (such as test1) normally through impala-shell, a 
> JDBC connection, or Hive.*
> *exception as follows:*
>  [DEVICE001:21000] > show create table test2;
>  Query: show create table test2
>  ERROR: AnalysisException: java.lang.OutOfMemoryError: GC overhead limit 
> exceeded
>  CAUSED BY: ExecutionException: java.lang.OutOfMemoryError: GC overhead limit 
> exceeded
>  CAUSED BY: OutOfMemoryError: GC overhead limit exceeded
>  CAUSED BY: TableLoadingException: java.lang.OutOfMemoryError: GC overhead 
> limit exceeded
>  CAUSED BY: ExecutionException: java.lang.OutOfMemoryError: GC overhead limit 
> exceeded
>  CAUSED BY: OutOfMemoryError: GC overhead limit exceeded
> [DEVICE001:21000] > select * from test2 limit 1;
>  Query: select * from test2 limit 1
>  ERROR: AnalysisException: Failed to load metadata for table: 'test2'
>  CAUSED BY: TableLoadingException: java.lang.OutOfMemoryError: GC overhead 
> limit exceeded
>  CAUSED BY: ExecutionException: java.lang.OutOfMemoryError: GC overhead limit 
> exceeded
>  CAUSED BY: OutOfMemoryError: GC overhead limit exceeded
> *top:*
>  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 
>  9522 impala 20 0 34.1g 21g 29m S 8.0 8.6 1355:06 catalogd 
>  9449 impala 20 0 1793m 499m 10m S 0.3 0.2 2:19.99 statestored
> *free -g*
>  total used free shared buffers cached
>  Mem: 251 107 144 0 0 27
> *Analyzer.java* 
>  I find the exception was thrown in 
> */Impala-cdh5-2.5.0_5.7.0/fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java*
>  public Table getTable(String dbName, String tableName)
>      throws AnalysisException, TableLoadingException {
>    Table table = null;
>    try {
>      table = getCatalog().getTable(dbName, tableName);
>    } catch (DatabaseNotFoundException e) {
>      throw new AnalysisException(DB_DOES_NOT_EXIST_ERROR_MSG + dbName);
>    } catch (CatalogException e) {
>      String errMsg = String.format("Failed to load metadata for table: %s",
>          tableName);
>      // We don't want to log all AnalysisExceptions as ERROR, only failures
>      // due to TableLoadingExceptions.
>      LOG.error(String.format("%s\n%s", errMsg, e.getMessage()));
>      if (e instanceof TableLoadingException) throw (TableLoadingException) e;
>      throw new TableLoadingException(errMsg, e);
>    }
>    if (table == null) {
>      throw new AnalysisException(
>          TBL_DOES_NOT_EXIST_ERROR_MSG + dbName + "." + tableName);
>    }
>    if (!table.isLoaded()) {
>      missingTbls_.add(new TableName(table.getDb().getName(), table.getName()));
>      throw new AnalysisException(
>          "Table/view is missing metadata: " + table.getFullName());
>    }
>    return table;
>  }
> *Now I tried setting*
>  1. 'export JAVA_TOOL_OPTIONS=" -Xmx40g"',
>  2. 'IMPALA_CATALOG_ARGS=" -log_dir=${IMPALA_LOG_DIR} -mem_limit=-1b"'
>  3. 'IMPALA_SERVER_ARGS=" -mem_limit=-1b"'
>  but I still get 'java.lang.OutOfMemoryError'.
> *Can you give me some suggestions?*
> *Thanks & Best Regards.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8630) Consistent remote placement should include partition information when calculating placement

2019-06-14 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16864358#comment-16864358
 ] 

Todd Lipcon commented on IMPALA-8630:
-

bq. It would be nice to avoid reconstructing the path and hashing it on every 
query
bq. ...not just the partition path but also the filename, it would reduce the 
cost of hashing in the scheduler

Do you have any data to suggest that the reconstruction and hashing would be a 
bottleneck? Assuming a very high upper bound of a million files and 100 bytes 
each, the cost here is hashing 100MB of data, which should take a few tens of 
milliseconds. For typical queries on tens of thousands of files I can't imagine 
this showing up at all relative to other costs.

bq. We could hash it there and put it in FbFileDesc

That would have a persistent memory cost on the catalogd, which seems like 
something we should avoid.
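
To put a rough number behind the estimate above, a throwaway benchmark (plain 
Java rather than the C++ scheduler code; the synthetic paths are made up) that 
hashes a million ~100-byte paths finishes in tens of milliseconds on a typical 
machine:

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Throwaway benchmark: hash 1M synthetic ~100-byte partition/file paths and
// report the elapsed time, to show that per-query re-hashing is cheap even at
// a very high file count.
public class HashCostEstimate {
  public static void main(String[] args) {
    final int numFiles = 1_000_000;
    byte[][] paths = new byte[numFiles][];
    for (int i = 0; i < numFiles; i++) {
      String p = String.format(
          "user/hive/warehouse/db.db/tbl/part_col=%07d/base_000000_file_%07d",
          i, i);
      paths[i] = p.getBytes(StandardCharsets.UTF_8);
    }
    long start = System.nanoTime();
    long sink = 0;
    CRC32 crc = new CRC32();
    for (byte[] p : paths) {
      crc.reset();
      crc.update(p);
      sink += crc.getValue();  // keep the result live so the loop isn't elided
    }
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println("hashed " + numFiles + " paths in " + elapsedMs
        + " ms (checksum sum: " + sink + ")");
  }
}
{code}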



> Consistent remote placement should include partition information when 
> calculating placement
> ---
>
> Key: IMPALA-8630
> URL: https://issues.apache.org/jira/browse/IMPALA-8630
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Blocker
>
> For partitioned tables, the actual filenames within partitions may not have 
> large entropy. Impala includes information in its filenames that would not be 
> the same across partitions, but this is common for tables written by the 
> current CDH version of Hive. For example, in our minicluster, the TPC-DS 
> store_sales table has many partitions, but the actual filenames within 
> partitions are very simple:
> {noformat}
> hdfs dfs -ls /test-warehouse/tpcds.store_sales/ss_sold_date_sk=2452642
> Found 1 items
> -rwxr-xr-x 3 joe supergroup 379535 2019-06-05 15:16 
> /test-warehouse/tpcds.store_sales/ss_sold_date_sk=2452642/00_0
> hdfs dfs -ls /test-warehouse/tpcds.store_sales/ss_sold_date_sk=2452640
> Found 1 items
> -rwxr-xr-x 3 joe supergroup 412959 2019-06-05 15:16 
> /test-warehouse/tpcds.store_sales/ss_sold_date_sk=2452640/00_0{noformat}
> Right now, consistent remote placement uses the filename+offset without the 
> partition id.
> {code:java}
> uint32_t hash = HashUtil::Hash(hdfs_file_split->relative_path.data(),
>   hdfs_file_split->relative_path.length(), 0);
> {code}
> This would produce a poor balance of files across nodes when there is low 
> entropy in filenames. This should be amended to include the partition id, 
> which is already accessible on the THdfsFileSplit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8667) Remove --pull_incremental_statistics flag (on by default)

2019-06-14 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8667:
---

 Summary: Remove --pull_incremental_statistics flag (on by default)
 Key: IMPALA-8667
 URL: https://issues.apache.org/jira/browse/IMPALA-8667
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog
Reporter: Todd Lipcon


This flag was introduced as a "chicken bit" so we could disable it if something 
goes wrong. So far we haven't seen any issues. Let's remove the flag and the 
old code paths so we don't need to maintain them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8661) Create randomized tests for stressing the event processor

2019-06-13 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16863411#comment-16863411
 ] 

Todd Lipcon commented on IMPALA-8661:
-

I think I would lean towards actually issuing Hive queries to stress these 
paths end-to-end. Despite being slower than creating our own batches manually, 
this will help us catch any breakages on the Hive side, and also give us a 
variety of different interleavings between the notifications arriving and the 
actual underlying HMS changes happening.
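
A rough sketch of what such a loop could look like (the JDBC URL, table names, 
and the validation step are placeholders, not existing test code):

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.Random;

// Hypothetical stress driver: issue randomized Hive DDL over JDBC so the HMS
// generates real notification events, then validate catalog state afterwards.
public class EventProcessorStress {
  public static void main(String[] args) throws Exception {
    String hiveUrl = "jdbc:hive2://localhost:11050/default";  // placeholder
    Random rand = new Random(42);
    try (Connection conn = DriverManager.getConnection(hiveUrl);
         Statement stmt = conn.createStatement()) {
      for (int t = 0; t < 10; t++) {
        stmt.execute("CREATE TABLE IF NOT EXISTS stress_t" + t
            + " (i INT) PARTITIONED BY (p INT)");
      }
      for (int i = 0; i < 1000; i++) {
        String tbl = "stress_t" + rand.nextInt(10);
        int p = rand.nextInt(100);
        switch (rand.nextInt(3)) {
          case 0:
            stmt.execute("ALTER TABLE " + tbl
                + " ADD IF NOT EXISTS PARTITION (p=" + p + ")");
            break;
          case 1:
            stmt.execute("ALTER TABLE " + tbl
                + " DROP IF EXISTS PARTITION (p=" + p + ")");
            break;
          default:
            stmt.execute("ALTER TABLE " + tbl
                + " SET TBLPROPERTIES ('stress_iter'='" + i + "')");
        }
      }
    }
    // Afterwards, wait for the event processor to catch up and compare each
    // table's partitions/properties between Impala and the HMS (omitted).
  }
}
{code}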

> Create randomized tests for stressing the event processor
> -
>
> Key: IMPALA-8661
> URL: https://issues.apache.org/jira/browse/IMPALA-8661
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> We should create pseudo-randomized batches of events to stress event 
> processor so that we can flush out any bugs. The tests could be a junit test 
> which generates a random sized batch with the supported event types. Once the 
> random batch of events are processed, we should validate if the table matches 
> with what is present in HMS



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8653) Detect Kudu integration status based on HMS UUID instead of URIs

2019-06-11 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8653:
---

 Summary: Detect Kudu integration status based on HMS UUID instead 
of URIs
 Key: IMPALA-8653
 URL: https://issues.apache.org/jira/browse/IMPALA-8653
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog
Affects Versions: Impala 3.3.0
Reporter: Todd Lipcon


KUDU-2841 added an API to the Kudu master to determine the HMS UUID that it is 
configured against. Instead of comparing the configured URIs, Impala should 
also fetch the HMS UUID and compare the UUID to the one returned by Kudu to 
ensure that the two systems are talking to the same metastore.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8635) Kudu HMS integration check should not fetch metastore URIs configuration from metastore

2019-06-10 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8635.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Kudu HMS integration check should not fetch metastore URIs configuration from 
> metastore
> ---
>
> Key: IMPALA-8635
> URL: https://issues.apache.org/jira/browse/IMPALA-8635
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: Impala 3.3.0
>
>
> The patch for IMPALA-8504 (part 2) (6bb404dc359) checks to see if Impala and 
> Kudu are configured against the same metastore to determine if the HMS 
> integration is enabled. However, instead of using its own metastore URI 
> config, it uses the configuration stored on the remote HMS. This is error 
> prone because it's not common for the HMS configuration to store its own URI. 
> Instead, we should use our own config, and our fallback should conservatively 
> assume the legacy behavior.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-8459) Cannot delete impala/kudu table if backing kudu table dropped with local catalog

2019-06-07 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8459 started by Todd Lipcon.
---
> Cannot delete impala/kudu table if backing kudu table dropped with local 
> catalog
> 
>
> Key: IMPALA-8459
> URL: https://issues.apache.org/jira/browse/IMPALA-8459
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: kudu
>
> test_delete_external_kudu_table and test_delete_managed_kudu_table fail with 
> local catalog, e.g. with:
> {noformat}
> E   HiveServer2Error: LocalCatalogException: Error opening Kudu table 
> 'testimpalakuduintegration_1715_p3r46w.ogslbjblgv', Kudu error: the table 
> does not exist: table_name: "testimpalakuduintegration_1715_p3r46w.ogslbjblgv"
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8606) GET_TABLES performance in local catalog mode

2019-06-07 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858405#comment-16858405
 ] 

Todd Lipcon commented on IMPALA-8606:
-

Seems we need to add some interface like FeCatalog.getTableIfCached() which 
returns the table object if it's already resident, or otherwise avoids doing 
any round trips.

That said, the user-visible behavior here ends up a bit goofy -- stuff like 
comments is silently missing for unloaded tables. On catalog v1 that was 
already the case, but given that v1 was much more eager about caching, it was 
less likely to be visible. For v2, people are more likely to notice. Any ideas?
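
Something along these lines, perhaps (a sketch only; getTableIfCached() does 
not exist today and the exact signature is a guess):

{code:java}
// Hypothetical addition to FeCatalog (sketch, not real code): return a table
// only if its metadata is already resident, so callers such as the GET_TABLES
// path never trigger a full table load. The return type would be FeTable in
// the real interface; Object keeps this snippet self-contained.
public interface FeCatalogSketch {
  /**
   * Returns the table if it is already loaded in this catalog instance, or
   * null otherwise. Must not perform any remote round trips; callers fall
   * back to name-only results for tables that are not yet loaded.
   */
  Object getTableIfCached(String dbName, String tblName);
}
{code}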

> GET_TABLES performance in local catalog mode
> 
>
> Key: IMPALA-8606
> URL: https://issues.apache.org/jira/browse/IMPALA-8606
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Balazs Jeszenszky
>Assignee: Todd Lipcon
>Priority: Critical
>
> With local catalog mode enabled, GET_TABLES JDBC requests will return more 
> than the always available table information. Any request for more metadata 
> about a table will trigger a full load of that table on the catalogd side, 
> meaning that GET_TABLES triggers the load of the entire catalog. Also, as far 
> as I can see, the requests for more metadata are made one table at a time. 
> Once the tables are loaded, the coordinator needs 3 roundtrips to the catalog 
> to fetch all the details about a single table. My test case had around 57k 
> tables, 1700 DBs, and ~120k partitions. 
> GET_TABLES on a cold catalog takes 18 minutes. With a warm catalog, but cold 
> impalad, it still takes ~70 seconds.
> Many tools use GET_TABLES to populate dropdowns, etc. so this is bad for both 
> end user experience and catalog memory usage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-8606) GET_TABLES performance in local catalog mode

2019-06-07 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-8606:
---

Assignee: Todd Lipcon

> GET_TABLES performance in local catalog mode
> 
>
> Key: IMPALA-8606
> URL: https://issues.apache.org/jira/browse/IMPALA-8606
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Balazs Jeszenszky
>Assignee: Todd Lipcon
>Priority: Critical
>
> With local catalog mode enabled, GET_TABLES JDBC requests will return more 
> than the always available table information. Any request for more metadata 
> about a table will trigger a full load of that table on the catalogd side, 
> meaning that GET_TABLES triggers the load of the entire catalog. Also, as far 
> as I can see, the requests for more metadata are made one table at a time. 
> Once the tables are loaded, the coordinator needs 3 roundtrips to the catalog 
> to fetch all the details about a single table. My test case had around 57k 
> tables, 1700 DBs, and ~120k partitions. 
> GET_TABLES on a cold catalog takes 18 minutes. With a warm catalog, but cold 
> impalad, it still takes ~70 seconds.
> Many tools use GET_TABLES to populate dropdowns, etc. so this is bad for both 
> end user experience and catalog memory usage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8635) Kudu HMS integration check should not fetch metastore URIs configuration from metastore

2019-06-07 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8635:
---

 Summary: Kudu HMS integration check should not fetch metastore 
URIs configuration from metastore
 Key: IMPALA-8635
 URL: https://issues.apache.org/jira/browse/IMPALA-8635
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.3.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon


The patch for IMPALA-8504 (part 2) (6bb404dc359) checks to see if Impala and 
Kudu are configured against the same metastore to determine if the HMS 
integration is enabled. However, instead of using its own metastore URI config, 
it uses the configuration stored on the remote HMS. This is error prone because 
it's not common for the HMS configuration to store its own URI. Instead, we 
should use our own config, and our fallback should conservatively assume the 
legacy behavior.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8459) Cannot delete impala/kudu table if backing kudu table dropped with local catalog

2019-05-31 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853445#comment-16853445
 ] 

Todd Lipcon commented on IMPALA-8459:
-

One possibility for a workaround: could you alter the storage engine of the 
table to be non-Kudu before dropping? We may prevent ALTER from changing the 
storage handler, but if not, it's a thought.

Another idea (though not a very nice one) is to use the Kudu client to recreate 
the table (with an arbitrary schema and no contents) and then drop it via 
Impala; e.g. 'kudu perf loadgen -table_name=impala::foo.bar --keep_auto_table' 
is an easy way to create a small table from the CLI.



> Cannot delete impala/kudu table if backing kudu table dropped with local 
> catalog
> 
>
> Key: IMPALA-8459
> URL: https://issues.apache.org/jira/browse/IMPALA-8459
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: kudu
>
> test_delete_external_kudu_table and test_delete_managed_kudu_table fail with 
> local catalog, e.g. with:
> {noformat}
> E   HiveServer2Error: LocalCatalogException: Error opening Kudu table 
> 'testimpalakuduintegration_1715_p3r46w.ogslbjblgv', Kudu error: the table 
> does not exist: table_name: "testimpalakuduintegration_1715_p3r46w.ogslbjblgv"
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8489) TestRecoverPartitions.test_post_invalidate fails with IllegalStateException when HMS polling is enabled

2019-05-29 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-8489:

Summary: TestRecoverPartitions.test_post_invalidate fails with 
IllegalStateException when HMS polling is enabled  (was: 
TestRecoverPartitions.test_post_invalidate fails with IllegalStateException 
with local catalog)

I can reproduce this by just enabling polling (and not LocalCatalog). Updated 
the title appropriately.

The issue seems to be in CatalogOpExecutor.updateCatalog handling of partitions 
that were touched by an insert. It comes up with a list of partition IDs that 
were modified by the insert, then calls loadTableMetadata() which refreshes 
those partitions. Because the partition was added by ALTER TABLE RECOVER 
PARTITIONS, it got marked as "dirty" which means that the refresh ends up 
dropping and reloading it with a new partition ID. Then, createInsertEvents 
looks for the partitions by ID, but they've since been assigned new IDs, so 
they aren't found.

I'm digging into this a bit more to see why this affects this code path but not 
others that also use the "dirty partition" hack.
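
A toy illustration of the sequence described above (not Impala code; it just 
shows why the lookup by the originally-captured partition ID comes back empty):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Toy model: the insert path captures a partition ID, a "dirty" reload drops
// and re-adds the partition under a fresh ID, and the later lookup by the
// captured ID finds nothing.
public class StalePartitionIdDemo {
  static final AtomicLong NEXT_ID = new AtomicLong();
  static final Map<Long, String> PARTS_BY_ID = new HashMap<>();

  public static void main(String[] args) {
    long id = addPartition("p=1");   // added by ALTER TABLE RECOVER PARTITIONS
    long touchedId = id;             // INSERT records the IDs it touched
    // loadTableMetadata() sees the partition marked dirty, so it drops it and
    // reloads it under a new ID.
    PARTS_BY_ID.remove(id);
    addPartition("p=1");
    // createInsertEvents() then looks up the stale ID and fails.
    System.out.println("lookup by stale id " + touchedId + " -> "
        + PARTS_BY_ID.get(touchedId));  // prints null
  }

  static long addPartition(String name) {
    long newId = NEXT_ID.incrementAndGet();
    PARTS_BY_ID.put(newId, name);
    return newId;
  }
}
{code}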

> TestRecoverPartitions.test_post_invalidate fails with IllegalStateException 
> when HMS polling is enabled
> ---
>
> Key: IMPALA-8489
> URL: https://issues.apache.org/jira/browse/IMPALA-8489
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>
> {noformat}
> metadata/test_recover_partitions.py:279: in test_post_invalidate
> "INSERT INTO TABLE %s PARTITION(i=002, p='p2') VALUES(4)" % FQ_TBL_NAME)
> common/impala_test_suite.py:620: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:628: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:722: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:180: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:364: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:385: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EQuery aborted:IllegalArgumentException: no such partition id 6244
> {noformat}
> The failure is reproducible for me locally with catalog v2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8489) TestRecoverPartitions.test_post_invalidate fails with IllegalStateException with local catalog

2019-05-29 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851376#comment-16851376
 ] 

Todd Lipcon commented on IMPALA-8489:
-

Ah, it seems this is due to --hms_event_polling_interval_s=1 rather than local 
catalog (I can repro with polling enabled, but if I turn off polling and keep 
LocalCatalog, it passes). Taking a look.

> TestRecoverPartitions.test_post_invalidate fails with IllegalStateException 
> with local catalog
> --
>
> Key: IMPALA-8489
> URL: https://issues.apache.org/jira/browse/IMPALA-8489
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>
> {noformat}
> metadata/test_recover_partitions.py:279: in test_post_invalidate
> "INSERT INTO TABLE %s PARTITION(i=002, p='p2') VALUES(4)" % FQ_TBL_NAME)
> common/impala_test_suite.py:620: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:628: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:722: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:180: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:364: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:385: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EQuery aborted:IllegalArgumentException: no such partition id 6244
> {noformat}
> The failure is reproducible for me locally with catalog v2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8594) Support drop table for external kudu tables that are dropped in kudu

2019-05-29 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851017#comment-16851017
 ] 

Todd Lipcon commented on IMPALA-8594:
-

Did you hit this with LocalCatalog enabled or without? IMPALA-8459 is a known 
issue with LocalCatalog that's on my todo list to address, but I didn't think 
this was an issue for catalog V1.

> Support drop table for external kudu tables that are dropped in kudu
> 
>
> Key: IMPALA-8594
> URL: https://issues.apache.org/jira/browse/IMPALA-8594
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.1.0
>Reporter: Manish Maheshwari
>Priority: Critical
>
> External Kudu tables in Impala cannot be dropped from HMS if the Kudu table 
> is already dropped in Kudu. This causes HMS to be out of sync with Kudu 
> metadata.
> Impala should clean up the HMS table info when a drop is executed for an 
> external table that does not exist in Kudu.
>  
> cc - [~balazsj_impala_220b] [~tlipcon]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8458) Can't set numNull/maxSize/avgSize column stats with local catalog without also setting NDV

2019-05-22 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8458.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Can't set numNull/maxSize/avgSize column stats with local catalog without 
> also setting NDV
> --
>
> Key: IMPALA-8458
> URL: https://issues.apache.org/jira/browse/IMPALA-8458
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: Impala 3.3.0
>
>
> Repro:
> {noformat}
> [tarmstrong-box2.ca.cloudera.com:21000] default> create table test_stats2(s 
> string);
> +-+
> | summary |
> +-+
> | Table has been created. |
> +-+
> Fetched 1 row(s) in 0.36s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | -1   | -1 | -1   | -1   |
> +++--++--+--+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set 
> column stats s('avgSize'='1234');
> +-+
> | summary |
> +-+
> | Updated 0 partition(s) and 1 column(s). |
> +-+
> Fetched 1 row(s) in 0.14s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | -1   | -1 | -1   | -1   |
> +++--++--+--+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set 
> column stats s('maxSize'='1234');
> +-+
> | summary |
> +-+
> | Updated 0 partition(s) and 1 column(s). |
> +-+
> Fetched 1 row(s) in 0.10s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | -1   | -1 | -1   | -1   |
> +++--++--+--+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> invalidate metadata 
> test_stats2;
> Fetched 0 row(s) in 0.03s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> Query: show column stats test_stats2
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | -1   | -1 | -1   | -1   |
> +++--++--+--+
> Fetched 1 row(s) in 0.07s
> {noformat}
> I expected that the updates would take effect. Weirdly it doesn't happen for 
> NDV and NULLS:
> {noformat}
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set 
> column stats s('numDVs'='1234','numNulls'='12345');
> Query: alter table test_stats2 set column stats 
> s('numDVs'='1234','numNulls'='12345')
> +-+
> | summary |
> +-+
> | Updated 0 partition(s) and 1 column(s). |
> +-+
> Fetched 1 row(s) in 0.12s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> Query: show column stats test_stats2
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | 1234 | 12345  | -1   | -1   |
> +++--++--+--+
> 

[jira] [Created] (IMPALA-8569) Periodically scrub deleted files from the file handle cache

2019-05-21 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8569:
---

 Summary: Periodically scrub deleted files from the file handle 
cache
 Key: IMPALA-8569
 URL: https://issues.apache.org/jira/browse/IMPALA-8569
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Todd Lipcon


Currently, if you query a file and then later delete that file (e.g. drop the 
partition or table), the file will still stay in the impalad's file handle 
cache. Because the file is open, the space can't be reclaimed on disk until the 
impalad restarts or churns through its cache enough to drop the handle.

Typically this isn't a big deal in practice, since most files don't get deleted 
shortly after being read, and the FH cache should cycle through after 6 hours 
by default. Additionally, fixing it would be a bit of a pain, since we'd need 
to add HDFS and libhdfs hooks to get HDFS to tell us whether the underlying 
short-circuit FD is unlinked, which probably also means adding JNI code to let 
Java call fstat() in order to check st_nlink. Given that, I'm not sure it's worth 
fixing, or if we should just consider a shorter default expiry on the FH cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8544) Expose additional S3A / S3Guard metrics

2019-05-21 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845479#comment-16845479
 ] 

Todd Lipcon commented on IMPALA-8544:
-

I think the problem with using the global metrics is that they are only pushed 
up to the global counters when a stream is closed. Impala now keeps open file 
handles to S3. Perhaps we can make unbuffer() also push the counters to the 
global store?

bq. the fields are all non-atomic, non-volatile values so that the cost of 
incrementing them is ~0. If things are being collected, that may change.

I don't think that's necessarily the case. Given they are counters, I'm sure 
we're OK with slight raciness on read, so we could always use Unsafe or 
VarHandle to do barrier-less reads and writes.
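
For example, something like the following keeps the counter update fence-free 
while letting a metrics collector read an untorn (if slightly stale) value; 
just a sketch, not a proposal for S3A's actual field layout:

{code:java}
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Sketch: opaque VarHandle accesses give per-access atomicity (no torn longs)
// without any ordering barriers, so the hot path stays cheap and readers
// tolerate slightly stale values.
public class RacyCounter {
  private long count;  // deliberately non-volatile

  private static final VarHandle COUNT;
  static {
    try {
      COUNT = MethodHandles.lookup()
          .findVarHandle(RacyCounter.class, "count", long.class);
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  // Single-writer increment: opaque read-modify-write, no fences.
  public void increment() {
    COUNT.setOpaque(this, (long) COUNT.getOpaque(this) + 1);
  }

  // Collector-side read: may lag the writer, but never returns a torn value.
  public long readApprox() {
    return (long) COUNT.getOpaque(this);
  }
}
{code}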

> Expose additional S3A / S3Guard metrics
> ---
>
> Key: IMPALA-8544
> URL: https://issues.apache.org/jira/browse/IMPALA-8544
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>  Labels: s3
>
> S3A / S3Guard internally collects several useful metrics that we should 
> consider exposing to Impala users. The full list of statistics can be found 
> in {{o.a.h.fs.s3a.Statistic}}. The stats include: the number of S3 operations 
> performed (put, get, etc.), invocation counts for various {{FileSystem}} 
> methods, stream statistics (bytes read, written, etc.), etc.
> Some interesting stats that stand out:
>  * "stream_aborted": "Count of times the TCP stream was aborted" - the number 
> of TCP connection aborts, a high value would indicate performance issues
>  * "stream_read_exceptions" : "Number of exceptions invoked on input streams" 
> - incremented whenever an {{IOException}} is caught while reading (these 
> exceptions don't always get propagated to Impala because they trigger a retry)
>  * "store_io_throttled": "Requests throttled and retried" - looks like it 
> tracks the number of times the fs retries an operation because the original 
> request hit a throttling exception
>  * "s3guard_metadatastore_retry": "S3Guard metadata store retry events" - 
> looks like it tracks the number of times the fs retries S3Guard operations
>  * "s3guard_metadatastore_throttled" : "S3Guard metadata store throttled 
> events" - similar to "store_io_throttled" but looks like it is specific to 
> S3Guard
> We should consider how to expose these metrics via Impala logs / runtime 
> profiles.
> There are a few options:
>  * {{S3AFileSystem}} exposes {{StorageStatistics}} specific to S3A / S3Guard 
> via the {{FileSystem#getStorageStatistics}} method; the 
> {{S3AStorageStatistics}} seems to include all the S3A / S3Guard metrics, 
> however, I think the stats might be aggregated globally, which would make it 
> hard to create per-query specific metrics
>  * {{S3AInstrumentation}} exposes all the metrics as well, and looks like it 
> is per-fs instance, so it is not aggregated globally; {{S3AInstrumentation}} 
> extends {{o.a.h.metrics2.MetricsSource}} so perhaps it is exposed via some 
> API (haven't looked into this yet)
>  * {{S3AInputStream#toString}} dumps the statistics from 
> {{o.a.h.fs.s3a.S3AInstrumentation.InputStreamStatistics}} and 
> {{S3AFileSystem#toString}} dumps them all as well
>  * {{S3AFileSystem}} updates the stats in 
> {{o.a.h.fs.Statistics.StatisticsData}} as well (e.g. bytesRead, bytesWritten, 
> etc.)
> Impala has a {{hdfs-fs-cache}} as well, so {{hdfsFs}} objects get shared 
> across threads.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8566) COMPUTE INCREMENTAL STATS sets num_nulls off-by-one

2019-05-21 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8566.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> COMPUTE INCREMENTAL STATS sets num_nulls off-by-one
> ---
>
> Key: IMPALA-8566
> URL: https://issues.apache.org/jira/browse/IMPALA-8566
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> IMPALA-7659 added the population of NULL counts while computing stats, but 
> this functionality isn't working properly for incremental stats. The query is 
> produced correctly, but the null count set in the table is one lower than it 
> should be. In the case that the table has no nulls, this ends up setting a 
> '-1' count, which is interpreted as 'unknown'. In the case that there are 
> nulls, we'll just be a little inaccurate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-6876) Entries in CatalogUsageMonitor are not cleared after invalidation

2019-05-21 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-6876:

Labels: memory-leak ramp-up  (was: memory-leak)

> Entries in CatalogUsageMonitor are not cleared after invalidation
> -
>
> Key: IMPALA-6876
> URL: https://issues.apache.org/jira/browse/IMPALA-6876
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Dimitris Tsirogiannis
>Priority: Major
>  Labels: memory-leak, ramp-up
>
> The CatalogUsageMonitor in the catalog maintains a small cache of references 
> to tables that: a) are accessed frequently in the catalog and b) have the 
> highest memory requirements. These entries are not cleared upon server or 
> table invalidation, thus preventing the GC from collecting the memory of 
> these tables. We should make sure that the CatalogUsageMonitor does not 
> maintain entries of tables that have been invalidated or deleted. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-6876) Entries in CatalogUsageMonitor are not cleared after invalidation

2019-05-21 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844538#comment-16844538
 ] 

Todd Lipcon commented on IMPALA-6876:
-

Seems like the simplest fix here would be to use weak references in this data 
structure.
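
i.e. roughly this shape (a sketch; the real CatalogUsageMonitor tracks more 
than a simple name-to-table map):

{code:java}
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: hold tables via WeakReference so an invalidated or deleted Table can
// still be garbage collected even while it sits in the usage-monitor cache.
public class WeakUsageCache<T> {
  private final Map<String, WeakReference<T>> cache = new ConcurrentHashMap<>();

  public void record(String tableName, T table) {
    cache.put(tableName, new WeakReference<>(table));
  }

  /** Returns the cached entry, or null if it was collected or never recorded. */
  public T get(String tableName) {
    WeakReference<T> ref = cache.get(tableName);
    return ref == null ? null : ref.get();
  }
}
{code}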

> Entries in CatalogUsageMonitor are not cleared after invalidation
> -
>
> Key: IMPALA-6876
> URL: https://issues.apache.org/jira/browse/IMPALA-6876
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Dimitris Tsirogiannis
>Priority: Major
>  Labels: memory-leak, ramp-up
>
> The CatalogUsageMonitor in the catalog maintains a small cache of references 
> to tables that: a) are accessed frequently in the catalog and b) have the 
> highest memory requirements. These entries are not cleared upon server or 
> table invalidation, thus preventing the GC from collecting the memory of 
> these tables. We should make sure that the CatalogUsageMonitor does not 
> maintain entries of tables that have been invalidated or deleted. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8383) Bump toolchain version

2019-05-21 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844536#comment-16844536
 ] 

Todd Lipcon commented on IMPALA-8383:
-

[~hacosta] is this still an issue? Seems it's been bumped several times since 
this was filed.

> Bump toolchain version
> --
>
> Key: IMPALA-8383
> URL: https://issues.apache.org/jira/browse/IMPALA-8383
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Hector Acosta
>Priority: Major
>
> The current $IMPALA_TOOLCHAIN_BUILD_ID has a bug where the fastbinary shared 
> object is missing for some distributions. We should bump the version to an id 
> that includes fastbinary.so.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8529) ccache is ignored when using ninja generator

2019-05-21 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8529.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

Seems this works now after we upgraded to CMake 3.14, even without changes to 
the CMake files (at least when I run ninja -v, I see ccache in use).

> ccache is ignored when using ninja generator
> 
>
> Key: IMPALA-8529
> URL: https://issues.apache.org/jira/browse/IMPALA-8529
> Project: IMPALA
>  Issue Type: Task
>Reporter: Todd Lipcon
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> The CMakeLists.txt sets up ccache by using RULE_LAUNCH_PREFIX, which is only 
> respected by the Makefile generator. So, if we use ninja (which is generally 
> better at parallelism) then ccache won't kick in. Newer versions of cmake 
> have more explicit support for ccache that ought to also work with the ninja 
> generator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-7799) Store periodic snapshots of Impalad metrics

2019-05-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844526#comment-16844526
 ] 

Todd Lipcon commented on IMPALA-7799:
-

Before reinventing this wheel, please take a look at diagnostics_log.cc in 
Kudu. We do exactly this there.

> Store periodic snapshots of Impalad metrics
> ---
>
> Key: IMPALA-7799
> URL: https://issues.apache.org/jira/browse/IMPALA-7799
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0
>Reporter: Michael Ho
>Priority: Critical
>
> Currently, each Impala demon exposes a set of metrics exposed via the debug 
> webpage metrics page. While this may be very helpful for development, there 
> are many incidents in which one may want to record these metrics to do 
> postmortem analysis for various issues.
> We should consider taking periodic snapshots of these Impalad metrics and 
> archive them for a certain retention period to allow for postmortem analysis. 
> We need to be mindful of the space usage concern (e.g. using compressed json 
> ?)
> To enable easier analysis, one may need to build a tool (or use some existing 
> off-the-shelf libraries) to show the collected snapshots as time series and 
> calculate various statistics (e.g. mean, median, min, max, etc.).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8544) Expose additional S3A / S3Guard metrics

2019-05-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844525#comment-16844525
 ] 

Todd Lipcon commented on IMPALA-8544:
-

[~mackrorysd] is there any way we could enhance these stats to be per-stream or 
thread-local (even if we had to downcast to S3AInputStream or something)? It 
seems like this would be vastly preferable to FS-wide or global counters.

Looking briefly at S3AInputStream, there is already this method:
{code:java}
  /**
   * Access the input stream statistics.
   * This is for internal testing and may be removed without warning.
   * @return the statistics for this input stream
   */
  @InterfaceAudience.Private
  @InterfaceStability.Unstable
  public S3AInstrumentation.InputStreamStatistics getS3AStreamStatistics() {
return streamStatistics;
  }
{code}

which seems to be pretty much what we'd need. We'd just want to change that 
comment and audience annotation to be a little less private :)
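
For what it's worth, consuming it would look roughly like this (assuming the 
annotation were relaxed; the bucket and path below are placeholders):

{code:java}
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.s3a.S3AInputStream;
import org.apache.hadoop.fs.s3a.S3AInstrumentation;

// Sketch of a caller pulling per-stream statistics by unwrapping the
// FSDataInputStream; the unwrap/instanceof dance is the fragile part that a
// less-private API would clean up.
public class S3AStreamStatsExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs =
        FileSystem.get(new URI("s3a://some-bucket/"), new Configuration());
    try (FSDataInputStream in = fs.open(new Path("s3a://some-bucket/some/file"))) {
      byte[] buf = new byte[64 * 1024];
      while (in.read(buf) > 0) { /* consume the stream */ }
      InputStream wrapped = in.getWrappedStream();
      if (wrapped instanceof S3AInputStream) {
        S3AInstrumentation.InputStreamStatistics stats =
            ((S3AInputStream) wrapped).getS3AStreamStatistics();
        System.out.println(stats);  // toString() dumps the per-stream counters
      }
    }
  }
}
{code}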

> Expose additional S3A / S3Guard metrics
> ---
>
> Key: IMPALA-8544
> URL: https://issues.apache.org/jira/browse/IMPALA-8544
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>  Labels: s3
>
> S3A / S3Guard internally collects several useful metrics that we should 
> consider exposing to Impala users. The full list of statistics can be found 
> in {{o.a.h.fs.s3a.Statistic}}. The stats include: the number of S3 operations 
> performed (put, get, etc.), invocation counts for various {{FileSystem}} 
> methods, stream statistics (bytes read, written, etc.), etc.
> Some interesting stats that stand out:
>  * "stream_aborted": "Count of times the TCP stream was aborted" - the number 
> of TCP connection aborts, a high value would indicate performance issues
>  * "stream_read_exceptions" : "Number of exceptions invoked on input streams" 
> - incremented whenever an {{IOException}} is caught while reading (these 
> exceptions don't always get propagated to Impala because they trigger a retry)
>  * "store_io_throttled": "Requests throttled and retried" - looks like it 
> tracks the number of times the fs retries an operation because the original 
> request hit a throttling exception
>  * "s3guard_metadatastore_retry": "S3Guard metadata store retry events" - 
> looks like it tracks the number of times the fs retries S3Guard operations
>  * "s3guard_metadatastore_throttled" : "S3Guard metadata store throttled 
> events" - similar to "store_io_throttled" but looks like it is specific to 
> S3Guard
> We should consider how to expose these metrics via Impala logs / runtime 
> profiles.
> There are a few options:
>  * {{S3AFileSystem}} exposes {{StorageStatistics}} specific to S3A / S3Guard 
> via the {{FileSystem#getStorageStatistics}} method; the 
> {{S3AStorageStatistics}} seems to include all the S3A / S3Guard metrics, 
> however, I think the stats might be aggregated globally, which would make it 
> hard to create per-query specific metrics
>  * {{S3AInstrumentation}} exposes all the metrics as well, and looks like it 
> is per-fs instance, so it is not aggregated globally; {{S3AInstrumentation}} 
> extends {{o.a.h.metrics2.MetricsSource}} so perhaps it is exposed via some 
> API (haven't looked into this yet)
>  * {{S3AInputStream#toString}} dumps the statistics from 
> {{o.a.h.fs.s3a.S3AInstrumentation.InputStreamStatistics}} and 
> {{S3AFileSystem#toString}} dumps them all as well
>  * {{S3AFileSystem}} updates the stats in 
> {{o.a.h.fs.Statistics.StatisticsData}} as well (e.g. bytesRead, bytesWritten, 
> etc.)
> Impala has a {{hdfs-fs-cache}} as well, so {{hdfsFs}} objects get shared 
> across threads.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8566) COMPUTE INCREMENTAL STATS sets num_nulls off-by-one

2019-05-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844274#comment-16844274
 ] 

Todd Lipcon commented on IMPALA-8566:
-

The issue is the initialization of PerColumnStats here:
{code:java}
PerColumnStats()
: intermediate_ndv(AggregateFunctions::HLL_LEN, 0), num_nulls(-1),
max_width(0), num_rows(0), avg_width(0) { }
{code}
 

Initializing num_nulls to {{-1}} means we end up off by one in the end result.

> COMPUTE INCREMENTAL STATS sets num_nulls off-by-one
> ---
>
> Key: IMPALA-8566
> URL: https://issues.apache.org/jira/browse/IMPALA-8566
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> IMPALA-7659 added the population of NULL counts while computing stats, but 
> this functionality isn't working properly for incremental stats. The query is 
> produced correctly, but the null count set in the table is one lower than it 
> should be. In the case that the table has no nulls, this ends up setting a 
> '-1' count, which is interpreted as 'unknown'. In the case that there are 
> nulls, we'll just be a little inaccurate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8566) COMPUTE INCREMENTAL STATS sets num_nulls off-by-one

2019-05-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-8566:

Description: IMPALA-7659 added the population of NULL counts while 
computing stats, but this functionality isn't working properly for incremental 
stats. The query is produced correctly, but the null count set in the table is 
one lower than it should be. In the case that the table has no nulls, this ends 
up setting a '-1' count, which is interpreted as 'unknown'. In the case that 
there are nulls, we'll just be a little inaccurate.  (was: IMPALA-7659 added 
the population of NULL counts while computing stats, but this functionality 
isn't working properly for incremental stats. The query is produced correctly, 
but the null count isn't being properly propagated back to the table.)
Summary: COMPUTE INCREMENTAL STATS sets num_nulls off-by-one  (was: 
COMPUTE INCREMENTAL STATS does not set num_nulls)

> COMPUTE INCREMENTAL STATS sets num_nulls off-by-one
> ---
>
> Key: IMPALA-8566
> URL: https://issues.apache.org/jira/browse/IMPALA-8566
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> IMPALA-7659 added the population of NULL counts while computing stats, but 
> this functionality isn't working properly for incremental stats. The query is 
> produced correctly, but the null count set in the table is one lower than it 
> should be. In the case that the table has no nulls, this ends up setting a 
> '-1' count, which is interpreted as 'unknown'. In the case that there are 
> nulls, we'll just be a little inaccurate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8566) COMPUTE INCREMENTAL STATS does not set num_nulls

2019-05-20 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8566:
---

 Summary: COMPUTE INCREMENTAL STATS does not set num_nulls
 Key: IMPALA-8566
 URL: https://issues.apache.org/jira/browse/IMPALA-8566
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.2.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon


IMPALA-7659 added the population of NULL counts while computing stats, but this 
functionality isn't working properly for incremental stats. The query is 
produced correctly, but the null count isn't being properly propagated back to 
the table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-8458) Can't set numNull/maxSize/avgSize column stats with local catalog without also setting NDV

2019-05-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844229#comment-16844229
 ] 

Todd Lipcon edited comment on IMPALA-8458 at 5/20/19 7:21 PM:
--

I paged this code back into my head and remember why we had the weird 
workaround. The hack there was to deal with our odd handling of boolean stats. 
The LocalCatalog flow is:

- catalogd fetches stats from Hive, and converts them to our own internal 
ColumnStats object via ColumnStats.update:

{code}
  BooleanColumnStatsData boolStats = statsData.getBooleanStats();
  numNulls_ = boolStats.getNumNulls();
  numDistinctValues_ = (numNulls_ > 0) ? 3 : 2;
{code}

- impalad fetches stats from catalogd in CatalogdMetaProvider. This interface 
was originally built towards the "fetch directly from HMS" code path, so in 
this case, the wire protocol consists of the catalogd needing to send back the 
Hive ColumnStatitisticsObj type. So, we call 
ColumnStats.createHiveColStatsData() to convert the bool stats back to the Hive 
type:
{code}
  case BOOLEAN:
colStatsData.setBooleanStats(new BooleanColumnStatsData(1, -1, 
numNulls));
break;
{code}

When this hive object gets to the Impalad, it gets converted _back_ to Impala's 
ColumnStats type with the first code snippet above.


This Hive->Impala->Hive->Impala conversion round-tripping is somewhat lossy, 
particularly for bools, since Hive stores numFalse/numTrue whereas we want to 
have an NDV. I think we also end up with "lossiness" in the case that we didn't 
find compatible stats in the HMS, since we don't really have a clear 
distinction between "we have stats with unknown NDV" and "we don't have stats 
at all".

I'll see if I can clean this up. Perhaps the easiest route is to have the 
wire-protocol for fetch-from-catalogd just use the impala-internal stats object.



was (Author: tlipcon):
I paged this code back into my head and remember why we had the weird 
workaround. The hack there was to deal with our odd handling of boolean stats. 
The LocalCatalog flow is:

- catalogd fetches stats from Hive, and converts them to our own internal 
ColumnStats object via ColumnStats.update:
{code}
  BooleanColumnStatsData boolStats = statsData.getBooleanStats();
  numNulls_ = boolStats.getNumNulls();
  numDistinctValues_ = (numNulls_ > 0) ? 3 : 2;
{code}
- impalad fetches stats from catalogd in CatalogdMetaProvider. This interface 
was originally built towards the "fetch directly from HMS" code path, so in 
this case, the wire protocol consists of the catalogd needing to send back the 
Hive ColumnStatisticsObj type. So, we call 
ColumnStats.createHiveColStatsData() to convert the bool stats back to the Hive 
type:
{code}
  case BOOLEAN:
colStatsData.setBooleanStats(new BooleanColumnStatsData(1, -1, 
numNulls));
break;
{code}

When this hive object gets to the Impalad, it gets converted _back_ to Impala's 
ColumnStats type with the first code snippet above.


This Hive->Impala->Hive->Impala conversion round tripping is somewhat lossy, 
particularly for bools since Hive stores a numFalse/numTrue whereas we want to 
have an NDV. I think we also end up with "lossiness" in the case that we didn't 
find compatible stats in the HMS, since we don't really have a clear 
distinction between "we have stats with unknown NDV" and "we don't have stats 
at all".

I'll see if I can clean this up. Perhaps the easiest route is to have the 
wire-protocol for fetch-from-catalogd just use the impala-internal stats object.


> Can't set numNull/maxSize/avgSize column stats with local catalog without 
> also setting NDV
> --
>
> Key: IMPALA-8458
> URL: https://issues.apache.org/jira/browse/IMPALA-8458
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>
> Repro:
> {noformat}
> [tarmstrong-box2.ca.cloudera.com:21000] default> create table test_stats2(s 
> string);
> +-+
> | summary |
> +-+
> | Table has been created. |
> +-+
> Fetched 1 row(s) in 0.36s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | -1   | -1 | -1   | -1   |
> +++--++--+--+
> Fetched 1 row(s) in 0.02s
> 

[jira] [Commented] (IMPALA-8458) Can't set numNull/maxSize/avgSize column stats with local catalog without also setting NDV

2019-05-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844229#comment-16844229
 ] 

Todd Lipcon commented on IMPALA-8458:
-

I paged this code back into my head and remember why we had the weird 
workaround. The hack there was to deal with our odd handling of boolean stats. 
The LocalCatalog flow is:

- catalogd fetches stats from Hive, and converts them to our own internal 
ColumnStats object via ColumnStats.update:
{code}
  BooleanColumnStatsData boolStats = statsData.getBooleanStats();
  numNulls_ = boolStats.getNumNulls();
  numDistinctValues_ = (numNulls_ > 0) ? 3 : 2;
{code}
- impalad fetches stats from catalogd in CatalogdMetaProvider. This interface 
was originally built towards the "fetch directly from HMS" code path, so in 
this case, the wire protocol consists of the catalogd needing to send back the 
Hive ColumnStatisticsObj type. So, we call 
ColumnStats.createHiveColStatsData() to convert the bool stats back to the Hive 
type:
{code}
  case BOOLEAN:
colStatsData.setBooleanStats(new BooleanColumnStatsData(1, -1, 
numNulls));
break;
{code}

When this hive object gets to the Impalad, it gets converted _back_ to Impala's 
ColumnStats type with the first code snippet above.


This Hive->Impala->Hive->Impala conversion round tripping is somewhat lossy, 
particularly for bools since Hive stores a numFalse/numTrue whereas we want to 
have an NDV. I think we also end up with "lossiness" in the case that we didn't 
find compatible stats in the HMS, since we don't really have a clear 
distinction between "we have stats with unknown NDV" and "we don't have stats 
at all".

I'll see if I can clean this up. Perhaps the easiest route is to have the 
wire-protocol for fetch-from-catalogd just use the impala-internal stats object.


> Can't set numNull/maxSize/avgSize column stats with local catalog without 
> also setting NDV
> --
>
> Key: IMPALA-8458
> URL: https://issues.apache.org/jira/browse/IMPALA-8458
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>
> Repro:
> {noformat}
> [tarmstrong-box2.ca.cloudera.com:21000] default> create table test_stats2(s 
> string);
> +-+
> | summary |
> +-+
> | Table has been created. |
> +-+
> Fetched 1 row(s) in 0.36s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | -1   | -1 | -1   | -1   |
> +++--++--+--+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set 
> column stats s('avgSize'='1234');
> +-+
> | summary |
> +-+
> | Updated 0 partition(s) and 1 column(s). |
> +-+
> Fetched 1 row(s) in 0.14s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | -1   | -1 | -1   | -1   |
> +++--++--+--+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> alter table test_stats2 set 
> column stats s('maxSize'='1234');
> +-+
> | summary |
> +-+
> | Updated 0 partition(s) and 1 column(s). |
> +-+
> Fetched 1 row(s) in 0.10s
> [tarmstrong-box2.ca.cloudera.com:21000] default> show column stats 
> test_stats2;
> +++--++--+--+
> | Column | Type   | #Distinct Values | #Nulls | Max Size | Avg Size |
> +++--++--+--+
> | s  | STRING | -1   | -1 | -1   | -1   |
> +++--++--+--+
> Fetched 1 row(s) in 0.02s
> [tarmstrong-box2.ca.cloudera.com:21000] default> invalidate metadata 
> test_stats2;
> 

[jira] [Assigned] (IMPALA-7131) Support external data sources without catalogd

2019-05-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-7131:
---

Assignee: (was: Todd Lipcon)

Not actively working on this. This seems like a very niche feature so maybe we 
can deprecate it in the new catalog mode.

> Support external data sources without catalogd
> --
>
> Key: IMPALA-7131
> URL: https://issues.apache.org/jira/browse/IMPALA-7131
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Todd Lipcon
>Priority: Minor
>
> Currently it seems that external data sources are not persisted except in 
> memory on the catalogd. This means that it will be somewhat more difficult to 
> support this feature in the design of impalad without a catalogd.
> This JIRA is to eventually figure out a way to support this feature -- either 
> by supporting in-memory on a per-impalad basis, or perhaps by figuring out a 
> way to register them persistently in a file system directory, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8438) List valid writeIds for a ACID table

2019-05-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8438.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> List valid writeIds for a ACID table
> 
>
> Key: IMPALA-8438
> URL: https://issues.apache.org/jira/browse/IMPALA-8438
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Critical
>  Labels: impala-acid
> Fix For: Impala 3.3.0
>
>
> Before listing the partitions of a table, fetch and store the list of valid 
> (committed) writeIds for the table. This will be used later during 
> planning/refresh.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-7957) UNION ALL query returns incorrect results

2019-05-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-7957:
---

Assignee: Quanlong Huang  (was: Paul Rogers)

> UNION ALL query returns incorrect results
> -
>
> Key: IMPALA-7957
> URL: https://issues.apache.org/jira/browse/IMPALA-7957
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.12.0
>Reporter: Luis E Martinez-Poblete
>Assignee: Quanlong Huang
>Priority: Blocker
>  Labels: correctness
>
> Synopsis:
> =
> UNION ALL query returns incorrect results
> Problem:
> 
> Customer reported a UNION ALL query returning incorrect results. The UNION 
> ALL query has 2 legs, but Impala is only returning information from one leg.
> Issue can be reproduced in the latest version of Impala. Below is the 
> reproduction case:
> {noformat}
> create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
> insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
> insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
> insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);
> SELECT t.c1
> FROM
>  (SELECT c1, c2
>  FROM mytest_t) t
> LEFT JOIN
>  (SELECT c1, c2
>  FROM mytest_t
>  WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
> UNION ALL
> VALUES (NULL)
> {noformat}
> The above query produces the following execution plan:
> {noformat}
> ++
> | Explain String  
>|
> ++
> | Max Per-Host Resource Reservation: Memory=34.02MB Threads=5 
>|
> | Per-Host Resource Estimates: Memory=2.06GB  
>|
> | WARNING: The following tables are missing relevant table and/or column 
> statistics. |
> | default.mytest_t
>|
> | 
>|
> | PLAN-ROOT SINK  
>|
> | |   
>|
> | 06:EXCHANGE [UNPARTITIONED] 
>|
> | |   
>|
> | 00:UNION
>|
> | |  constant-operands=1  
>|
> | |   
>|
> | 04:SELECT   
>|
> | |  predicates: default.mytest_t.c1 = default.mytest_t.c2
>|
> | |   
>|
> | 03:HASH JOIN [LEFT OUTER JOIN, BROADCAST]   
>|
> | |  hash predicates: c2 = c2 
>|
> | |   
>|
> | |--05:EXCHANGE [BROADCAST]  
>|
> | |  |
>|
> | |  02:SCAN HDFS [default.mytest_t]  
>|
> | | partitions=1/1 files=3 size=192B  
>|
> | | predicates: c2 = c1   
>|
> | |   
>|
> | 01:SCAN HDFS [default.mytest_t] 
>|
> |partitions=1/1 files=3 size=192B 
>|
> ++
> {noformat}
> The issue is in operator 4:
> {noformat}
> | 04:SELECT |
> | | predicates: default.mytest_t.c1 = default.mytest_t.c2 |
> {noformat}
> It's definitely a bug with predicate placement - that c1 = c2 predicate 
> shouldn't be evaluated outside the right branch of the LEFT JOIN.
> Thanks,
> Luis Martinez.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8562) Data cache should skip scan range with mtime == -1

2019-05-17 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842400#comment-16842400
 ] 

Todd Lipcon commented on IMPALA-8562:
-

Perhaps we should add a CHECK_GE(mtime, 0) to try to prevent these issues in 
the future? If we just skip caching, this bug would only surface as silent, 
ineffective cache misses rather than an obvious error.

Another possible improvement here is to make use of HDFS "fileIds", which are 
unique inode numbers.
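A sketch of the kind of guard being suggested (names are illustrative, not the actual data cache API):

{code}
// Illustrative only; Impala's data cache uses different types and names.
#include <glog/logging.h>
#include <cstdint>

// Returns true if an entry keyed on this mtime is safe to insert into the cache.
bool SafeToCache(int64_t mtime) {
  // mtime == -1 means the modification time was never resolved; keying on it
  // risks serving stale data after the file is rewritten (see IMPALA-8561).
  DCHECK_GE(mtime, 0) << "unresolved mtime passed to the data cache";
  return mtime >= 0;
}
{code}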

> Data cache should skip scan range with mtime == -1
> --
>
> Key: IMPALA-8562
> URL: https://issues.apache.org/jira/browse/IMPALA-8562
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Michael Ho
>Assignee: Michael Ho
>Priority: Blocker
>
> As show in IMPALA-8561, using mtime == -1 as part of cache key may lead to 
> reading stale data. Data cache should probably just skip caching those 
> entries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8489) TestRecoverPartitions.test_post_invalidate fails with IllegalStateException with local catalog

2019-05-15 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840713#comment-16840713
 ] 

Todd Lipcon commented on IMPALA-8489:
-

I'm having trouble reproducing this. Was this flaky for you? Or consistently 
failing? Were you using Hive 2 or Hive 3 for this test?

> TestRecoverPartitions.test_post_invalidate fails with IllegalStateException 
> with local catalog
> --
>
> Key: IMPALA-8489
> URL: https://issues.apache.org/jira/browse/IMPALA-8489
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.3.0
>Reporter: Tim Armstrong
>Assignee: Todd Lipcon
>Priority: Critical
>
> {noformat}
> metadata/test_recover_partitions.py:279: in test_post_invalidate
> "INSERT INTO TABLE %s PARTITION(i=002, p='p2') VALUES(4)" % FQ_TBL_NAME)
> common/impala_test_suite.py:620: in wrapper
> return function(*args, **kwargs)
> common/impala_test_suite.py:628: in execute_query_expect_success
> result = cls.__execute_query(impalad_client, query, query_options, user)
> common/impala_test_suite.py:722: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:180: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:187: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:364: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:385: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EQuery aborted:IllegalArgumentException: no such partition id 6244
> {noformat}
> The failure is reproducible for me locally with catalog v2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8533) Impala daemon crash on sort

2019-05-15 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840659#comment-16840659
 ] 

Todd Lipcon commented on IMPALA-8533:
-

Interestingly, a UNION ALL with all constants produces a 0-size row, but a 
normal SELECT with all constants produces a row with space for the values:

{code}
Query: explain select 1 v from (select 1 union all select 1) t
+-+
| Explain String  |
+-+
| Max Per-Host Resource Reservation: Memory=0B Threads=1  |
| Per-Host Resource Estimates: Memory=10MB|
| Codegen disabled by planner |
| Analyzed query: SELECT CAST(1 AS TINYINT) v FROM (SELECT CAST(1 AS TINYINT) |
| UNION ALL SELECT CAST(1 AS TINYINT)) t  |
| |
| F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1   |
| Per-Host Resources: mem-estimate=0B mem-reservation=0B thread-reservation=1 |
|   PLAN-ROOT SINK|
|   |  mem-estimate=0B mem-reservation=0B thread-reservation=0|
|   | |
|   00:UNION  |
|  constant-operands=2|
|  mem-estimate=0B mem-reservation=0B thread-reservation=0|
|  tuple-ids=0 row-size=0B cardinality=2  |
|  in pipelines:|
+-+

Query: explain select 1 v from (select 1 ) t
++
| Explain String
 |
++
| Max Per-Host Resource Reservation: Memory=0B Threads=1
 |
| Per-Host Resource Estimates: Memory=10MB  
 |
| Codegen disabled by planner   
 |
| Analyzed query: SELECT CAST(1 AS TINYINT) v FROM (SELECT CAST(1 AS TINYINT)) 
t |
|   
 |
| F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 
 |
| Per-Host Resources: mem-estimate=0B mem-reservation=0B thread-reservation=1   
 |
|   PLAN-ROOT SINK  
 |
|   |  mem-estimate=0B mem-reservation=0B thread-reservation=0  
 |
|   |   
 |
|   00:UNION
 |
|  constant-operands=1  
 |
|  mem-estimate=0B mem-reservation=0B thread-reservation=0  
 |
|  tuple-ids=0 row-size=1B cardinality=1
 |
|  in pipelines:  
 |
++
{code}

> Impala daemon crash on sort
> ---
>
> Key: IMPALA-8533
> URL: https://issues.apache.org/jira/browse/IMPALA-8533
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Jeremy Beard
>Assignee: Todd Lipcon
>Priority: Blocker
>  Labels: crash
> Attachments: fatal_error.txt, hs_err_pid8552.log, query.txt
>
>
> Running the attached data generation query crashes the Impala coordinator 
> daemon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8533) Impala daemon crash on sort

2019-05-15 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840648#comment-16840648
 ] 

Todd Lipcon commented on IMPALA-8533:
-

The problem here seems to be that the sorter is getting optimized to have a 
zero-length sort tuple, so we crash on:
{code}
110 Sorter::Run::Run(Sorter* parent, TupleDescriptor* sort_tuple_desc, bool 
initial_run)
111   : sorter_(parent),
112 sort_tuple_desc_(sort_tuple_desc),
113 sort_tuple_size_(sort_tuple_desc->byte_size()),
114 page_capacity_(parent->page_len_ / sort_tuple_size_),
{code}
(sort_tuple_size_ is 0, so we div-by-zero)
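An illustrative guard for that divide (a sketch only; the real fix may instead be to stop the planner from producing a 0-byte sort tuple):

{code}
// Illustrative only; the actual Sorter::Run code differs.
#include <glog/logging.h>
#include <cstdint>

int64_t ComputePageCapacity(int64_t page_len, int64_t sort_tuple_size) {
  // A 0-byte sort tuple should never reach the sorter: fail fast in debug
  // builds and avoid the divide-by-zero in release builds.
  DCHECK_GT(sort_tuple_size, 0) << "sorter created with an empty sort tuple";
  if (sort_tuple_size <= 0) sort_tuple_size = 1;
  return page_len / sort_tuple_size;
}
{code}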

> Impala daemon crash on sort
> ---
>
> Key: IMPALA-8533
> URL: https://issues.apache.org/jira/browse/IMPALA-8533
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Jeremy Beard
>Assignee: Todd Lipcon
>Priority: Blocker
>  Labels: crash
> Attachments: fatal_error.txt, hs_err_pid8552.log, query.txt
>
>
> Running the attached data generation query crashes the Impala coordinator 
> daemon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8533) Impala daemon crash on sort

2019-05-15 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840646#comment-16840646
 ] 

Todd Lipcon commented on IMPALA-8533:
-

A bit more minimized crasher:
{code}
WITH
base_10 AS (
SELECT 1 UNION ALL SELECT 1
),
base_10k AS (
SELECT 2 constant FROM base_10 b1
)
SELECT ROW_NUMBER() OVER (ORDER BY b1.constant) row_num
FROM base_10k b1
{code}

> Impala daemon crash on sort
> ---
>
> Key: IMPALA-8533
> URL: https://issues.apache.org/jira/browse/IMPALA-8533
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Jeremy Beard
>Assignee: Todd Lipcon
>Priority: Blocker
>  Labels: crash
> Attachments: fatal_error.txt, hs_err_pid8552.log, query.txt
>
>
> Running the attached data generation query crashes the Impala coordinator 
> daemon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-8533) Impala daemon crash on sort

2019-05-15 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-8533:
---

Assignee: Todd Lipcon

> Impala daemon crash on sort
> ---
>
> Key: IMPALA-8533
> URL: https://issues.apache.org/jira/browse/IMPALA-8533
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.2.0
>Reporter: Jeremy Beard
>Assignee: Todd Lipcon
>Priority: Blocker
>  Labels: crash
> Attachments: fatal_error.txt, hs_err_pid8552.log, query.txt
>
>
> Running the attached data generation query crashes the Impala coordinator 
> daemon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8551) GRANT gives confusing error message

2019-05-15 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8551:
---

 Summary: GRANT gives confusing error message
 Key: IMPALA-8551
 URL: https://issues.apache.org/jira/browse/IMPALA-8551
 Project: IMPALA
  Issue Type: Bug
  Components: Security
Affects Versions: Impala 3.3.0
Reporter: Todd Lipcon
Assignee: Fredy Wijaya


Tried performing a grant from the Impala shell in a kerberized cluster with 
Ranger enabled and got a strange error message:

{noformat}
[nightly7x-2:21000] default> grant all on table t to group asdf;
Query: grant all on table t to group asdf
Query submitted at: 2019-05-15 11:24:20 (Coordinator: 
https://nightly7x-2.vpc.cloudera.com:25000)
ERROR: InternalException: HTTP 400 Error: syst...@vpc.cloudera.com is Not Found
{noformat}

Not sure what's going on here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-2029) Impala in CDH 5.2.0 fails to compile with hadoop 2.7

2019-05-13 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-2029.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Impala in CDH 5.2.0 fails to compile with hadoop 2.7
> 
>
> Key: IMPALA-2029
> URL: https://issues.apache.org/jira/browse/IMPALA-2029
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.2
> Environment: Red Hat 6.4 and gcc 4.3.4
>Reporter: Varun Saxena
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> Compilation fails with below error message :
> ../../build/release/exec/libExec.a(hbase-table-scanner.cc.o): In function 
> `impala::HBaseTableScanner::Init()':
> /usr1/code/Impala/code/current/impala/be/src/exec/hbase-table-scanner.cc:113: 
> undefined reference to `getJNIEnv'
> ../../build/release/exprs/libExprs.a(hive-udf-call.cc.o):/usr1/code/Impala/code/current/impala/be/src/exprs/hive-udf-call.cc:227:
>  more undefined references to `getJNIEnv' follow
> collect2: ld returned 1 exit status
> make[3]: *** [be/build/release/service/impalad] Error 1
> make[2]: *** [be/src/service/CMakeFiles/impalad.dir/all] Error 2
> make[1]: *** [be/src/service/CMakeFiles/impalad.dir/rule] Error 2
> make: *** [impalad] Error 2
> Compiler Impala Failed, exit
> This issue is coming because libhdfs in hadoop 2.7.0 has made visibility of 
> getJNIEnv() as hidden.
> But shouldn't Impala create its own JNIEnv ?
> These HBase related files seem to have no direct connection with libhdfs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8542) Access trace collection for data cache

2019-05-13 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8542:
---

 Summary: Access trace collection for data cache
 Key: IMPALA-8542
 URL: https://issues.apache.org/jira/browse/IMPALA-8542
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: Todd Lipcon
Assignee: Todd Lipcon


Now that we have a remote-read data cache, it would be useful to log an access 
trace. The trace can then be fed back into various cache policy simulators to 
compare their relative performance and do "what if" analysis (how would the hit 
rate react with larger/smaller capacities?).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8529) ccache is ignored when using ninja generator

2019-05-08 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8529:
---

 Summary: ccache is ignored when using ninja generator
 Key: IMPALA-8529
 URL: https://issues.apache.org/jira/browse/IMPALA-8529
 Project: IMPALA
  Issue Type: Task
Reporter: Todd Lipcon


The CMakeLists.txt sets up ccache by using RULE_LAUNCH_PREFIX, which is only 
respected by the Makefile generator. So, if we use ninja (which is generally 
better at parallelism) then ccache won't kick in. Newer versions of cmake have 
more explicit support for ccache that ought to also work with the ninja 
generator.
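A hedged sketch of what that could look like with a newer cmake (not the actual change; CMAKE_<LANG>_COMPILER_LAUNCHER is honored by both the Makefile and Ninja generators):

{code}
# Sketch only: prefer the per-language launcher variables over RULE_LAUNCH_PREFIX.
find_program(CCACHE_PROGRAM ccache)
if(CCACHE_PROGRAM)
  set(CMAKE_C_COMPILER_LAUNCHER "${CCACHE_PROGRAM}")
  set(CMAKE_CXX_COMPILER_LAUNCHER "${CCACHE_PROGRAM}")
endif()
{code}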



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-2029) Impala in CDH 5.2.0 fails to compile with hadoop 2.7

2019-05-07 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835166#comment-16835166
 ] 

Todd Lipcon commented on IMPALA-2029:
-

This has come up again with new Hadoop releases. Going to take another approach 
here: rather than trying to create the VM ourselves, we'll just call into a 
libhdfs function, let libhdfs do the attaching, and then use the GetEnv() JNI 
call to find the attached environment. That simplifies the worries about 
agreeing on who will do the _detach_.
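Roughly the shape of that approach, sketched under the assumption that some earlier libhdfs call has already created the JVM and attached the current thread (the function name here is made up):

{code}
#include <jni.h>

// Returns the JNIEnv for the current thread, or nullptr if no JVM exists yet
// or the thread isn't attached (i.e. no libhdfs call has run on it).
JNIEnv* GetAttachedEnv() {
  JavaVM* vm = nullptr;
  jsize num_vms = 0;
  if (JNI_GetCreatedJavaVMs(&vm, 1, &num_vms) != JNI_OK || num_vms == 0) {
    return nullptr;  // No JVM yet; a libhdfs call would create/attach one.
  }
  void* env = nullptr;
  // GetEnv() succeeds only for threads that are already attached, so libhdfs
  // remains responsible for attach/detach and we never double-attach.
  if (vm->GetEnv(&env, JNI_VERSION_1_8) != JNI_OK) return nullptr;
  return static_cast<JNIEnv*>(env);
}
{code}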

> Impala in CDH 5.2.0 fails to compile with hadoop 2.7
> 
>
> Key: IMPALA-2029
> URL: https://issues.apache.org/jira/browse/IMPALA-2029
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.2
> Environment: Red Hat 6.4 and gcc 4.3.4
>Reporter: Varun Saxena
>Assignee: Todd Lipcon
>Priority: Major
>
> Compilation fails with below error message :
> ../../build/release/exec/libExec.a(hbase-table-scanner.cc.o): In function 
> `impala::HBaseTableScanner::Init()':
> /usr1/code/Impala/code/current/impala/be/src/exec/hbase-table-scanner.cc:113: 
> undefined reference to `getJNIEnv'
> ../../build/release/exprs/libExprs.a(hive-udf-call.cc.o):/usr1/code/Impala/code/current/impala/be/src/exprs/hive-udf-call.cc:227:
>  more undefined references to `getJNIEnv' follow
> collect2: ld returned 1 exit status
> make[3]: *** [be/build/release/service/impalad] Error 1
> make[2]: *** [be/src/service/CMakeFiles/impalad.dir/all] Error 2
> make[1]: *** [be/src/service/CMakeFiles/impalad.dir/rule] Error 2
> make: *** [impalad] Error 2
> Compiler Impala Failed, exit
> This issue is coming because libhdfs in hadoop 2.7.0 has made visibility of 
> getJNIEnv() as hidden.
> But shouldn't Impala create its own JNIEnv ?
> These HBase related files seem to have no direct connection with libhdfs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-2029) Impala in CDH 5.2.0 fails to compile with hadoop 2.7

2019-05-07 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-2029:
---

Assignee: Todd Lipcon  (was: Joe McDonnell)

> Impala in CDH 5.2.0 fails to compile with hadoop 2.7
> 
>
> Key: IMPALA-2029
> URL: https://issues.apache.org/jira/browse/IMPALA-2029
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.2
> Environment: Red Hat 6.4 and gcc 4.3.4
>Reporter: Varun Saxena
>Assignee: Todd Lipcon
>Priority: Major
>
> Compilation fails with below error message :
> ../../build/release/exec/libExec.a(hbase-table-scanner.cc.o): In function 
> `impala::HBaseTableScanner::Init()':
> /usr1/code/Impala/code/current/impala/be/src/exec/hbase-table-scanner.cc:113: 
> undefined reference to `getJNIEnv'
> ../../build/release/exprs/libExprs.a(hive-udf-call.cc.o):/usr1/code/Impala/code/current/impala/be/src/exprs/hive-udf-call.cc:227:
>  more undefined references to `getJNIEnv' follow
> collect2: ld returned 1 exit status
> make[3]: *** [be/build/release/service/impalad] Error 1
> make[2]: *** [be/src/service/CMakeFiles/impalad.dir/all] Error 2
> make[1]: *** [be/src/service/CMakeFiles/impalad.dir/rule] Error 2
> make: *** [impalad] Error 2
> Compiler Impala Failed, exit
> This issue is coming because libhdfs in hadoop 2.7.0 has made visibility of 
> getJNIEnv() as hidden.
> But shouldn't Impala create its own JNIEnv ?
> These HBase related files seem to have no direct connection with libhdfs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-8516) Update maven on Jenkins Ubuntu build slaves

2019-05-07 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-8516:
---

Assignee: Todd Lipcon

> Update maven on Jenkins Ubuntu build slaves
> ---
>
> Key: IMPALA-8516
> URL: https://issues.apache.org/jira/browse/IMPALA-8516
> Project: IMPALA
>  Issue Type: Task
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> Currently we're installing maven from an apt repository, which ends up giving 
> us a relatively old version. It seems we might be hitting HTTPCLIENT-1478 in 
> the version of httpclient that ends up getting bundled into that package. We 
> should update to an explicitly downloaded maven tarball and see if it fixes 
> the hangs in SSL connections.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8516) Update maven on Jenkins Ubuntu build slaves

2019-05-07 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8516:
---

 Summary: Update maven on Jenkins Ubuntu build slaves
 Key: IMPALA-8516
 URL: https://issues.apache.org/jira/browse/IMPALA-8516
 Project: IMPALA
  Issue Type: Task
Reporter: Todd Lipcon


Currently we're installing maven from an apt repository, which ends up giving 
us a relatively old version. It seems we might be hitting HTTPCLIENT-1478 in 
the version of httpclient that ends up getting bundled into that package. We 
should update to an explicitly downloaded maven tarball and see if it fixes the 
hangs in SSL connections.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8509) Data load schema generation should lazily evaluate shell substitutions

2019-05-06 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8509:
---

 Summary: Data load schema generation should lazily evaluate shell 
substitutions
 Key: IMPALA-8509
 URL: https://issues.apache.org/jira/browse/IMPALA-8509
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: Todd Lipcon
Assignee: Todd Lipcon


Some of the data loading commands (the testescape_* tables in particular) include 
shell statements that need to be executed. These are evaluated by the 
'eval_section' function in generate-schema-statements.py. However, 'eval_section' 
is called eagerly even if the table already exists, which makes 
generate-schema-statements.py quite slow on every run.
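The gist of the change, as a rough sketch (the real generate-schema-statements.py structure differs; 'table_exists' is a stand-in for however the script detects that):

{code}
# Illustrative sketch only.
def build_statements(section_text, table_exists, eval_section):
    """Return the statements for a section, evaluating shell substitutions lazily."""
    if table_exists:
        # The table is already loaded: skip the expensive shell evaluation entirely.
        return None
    # Only now pay the cost of running the embedded shell commands.
    return eval_section(section_text)
{code}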



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-3292) Kudu scanner should not fail if KeepAlive request fails

2019-04-30 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-3292:

Fix Version/s: Impala 2.7.0

> Kudu scanner should not fail if KeepAlive request fails
> ---
>
> Key: IMPALA-3292
> URL: https://issues.apache.org/jira/browse/IMPALA-3292
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Kudu_Impala
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: Impala 2.7.0
>
>
> The KeepKuduScannerAlive() function can fail with ServiceUnavailable if it is 
> running against a very heavily-loaded tablet server. This is OK -- it should 
> ignore the warning and just try again after a short time, rather than failing 
> the whole query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8454) Recursively list files within transactional tables

2019-04-29 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-8454:

Docs Text: Files will now be listed recursively within tables. This 
improves compatibility with Hive-on-Tez and other execution engines.

> Recursively list files within transactional tables
> --
>
> Key: IMPALA-8454
> URL: https://issues.apache.org/jira/browse/IMPALA-8454
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: impala-acid
> Fix For: Impala 3.3.0
>
>
> For transactional tables, the data files are not directly within the 
> partition directories, but instead are stored within subdirectories 
> corresponding to writeIds, compactions, etc. To support this, we need to be 
> able to recursively load file lists within partition directories.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8454) Recursively list files within transactional tables

2019-04-29 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8454.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Recursively list files within transactional tables
> --
>
> Key: IMPALA-8454
> URL: https://issues.apache.org/jira/browse/IMPALA-8454
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
>  Labels: impala-acid
> Fix For: Impala 3.3.0
>
>
> For transactional tables, the data files are not directly within the 
> partition directories, but instead are stored within subdirectories 
> corresponding to writeIds, compactions, etc. To support this, we need to be 
> able to recursively load file lists within partition directories.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8454) Recursively list files within transactional tables

2019-04-25 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826527#comment-16826527
 ] 

Todd Lipcon commented on IMPALA-8454:
-

Chatting with Gopal, it turns out that actually the Hive-on-Tez behavior is to 
always recursively list directories, including for partitions of external 
tables. This is actually important for interop on external tables because Tez 
will write to subdirectories for an insert if the insert contains a 'UNION ALL' 
(even in Hive 2 non-ACID).

We should probably consider making this a global flag and enabling by default 
(with the ability to roll back in case it breaks someone)

> Recursively list files within transactional tables
> --
>
> Key: IMPALA-8454
> URL: https://issues.apache.org/jira/browse/IMPALA-8454
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> For transactional tables, the data files are not directly within the 
> partition directories, but instead are stored within subdirectories 
> corresponding to writeIds, compactions, etc. To support this, we need to be 
> able to recursively load file lists within partition directories.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8454) Recursively list files within transactional tables

2019-04-25 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8454:
---

 Summary: Recursively list files within transactional tables
 Key: IMPALA-8454
 URL: https://issues.apache.org/jira/browse/IMPALA-8454
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Todd Lipcon
Assignee: Todd Lipcon


For transactional tables, the data files are not directly within the partition 
directories, but instead are stored within subdirectories corresponding to 
writeIds, compactions, etc. To support this, we need to be able to recursively 
load file lists within partition directories.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-8454) Recursively list files within transactional tables

2019-04-25 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated IMPALA-8454:

Issue Type: Improvement  (was: Bug)

> Recursively list files within transactional tables
> --
>
> Key: IMPALA-8454
> URL: https://issues.apache.org/jira/browse/IMPALA-8454
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> For transactional tables, the data files are not directly within the 
> partition directories, but instead are stored within subdirectories 
> corresponding to writeIds, compactions, etc. To support this, we need to be 
> able to recursively load file lists within partition directories.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8406) Failed REFRESH can partially modify table without bumping version number

2019-04-10 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814951#comment-16814951
 ] 

Todd Lipcon commented on IMPALA-8406:
-

Another oddity is the message we have today which says "Failed to load file 
metadata for 1 paths for table u_todd.test. Table's file metadata could be 
partially loaded. Check the Catalog server log for more details."

The "metadata could be partially loaded" sounds like you'll be able to query 
the table except for the partitions that had an error, which is sort-of-true 
after a REFRESH, but this same error message is currently used on an initial 
load of a table, in which case the table won't be queryable at all.

> Failed REFRESH can partially modify table without bumping version number
> 
>
> Key: IMPALA-8406
> URL: https://issues.apache.org/jira/browse/IMPALA-8406
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently, various incremental operations in the catalogd modify Table 
> objects in place, including REFRESH, which modifies each partition. In this 
> case, if one partition fails to refresh (eg due to incorrect partitions or 
> some other file access problem), other partitions can still be modified, 
> either because they were modified first (in a non-parallel operation) or 
> modified in parallel (for REFRESH).
> In this case, the REFRESH operation will throw an Exception back to the user, 
> but in fact it has modified the catalog entry. The version number, however, 
> is not bumped, which breaks some invariants of the catalog that an object 
> doesn't change without changing version numbers.
> This also produces some unexpected behavior such as:
> - SHOW FILES IN t;
> - REFRESH t; -- gets a failure
> - SHOW FILES in t; -- see the same result as originally
> - ALTER TABLE t SET UNCACHED; -- bumps the version number due to unrelated 
> operation
> - SHOW FILES IN t; -- the set of files has changed due to the earlier 
> partially-complete REFRESH



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8406) Failed REFRESH can partially modify table without bumping version number

2019-04-10 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814948#comment-16814948
 ] 

Todd Lipcon commented on IMPALA-8406:
-

It seems like the expected behavior here would be one of:
(a) mark the table as an IncompleteTable with a table loading exception, 
indicating the failure of the REFRESH (as if it were freshly loaded after 
INVALIDATE METADATA). Any future access attempt will get this exception.
(b) leave the table unchanged, as if the refresh had not been executed at all.

The current behavior seems like the worst of both worlds.


> Failed REFRESH can partially modify table without bumping version number
> 
>
> Key: IMPALA-8406
> URL: https://issues.apache.org/jira/browse/IMPALA-8406
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.2.0
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently, various incremental operations in the catalogd modify Table 
> objects in place, including REFRESH, which modifies each partition. In this 
> case, if one partition fails to refresh (eg due to incorrect partitions or 
> some other file access problem), other partitions can still be modified, 
> either because they were modified first (in a non-parallel operation) or 
> modified in parallel (for REFRESH).
> In this case, the REFRESH operation will throw an Exception back to the user, 
> but in fact it has modified the catalog entry. The version number, however, 
> is not bumped, which breaks some invariants of the catalog that an object 
> doesn't change without changing version numbers.
> This also produces some unexpected behavior such as:
> - SHOW FILES IN t;
> - REFRESH t; -- gets a failure
> - SHOW FILES in t; -- see the same result as originally
> - ALTER TABLE t SET UNCACHED; -- bumps the version number due to unrelated 
> operation
> - SHOW FILES IN t; -- the set of files has changed due to the earlier 
> partially-complete REFRESH



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8406) Failed REFRESH can partially modify table without bumping version number

2019-04-10 Thread Todd Lipcon (JIRA)
Todd Lipcon created IMPALA-8406:
---

 Summary: Failed REFRESH can partially modify table without bumping 
version number
 Key: IMPALA-8406
 URL: https://issues.apache.org/jira/browse/IMPALA-8406
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.2.0
Reporter: Todd Lipcon


Currently, various incremental operations in the catalogd modify Table objects 
in place, including REFRESH, which modifies each partition. In this case, if 
one partition fails to refresh (eg due to incorrect partitions or some other 
file access problem), other partitions can still be modified, either because 
they were modified first (in a non-parallel operation) or modified in parallel 
(for REFRESH).

In this case, the REFRESH operation will throw an Exception back to the user, 
but in fact it has modified the catalog entry. The version number, however, is 
not bumped, which breaks some invariants of the catalog that an object doesn't 
change without changing version numbers.

This also produces some unexpected behavior such as:
- SHOW FILES IN t;
- REFRESH t; -- gets a failure
- SHOW FILES in t; -- see the same result as originally
- ALTER TABLE t SET UNCACHED; -- bumps the version number due to unrelated 
operation
- SHOW FILES IN t; -- the set of files has changed due to the earlier 
partially-complete REFRESH





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7450) catalogd should use thread names to make jstack more readable

2019-04-10 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-7450.
-
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> catalogd should use thread names to make jstack more readable
> -
>
> Key: IMPALA-7450
> URL: https://issues.apache.org/jira/browse/IMPALA-7450
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>  Labels: supportability
> Fix For: Impala 3.2.0
>
>
> Currently when long refresh or DDL operations are being processed, it's hard 
> to understand what's going on when looking at a jstack. We should have such 
> potentially-long-running operations temporarily modify the current thread's 
> name to indicate what action is being taken so we can debug more easily.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7540) Intern common strings in catalog

2019-04-10 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-7540.
-
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

> Intern common strings in catalog
> 
>
> Key: IMPALA-7540
> URL: https://issues.apache.org/jira/browse/IMPALA-7540
> Project: IMPALA
>  Issue Type: Improvement
>Affects Versions: Impala 3.1.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Fix For: Impala 3.2.0
>
>
> Using jxray shows that there are many common duplicate strings in the 
> catalog. For example, each table repeats the database name, and metadata like 
> the HMS parameter maps reuse a lot of common strings like "EXTERNAL" or 
> "transient_lastDdlTime". We should intern these to save memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-7047) REFRESH on unpartitioned tables calls getBlockLocations on every file

2019-04-10 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-7047.
-
   Resolution: Fixed
Fix Version/s: Impala 3.2.0

Yep, thanks for catching this.

> REFRESH on unpartitioned tables calls getBlockLocations on every file
> -
>
> Key: IMPALA-7047
> URL: https://issues.apache.org/jira/browse/IMPALA-7047
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.13.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>  Labels: metadata
> Fix For: Impala 3.2.0
>
>
> In HdfsTable.updateUnpartitionedTableFileMd() the existing default Partition 
> object is reset, and a new empty one is created. It then calls 
> refreshPartitionFileMetadata with this new partition which has an empty list 
> of file descriptors. This ends up listing the directory, and for each file, 
> since it doesn't find it in the empty descriptor list, will make a separate 
> RPC to HDFS to get the locations.
> This is quite wasteful vs just using the API that returns the located 
> statuses for the directory.
> Alternatively, it seems like it should probably keep around the old file 
> descriptor list in the new Partition object so that the incremental refresh 
> path can work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-4475) Compress ExecPlanFragment before shipping it to worker nodes to reduce network traffic

2019-03-28 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16804219#comment-16804219
 ] 

Todd Lipcon commented on IMPALA-4475:
-

Perhaps at this point it would be better to move the appropriate 
ExecQueryFInstances RPC to krpc, and then implement optional compression in 
general for KRPC? Or if we end up using a sidecar to encapsulate the serialized 
thrift plan (because converting it to protobuf is a ton of work) we can easily 
compress just the sidecar.

> Compress ExecPlanFragment before shipping it to worker nodes to reduce 
> network traffic
> --
>
> Key: IMPALA-4475
> URL: https://issues.apache.org/jira/browse/IMPALA-4475
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Distributed Exec
>Affects Versions: Impala 2.6.0
>Reporter: Mostafa Mokhtar
>Assignee: Vuk Ercegovac
>Priority: Major
>  Labels: ramp-up, scalability
> Attachments: count_store_returns.txt.zip, 
> slow_query_start_250K_partitions_134nodes.txt
>
>
> Sending the ExecPlanFragment to remote nodes dominates the query startup time 
> on clusters larger than 100 nodes, size of the ExecPlanFragment grows with 
> number of tables, blocks and partitions in the table. 
> On large cluster this is limits query throughput.
> From TPC-DS Q11 on 1K node cluster
> {code}
> Query Timeline: 5m6s
>- Query submitted: 75.256us (75.256us)
>- Planning finished: 1s580ms (1s580ms)
>- Submit for admission: 2s376ms (795.652ms)
>- Completed admission: 2s377ms (1.512ms)
>- Ready to start 15993 fragment instances: 2s458ms (80.378ms)
>- First dynamic filter received: 2m35s (2m33s)
>- All 15993 fragment instances started: 2m35s (40.934ms)
>- Rows available: 4m53s (2m17s)
>- First row fetched: 4m53s (176.254ms)
>- Unregister query: 4m58s (4s828ms)
>  - ComputeScanRangeAssignmentTimer: 600.086ms
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-3430) Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans

2019-03-21 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798575#comment-16798575
 ] 

Todd Lipcon commented on IMPALA-3430:
-

One particularly common case (and maybe easiest to implement) is an 
uncorrelated subquery. For example, a query like:

{code}
select count(*) from t where c < (select avg(c) from t);
{code}

This gets planned as a nested-loop-join against a one-row table (materialized 
from the subquery). In that case it's trivial to take the non-equijoin and 
propagate it to a runtime "max" filter on the scan.

(it may be that the fix for this special case falls out of a more general 
implementation, but if the general implementation is tough it might be worth 
attacking this one because it's relatively common)

> Runtime filter : Extend runtime filter to support Min/Max values for HDFS 
> scans
> ---
>
> Key: IMPALA-3430
> URL: https://issues.apache.org/jira/browse/IMPALA-3430
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 2.6.0
>Reporter: Mostafa Mokhtar
>Priority: Minor
>  Labels: performance, runtime-filters
>
> Annotating Runtime filters with Min/Max values can help with
> * Inequality joins 
> * Pushing more efficient filters to the scan
> * Used to skip reading Parquet blocks reducing IO.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8322) S3 tests encounter "timed out waiting for receiver fragment instance"

2019-03-21 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798551#comment-16798551
 ] 

Todd Lipcon commented on IMPALA-8322:
-

I'll put up a review in a minute with more TRACE() calls in these code paths; 
hopefully we'll get a better smoking gun out of the tests if they keep failing.

> S3 tests encounter "timed out waiting for receiver fragment instance"
> -
>
> Key: IMPALA-8322
> URL: https://issues.apache.org/jira/browse/IMPALA-8322
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
> Attachments: run_tests_swimlane.json.gz
>
>
> This has been seen multiple times when running s3 tests:
> {noformat}
> query_test/test_join_queries.py:57: in test_basic_joins
> self.run_test_case('QueryTest/joins', new_vector)
> common/impala_test_suite.py:472: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:699: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:174: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:183: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:360: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:381: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> EQuery aborted:Sender 127.0.0.1 timed out waiting for receiver fragment 
> instance: 6c40d992bb87af2f:0ce96e5d0007, dest node: 4{noformat}
> This is related to IMPALA-6818. On a bad run, there are various time outs in 
> the impalad logs:
> {noformat}
> I0316 10:47:16.359313 20175 krpc-data-stream-mgr.cc:354] Sender 127.0.0.1 
> timed out waiting for receiver fragment instance: 
> ef4a5dc32a6565bd:a8720b850007, dest node: 5
> I0316 10:47:16.359345 20175 rpcz_store.cc:265] Call 
> impala.DataStreamService.TransmitData from 127.0.0.1:40030 (request call id 
> 14881) took 120182ms. Request Metrics: {}
> I0316 10:47:16.359380 20175 krpc-data-stream-mgr.cc:354] Sender 127.0.0.1 
> timed out waiting for receiver fragment instance: 
> d148d83e11a4603d:54dc35f70004, dest node: 3
> I0316 10:47:16.359395 20175 rpcz_store.cc:265] Call 
> impala.DataStreamService.TransmitData from 127.0.0.1:40030 (request call id 
> 14880) took 123097ms. Request Metrics: {}
> ... various messages ...
> I0316 10:47:56.364990 20154 kudu-util.h:108] Cancel() RPC failed: Timed out: 
> CancelQueryFInstances RPC to 127.0.0.1:27000 timed out after 10.000s (SENT)
> ... various messages ...
> W0316 10:48:15.056421 20150 rpcz_store.cc:251] Call 
> impala.ControlService.CancelQueryFInstances from 127.0.0.1:40912 (request 
> call id 202) took 48695ms (client timeout 1).
> W0316 10:48:15.056473 20150 rpcz_store.cc:255] Trace:
> 0316 10:47:26.361265 (+ 0us) impala-service-pool.cc:165] Inserting onto call 
> queue
> 0316 10:47:26.361285 (+ 20us) impala-service-pool.cc:245] Handling call
> 0316 10:48:15.056398 (+48695113us) inbound_call.cc:162] Queueing success 
> response
> Metrics: {}
> I0316 10:48:15.057087 20139 connection.cc:584] Got response to call id 202 
> after client already timed out or cancelled{noformat}
> So far, this has only happened on s3. The system load at the time is not 
> higher than normal. If anything it is lower than normal. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8316) Update re2 to avoid lock contention

2019-03-19 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8316.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Update re2 to avoid lock contention
> ---
>
> Key: IMPALA-8316
> URL: https://issues.apache.org/jira/browse/IMPALA-8316
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>  Labels: perf
> Fix For: Impala 3.3.0
>
>
> I ran the following test query and found that it spent a lot of time in lock 
> contention within the re2 library:
> {code:java}
> select sum(l_linenumber) from item_20x
> where regexp_extract(l_shipinstruct, '.*E', 0) like '%E';
> {code}
> I think this lock contention would happen on any regex that involves 
> backtracking. This was fixed in the re2 library upstream in 
> https://github.com/google/re2/commit/eb00dfdd82015be22086cacc6bf830f72a10e2bc#diff-a60a8d25ed15adf68b94c85775fd3cf7
> We should consider upgrading re2 to the latest release, or if not that, at 
> least cherry-picking this perf fix.
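
For anyone who wants to poke at this outside of Impala, here is a minimal standalone
repro sketch (hypothetical code, not from the Impala tree; the pattern mirrors the
regexp_extract() call in the query above, and the thread and iteration counts are
arbitrary):

{code}
// Hypothetical repro: many threads matching against one shared, pre-compiled
// RE2 object. With an affected re2 version the matches serialize on re2's
// internal mutex; with the upstream fix (or per-thread RE2 objects) the same
// program should scale roughly with the core count.
// Build with something like: g++ -O2 -std=c++11 repro.cc -lre2 -lpthread
#include <re2/re2.h>

#include <string>
#include <thread>
#include <vector>

int main() {
  RE2 pattern(".*E");  // same pattern as the regexp_extract() call above
  const std::string text = "SOME SHIPPING INSTRUCTION TEXT E";

  std::vector<std::thread> threads;
  for (int i = 0; i < 16; ++i) {
    threads.emplace_back([&pattern, &text]() {
      for (int j = 0; j < 1000000; ++j) {
        RE2::PartialMatch(text, pattern);
      }
    });
  }
  for (auto& t : threads) t.join();
  return 0;
}
{code}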



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-5393) Regexp should use THREAD_LOCAL context rather than FRAGMENT_LOCAL

2019-03-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-5393.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Regexp should use THREAD_LOCAL context rather than FRAGMENT_LOCAL
> -
>
> Key: IMPALA-5393
> URL: https://issues.apache.org/jira/browse/IMPALA-5393
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Doug Cameron
>Assignee: Todd Lipcon
>Priority: Minor
> Fix For: Impala 3.3.0
>
>
> The RE2 library uses mutex locking around some internal state structures. 
> This causes severe lock contention and a lack of CPU scaling in the regexp 
> string functions.
> Switching to a THREAD_LOCAL context will remove the contention.
> We could add a query option to choose between FRAGMENT_LOCAL and THREAD_LOCAL, 
> but that seems like overkill since the context is not huge.
>  
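
To make the FRAGMENT_LOCAL vs THREAD_LOCAL distinction concrete, here is a small
standalone sketch of the general pattern (illustrative only, not Impala's actual
ScalarExpr/FunctionContext code; the struct and function names are made up):

{code}
#include <re2/re2.h>

#include <memory>
#include <mutex>
#include <string>

// FRAGMENT_LOCAL-style: one context shared by every execution thread in the
// fragment. In this sketch the shared state is guarded by a mutex, so every
// match serializes on it -- the same kind of contention the issue describes.
struct FragmentLocalCtx {
  std::mutex lock;
  std::unique_ptr<RE2> re;
};

bool MatchFragmentLocal(FragmentLocalCtx* ctx, const std::string& s) {
  std::lock_guard<std::mutex> guard(ctx->lock);
  return RE2::PartialMatch(s, *ctx->re);
}

// THREAD_LOCAL-style: each execution thread lazily builds its own context,
// so matches proceed with no cross-thread synchronization at all.
bool MatchThreadLocal(const std::string& pattern, const std::string& s) {
  thread_local std::unique_ptr<RE2> re;
  if (re == nullptr || re->pattern() != pattern) {
    re = std::unique_ptr<RE2>(new RE2(pattern));
  }
  return RE2::PartialMatch(s, *re);
}
{code}

The price of the THREAD_LOCAL variant is one extra compiled pattern per execution
thread, which is consistent with the point above that the context is not huge.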



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8283) Creating a Kudu table doesn't take the primary key order specified

2019-03-18 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved IMPALA-8283.
-
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Creating a Kudu table doesn't take the primary key order specified
> --
>
> Key: IMPALA-8283
> URL: https://issues.apache.org/jira/browse/IMPALA-8283
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Brian Hausmann
>Assignee: Todd Lipcon
>Priority: Major
>  Labels: impala, impala-kudu, kudu
> Fix For: Impala 3.3.0
>
>
> *Example 1:*
> When trying to create a new Kudu table from an existing Kudu table with a 
> different primary key order, Impala takes the primary key structure from the 
> existing Kudu table instead of the new key order specified in the create 
> table statement.
> existingtable: (a string, b string, c string, d boolean) primary key (a, b, 
> c) partition by hash (a) stored as kudu
>  
> {code:java}
> create table newtable primary key (b, a, c) partition by hash (a) stored as 
> kudu as select * from existingtable
> {code}
>  
> Result: newtable is created with the same primary key order (a, b, c) as 
> existingtable instead of the specified (b, a, c) order.
>  
> The workaround for this was to create an empty table with the correct key 
> structure and field order, and then insert the data into it.
>  
> *Example 2:*
> The issue also exists when the schema declares the columns in a different 
> order than the specified primary key.
> {code:java}
> Create table newtable2 (a string, b string, c string, d boolean) primary key 
> (b, a , c) partition by hash(a) stored as kudu{code}
> Result: newtable2 has a PK order of (a, b, c) instead of the specified (b, a, c), 
> and no error is given.
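
As a reference point for the workaround above, here is a rough sketch of what the
"correct key structure and field order" corresponds to at the Kudu C++ client layer,
with the columns listed in the requested key order (b, a, c) first (hypothetical
illustration assuming the public KuduSchemaBuilder API; this is not the Impala
catalog code):

{code}
// Hypothetical sketch: build a Kudu schema whose column order and primary key
// order both follow the user's requested key order (b, a, c), with the non-key
// column d last.
#include <kudu/client/client.h>

#include <string>
#include <vector>

using kudu::client::KuduColumnSchema;
using kudu::client::KuduSchema;
using kudu::client::KuduSchemaBuilder;

kudu::Status BuildSchemaWithRequestedKeyOrder(KuduSchema* schema) {
  KuduSchemaBuilder b;
  b.AddColumn("b")->Type(KuduColumnSchema::STRING)->NotNull();
  b.AddColumn("a")->Type(KuduColumnSchema::STRING)->NotNull();
  b.AddColumn("c")->Type(KuduColumnSchema::STRING)->NotNull();
  b.AddColumn("d")->Type(KuduColumnSchema::BOOL);
  // Key columns are passed explicitly; their order here ends up as the
  // table's primary key order.
  b.SetPrimaryKey(std::vector<std::string>({"b", "a", "c"}));
  return b.Build(schema);
}
{code}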



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-5393) Regexp should use THREAD_LOCAL context rather than FRAGMENT_LOCAL

2019-03-16 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned IMPALA-5393:
---

Assignee: Todd Lipcon  (was: John Sherman)

> Regexp should use THREAD_LOCAL context rather than FRAGMENT_LOCAL
> -
>
> Key: IMPALA-5393
> URL: https://issues.apache.org/jira/browse/IMPALA-5393
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Doug Cameron
>Assignee: Todd Lipcon
>Priority: Minor
>
> The RE2 library uses mutex locking around some internal state structures. 
> This causes severe lock contention and a lack of CPU scaling in the regexp 
> string functions.
> Switching to a THREAD_LOCAL context will remove the contention.
> We could add a query option to choose between FRAGMENT_LOCAL and THREAD_LOCAL, 
> but that seems like overkill since the context is not huge.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org


