[jira] [Created] (HIVE-27971) Sentence does not make sense
Sebb created HIVE-27971:

Summary: Sentence does not make sense
Key: HIVE-27971
URL: https://issues.apache.org/jira/browse/HIVE-27971
Project: Hive
Issue Type: Bug
Components: Documentation
Reporter: Sebb

The main page https://hive.apache.org/ has the sentence:

"Hive provides full acid support for ORC tables out and insert only support to all other formats."

This does not make sense to me.
Also, I think 'acid' should be 'ACID'.
What does ORC stand for?
It would help to link acronyms such as ACID, ORC to a glossary (or provide suitable hover text).

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HIVE-27970) Single Hive table partitioning to multiple storage system- (e.g, S3 and HDFS)
[ https://issues.apache.org/jira/browse/HIVE-27970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801006#comment-17801006 ]

Butao Zhang commented on HIVE-27970:

For Hive and this issue, {*}across different filesystems{*} (hdfs://ns1, s3a://bucket1) and {*}different path schema{*} (hdfs://ns1, hdfs://ns2) are the same thing. If you want to store (insert) partition data into a {*}different path schema{*} (hdfs://ns1, hdfs://ns2), you will encounter the same issue.

> Single Hive table partitioning to multiple storage system- (e.g, S3 and HDFS)
>
> Key: HIVE-27970
> URL: https://issues.apache.org/jira/browse/HIVE-27970
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 3.1.2
> Reporter: zhixingheyi-tian
> Priority: Major
>
> Single Hive/Datasource table partitioning to multiple storage systems (e.g., S3 and HDFS).
> For a Hive table:
>
> {code:java}
> CREATE TABLE htable (a string, b string) PARTITIONED BY (p string)
> LOCATION "hdfs://{cluster}/user/hadoop/htable/";
> ALTER TABLE htable ADD PARTITION (p='p1')
> LOCATION 's3a://{bucketname}/usr/hive/warehouse/htable/p=p1';
> {code}
>
> When inserting into htable, or on INSERT OVERWRITE htable, new data for “p=p1” is written to the storage of the table location. This does not meet the requirements.
> Is there any best practice? Or is there a plan to support this feature?
> Thanks!
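A minimal sketch of the distinction Butao Zhang draws in this message, assuming that what matters is whether the partition location and the table location share the same URI scheme and authority. This is an illustration only, not Hive's implementation; the helper names `filesystem_key` and `same_filesystem` and the example paths are hypothetical.

```python
from typing import Tuple
from urllib.parse import urlsplit


def filesystem_key(location: str) -> Tuple[str, str]:
    """Return (scheme, authority) for a location URI, e.g. ('hdfs', 'ns1')."""
    parts = urlsplit(location)
    return (parts.scheme, parts.netloc)


def same_filesystem(table_location: str, partition_location: str) -> bool:
    """True when both locations share the same scheme and authority."""
    return filesystem_key(table_location) == filesystem_key(partition_location)


if __name__ == "__main__":
    # Hypothetical locations modelled on the ones named in the thread.
    table = "hdfs://ns1/user/hadoop/htable/"
    partitions = [
        "hdfs://ns1/user/hadoop/htable/p=p0",            # same namespace
        "hdfs://ns2/user/hadoop/htable/p=p1",            # other HDFS namespace
        "s3a://bucket1/usr/hive/warehouse/htable/p=p1",  # other filesystem
    ]
    for partition in partitions:
        print(partition, same_filesystem(table, partition))
```

Under this assumption, a partition on hdfs://ns2 and a partition on s3a://bucket1 fall into the same case: neither shares the table's scheme and authority, which is why the comment treats them as one problem.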
[jira] [Commented] (HIVE-27970) Single Hive table partitioning to multiple storage system- (e.g, S3 and HDFS)
[ https://issues.apache.org/jira/browse/HIVE-27970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17801002#comment-17801002 ]

zhixingheyi-tian commented on HIVE-27970:

Hi [~zhangbutao], I think the key challenge is working across different filesystems. Thanks.
[jira] [Comment Edited] (HIVE-27970) Single Hive table partitioning to multiple storage system- (e.g, S3 and HDFS)
[ https://issues.apache.org/jira/browse/HIVE-27970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800995#comment-17800995 ]

Butao Zhang edited comment on HIVE-27970 at 12/28/23 1:08 PM:

I think this is related to the ticket https://issues.apache.org/jira/browse/HIVE-1707. Before HIVE-1707, you could insert data into the partition's own schema path instead of the table's schema.

[https://community.cloudera.com/t5/Community-Articles/Hive-partitions-on-different-Namespaces-in-a-Federated/ta-p/248041] is a blog post about how to store partition data in different schemas (hdfs://ns1, hdfs://ns2, s3a://bucket1). Note that this blog post does not fix your issue at its root; it is just a workaround. Just FYI.

But I think your use case is reasonable, especially in large HDFS clusters. We need to fix this issue at its root.

was (Author: zhangbutao):

I think this is related to the ticket https://issues.apache.org/jira/browse/HIVE-1707. Before HIVE-1707, you could insert data into the partition's own schema path instead of the table's schema.

[https://community.cloudera.com/t5/Community-Articles/Hive-partitions-on-different-Namespaces-in-a-Federated/ta-p/248041] is a blog post about how to store partition data in different schemas (hdfs://ns1, hdfs://ns2). Note that this blog post does not fix your issue at its root; it is just a workaround. Just FYI.

But I think your use case is reasonable, especially in large HDFS clusters. We need to fix this issue at its root.
[jira] [Commented] (HIVE-27970) Single Hive table partitioning to multiple storage system- (e.g, S3 and HDFS)
[ https://issues.apache.org/jira/browse/HIVE-27970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17800995#comment-17800995 ]

Butao Zhang commented on HIVE-27970:

I think this is related to the ticket https://issues.apache.org/jira/browse/HIVE-1707. Before HIVE-1707, you could insert data into the partition's own schema path instead of the table's schema.

[https://community.cloudera.com/t5/Community-Articles/Hive-partitions-on-different-Namespaces-in-a-Federated/ta-p/248041] is a blog post about how to store partition data in different schemas (hdfs://ns1, hdfs://ns2). Note that this blog post does not fix your issue at its root; it is just a workaround. Just FYI.

But I think your use case is reasonable, especially in large HDFS clusters. We need to fix this issue at its root.