[jira] [Created] (HIVE-28366) Iceberg: Concurrent Insert and IOW produce incorrect result

2024-07-10 Thread Denys Kuzmenko (Jira)
Denys Kuzmenko created HIVE-28366:
-

             Summary: Iceberg: Concurrent Insert and IOW produce incorrect result
                 Key: HIVE-28366
                 URL: https://issues.apache.org/jira/browse/HIVE-28366
             Project: Hive
          Issue Type: Bug
          Components: Iceberg integration
    Affects Versions: 4.0.0
            Reporter: Denys Kuzmenko


1. Create a table and insert some data:
{code}
create table ice_t (i int, p int) partitioned by spec (truncate(10, i)) stored by iceberg;

insert into ice_t values (1, 1), (2, 2);
insert into ice_t values (10, 10), (20, 20);
insert into ice_t values (40, 40), (30, 30);
{code}
2. Then concurrently execute the following jobs:
Job 1:
{code}
insert into ice_t select i*100, p*100 from ice_t;
{code}
Job 2:
{code}
insert overwrite table ice_t select i+1, p+1 from ice_t;
{code}
If Job 1 finishes first, Job 2 still succeeds for me, and the table then 
contains the following:
{code}
2      2
3      3
11     11
21     21
31     31
41     41
100    100
200    200
1000   1000
2000   2000
3000   3000
4000   4000
{code}
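To make it easier to see why this content is incorrect, here is a minimal Python sketch that simulates the steps above on plain lists. It models table contents only, not Hive or Iceberg internals. The function names are hypothetical, and treating the overwrite as a replacement of the whole table is an assumption made for illustration. Under that model the reported content matches neither serial order of the two jobs: it is what results if Job 2 replaces only the rows it read and leaves the rows committed by Job 1 untouched.

```python
# Simulation of the reproduction steps. Assumption: "insert overwrite" is
# modelled as replacing the whole table; real engine behaviour may differ.

INITIAL = [(1, 1), (2, 2), (10, 10), (20, 20), (40, 40), (30, 30)]


def job1_rows(snapshot):
    """Rows Job 1 appends: select i*100, p*100 from the snapshot it read."""
    return [(i * 100, p * 100) for i, p in snapshot]


def job2_rows(snapshot):
    """Rows Job 2 writes: select i+1, p+1 from the snapshot it read."""
    return [(i + 1, p + 1) for i, p in snapshot]


def serial_job1_then_job2(table):
    """Job 1 fully completes before Job 2 starts reading."""
    after_job1 = table + job1_rows(table)   # plain insert appends
    return sorted(job2_rows(after_job1))    # overwrite replaces everything


def serial_job2_then_job1(table):
    """Job 2 fully completes before Job 1 starts reading."""
    after_job2 = job2_rows(table)
    return sorted(after_job2 + job1_rows(after_job2))


def reported_interleaving(table):
    """Both jobs read the initial snapshot and Job 1 commits first.

    Assumption: Job 2 then replaces only the rows it read, so the rows
    appended by Job 1 survive without being incremented.
    """
    return sorted(job2_rows(table) + job1_rows(table))


if __name__ == "__main__":
    for i, p in reported_interleaving(INITIAL):
        print(i, p)
```

In this model a serial run of Job 1 followed by Job 2 would contain rows such as (101, 101), and the reverse order would contain (200, 200) but not (100, 100), so neither matches the content listed above.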



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-28364) Iceberg: Upgrade iceberg version to 1.5.2

2024-07-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-28364:
--
Labels: pull-request-available  (was: )

> Iceberg: Upgrade iceberg version to 1.5.2
> -
>
> Key: HIVE-28364
> URL: https://issues.apache.org/jira/browse/HIVE-28364
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Work started] (HIVE-28364) Iceberg: Upgrade iceberg version to 1.5.2

2024-07-10 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-28364 started by Denys Kuzmenko.
-
> Iceberg: Upgrade iceberg version to 1.5.2
> -
>
> Key: HIVE-28364
> URL: https://issues.apache.org/jira/browse/HIVE-28364
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>






[jira] [Assigned] (HIVE-28364) Iceberg: Upgrade iceberg version to 1.5.2

2024-07-10 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-28364:
-

Assignee: Denys Kuzmenko

> Iceberg: Upgrade iceberg version to 1.5.2
> -
>
> Key: HIVE-28364
> URL: https://issues.apache.org/jira/browse/HIVE-28364
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>






[jira] [Updated] (HIVE-28341) Iceberg: Change Major QB Full Table Compaction to compact partitions in parallel

2024-07-10 Thread Dmitriy Fingerman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Fingerman updated HIVE-28341:
-
Summary: Iceberg: Change Major QB Full Table Compaction to compact 
partitions in parallel  (was: Iceberg: Change Major QB Full Table Compaction to 
compact partition by partition)

> Iceberg: Change Major QB Full Table Compaction to compact partitions in 
> parallel
> 
>
> Key: HIVE-28341
> URL: https://issues.apache.org/jira/browse/HIVE-28341
> Project: Hive
>  Issue Type: Task
>  Components: Hive, Iceberg integration
>Reporter: Dmitriy Fingerman
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: hive, iceberg, pull-request-available
>
> Currently, Major compaction compacts a whole table in one step. If a table is 
> partitioned and holds a lot of data, this operation can take a long time and 
> risks write conflicts at the commit stage. This can be improved to work 
> partition by partition. Also, each partition will then get one snapshot 
> instead of the 2 snapshots (truncate+IOW) created now when compacting the 
> whole table in one step.





[jira] [Resolved] (HIVE-28365) IGNORE

2024-07-10 Thread Cheng Pan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Pan resolved HIVE-28365.
--
Fix Version/s: Not Applicable
   Resolution: Not A Problem

> IGNORE
> --
>
> Key: HIVE-28365
> URL: https://issues.apache.org/jira/browse/HIVE-28365
> Project: Hive
>  Issue Type: Wish
>Reporter: Cheng Pan
>Priority: Major
> Fix For: Not Applicable
>
>






[jira] [Created] (HIVE-28365) Don't fail CI when no tests changed

2024-07-10 Thread Cheng Pan (Jira)
Cheng Pan created HIVE-28365:


             Summary: Don't fail CI when no tests changed
                 Key: HIVE-28365
                 URL: https://issues.apache.org/jira/browse/HIVE-28365
             Project: Hive
          Issue Type: Wish
            Reporter: Cheng Pan








[jira] [Updated] (HIVE-28365) IGNORE

2024-07-10 Thread Cheng Pan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Pan updated HIVE-28365:
-
Summary: IGNORE  (was: Don't fail CI when no tests changed)

> IGNORE
> --
>
> Key: HIVE-28365
> URL: https://issues.apache.org/jira/browse/HIVE-28365
> Project: Hive
>  Issue Type: Wish
>Reporter: Cheng Pan
>Priority: Major
>






[jira] [Created] (HIVE-28364) Iceberg: Upgrade iceberg version to 1.5.2

2024-07-10 Thread Denys Kuzmenko (Jira)
Denys Kuzmenko created HIVE-28364:
-

             Summary: Iceberg: Upgrade iceberg version to 1.5.2
                 Key: HIVE-28364
                 URL: https://issues.apache.org/jira/browse/HIVE-28364
             Project: Hive
          Issue Type: Task
            Reporter: Denys Kuzmenko








[jira] [Commented] (HIVE-28354) Rename NegativeLlapCliDriver to NegativeLlapCliConfig

2024-07-10 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-28354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17864524#comment-17864524
 ] 

Ayush Saxena commented on HIVE-28354:
-

Committed to master.
Thanks [~InvisibleProgrammer] for the contribution & [~abstractdog] for the 
review!

> Rename NegativeLlapCliDriver to NegativeLlapCliConfig
> -
>
> Key: HIVE-28354
> URL: https://issues.apache.org/jira/browse/HIVE-28354
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: Zsolt Miskolczi
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 4.1.0
>
>
> https://github.com/apache/hive/blob/74b9c88aced9407351f6635769a4bd48214fca1e/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java#L364
> This is a config (extending an abstract one), not a driver; rename it to 
> avoid confusion.





[jira] [Resolved] (HIVE-28354) Rename NegativeLlapCliDriver to NegativeLlapCliConfig

2024-07-10 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HIVE-28354.
-
Resolution: Fixed

> Rename NegativeLlapCliDriver to NegativeLlapCliConfig
> -
>
> Key: HIVE-28354
> URL: https://issues.apache.org/jira/browse/HIVE-28354
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: Zsolt Miskolczi
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 4.1.0
>
>
> https://github.com/apache/hive/blob/74b9c88aced9407351f6635769a4bd48214fca1e/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CliConfigs.java#L364
> This is a config (extending an abstract one), not a driver; rename it to 
> avoid confusion.





[jira] [Resolved] (HIVE-28316) The documentation provides an ambiguous explanation regarding the mutually exclusive nature of `STORED BY` and `STORED AS`

2024-07-10 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng resolved HIVE-28316.

Fix Version/s: Not Applicable
   Resolution: Fixed

Updated the document. Thank you for the issue report, [~linghengqian]!

> The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of `STORED BY` and `STORED AS`
> --
>
> Key: HIVE-28316
> URL: https://issues.apache.org/jira/browse/HIVE-28316
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Qiheng He
>Assignee: Zhihua Deng
>Priority: Major
> Fix For: Not Applicable
>
>
> - The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of {*}STORED BY{*} and {*}STORED AS{*}.
> - As mentioned on 
> https://cwiki.apache.org/confluence/display/Hive/StorageHandlers , when the 
> {*}CREATE TABLE{*} statement specifies {*}STORED BY{*}, it should not also 
> specify {*}STORED AS{*}. The content in question is as follows.
> {code:bash}
> When STORED BY is specified, then row_format (DELIMITED or SERDE) and STORED 
> AS cannot be specified. Optional SERDEPROPERTIES can be specified as part of 
> the STORED BY clause and will be passed to the serde provided by the storage 
> handler.
> See CREATE TABLE and Row Format, Storage Format, and SerDe for more 
> information.
> Example:
> CREATE TABLE hbase_table_1(key int, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping" = "cf:string",
> "hbase.table.name" = "hbase_table_0"
> );
> {code}
> - This is similarly reflected in the documentation at 
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL , where 
> {*}|{*} separates {*}STORED BY{*} from {*}STORED AS{*}, indicating their 
> distinct usage and mutual exclusivity.
> {code:bash}
> [
>[ROW FORMAT row_format] 
>[STORED AS file_format]
>  | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  
> -- (Note: Available in Hive 0.6.0 and later)
> ]
> {code}
> - However, this contradicts the information provided in the Hive-Iceberg 
> Integration documentation at 
> https://cwiki.apache.org/confluence/display/Hive/Hive-Iceberg+Integration , 
> which explicitly gives examples demonstrating that {*}STORED BY{*} can 
> coexist with {*}STORED AS{*}. This creates an ambiguous interpretation. 
> {code:bash}
> The iceberg table currently supports three file formats: PARQUET, ORC & AVRO. 
> The default file format is Parquet. The file format can be explicitly 
> provided by using STORED AS  while creating the table
> Example-1:
> CREATE TABLE ORC_TABLE (ID INT) STORED BY ICEBERG STORED AS ORC;
> {code}
> - Earlier discussion of this topic can be found at 
> https://github.com/apache/shardingsphere/pull/31526 .


