[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-04 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=778392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-778392
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 04/Jun/22 07:04
Start Date: 04/Jun/22 07:04
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged PR #3281:
URL: https://github.com/apache/hive/pull/3281




Issue Time Tracking
---

Worklog Id: (was: 778392)
Time Spent: 13h  (was: 12h 50m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777339
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:22
Start Date: 02/Jun/22 08:22
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887629211


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   Should we do a transactional check for materialised views?



##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777339)
Time Spent: 12h 50m  (was: 12h 40m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777338
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 08:22
Start Date: 02/Jun/22 08:22
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887697843


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   Since suffixing must be done for transactional MVs, I added a condition. Updated.





Issue Time Tracking
---

Worklog Id: (was: 777338)
Time Spent: 12h 40m  (was: 12.5h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777300&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777300
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:00
Start Date: 02/Jun/22 07:00
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887625663


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, 
ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   Updated.



##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)

       destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
       if (destTableIsTransactional) {
+        isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+        boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+        isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+        if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+          destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+          acidOperation = getAcidType(dest);
+          isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+          boolean enableSuffixing = HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+                  || HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+          if (isDirectInsert || isMmTable) {
+            destinationPath = getCTASDestinationTableLocation(tblDesc, enableSuffixing);
+            // Setting the location so that metadata transformers
+            // does not change the location later while creating the table.
+            tblDesc.setLocation(destinationPath.toString());
+            // Property SOFT_DELETE_TABLE needs to be added to indicate that suffixing is used.
+            if (enableSuffixing && tblDesc.getLocation().matches("(.*)" + SOFT_DELETE_TABLE_PATTERN)) {

Review Comment:
   Updated.
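
For readers following the `matches("(.*)" + SOFT_DELETE_TABLE_PATTERN)` check in the hunk above: `String.matches` only succeeds when the whole string matches the regular expression, which is presumably why the leading `(.*)` is there. The sketch below illustrates this with plain strings. The pattern value `\.v\d+` and the sample paths are assumptions made for illustration; the real constant should be checked in the Hive source.

```java
class SoftDeletePatternDemo {
  // Assumed value, for illustration only: ".v" followed by one or more digits.
  static final String SOFT_DELETE_TABLE_PATTERN = "\\.v\\d+";

  // Hypothetical helper mirroring the call shape used in the diff.
  static boolean hasSoftDeleteSuffix(String location) {
    // String.matches is anchored at both ends, so "(.*)" has to absorb the leading path.
    return location.matches("(.*)" + SOFT_DELETE_TABLE_PATTERN);
  }

  public static void main(String[] args) {
    // Suffixed location: the whole string matches.
    System.out.println(hasSoftDeleteSuffix("/warehouse/db.db/t1.v0000042")); // true
    // Unsuffixed location: no ".v<digits>" at the end.
    System.out.println(hasSoftDeleteSuffix("/warehouse/db.db/t1"));          // false
    // Without the "(.*)" prefix the same suffixed path does not match.
    System.out.println("/warehouse/db.db/t1.v0000042".matches(SOFT_DELETE_TABLE_PATTERN)); // false
  }
}
```

Under these assumptions, a location that was supplied explicitly and does not end in the suffix would not be flagged as soft-delete, even when suffixing is enabled.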





Issue Time Tracking
---

Worklog Id: (was: 777300)
Time Spent: 11h 40m  (was: 11.5h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777307&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777307
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:04
Start Date: 02/Jun/22 07:04
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887629211


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   Should we do a transactional check for materialised views?





Issue Time Tracking
---

Worklog Id: (was: 777307)
Time Spent: 12.5h  (was: 12h 20m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777305&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777305
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:03
Start Date: 02/Jun/22 07:03
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887627534


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777305)
Time Spent: 12h 10m  (was: 12h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777301&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777301
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:00
Start Date: 02/Jun/22 07:00
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887625817


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
     return output;
   }

+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, boolean enableSuffixing) throws SemanticException {
+    Path location;
+    String suffix = "";
+    try {
+      // When location is specified, suffix is not added
+      if (tblDesc.getLocation() == null) {
+        String protoName = tblDesc.getDbTableName();
+        String[] names = Utilities.getDbTableName(protoName);
+        if (enableSuffixing) {
+          long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+          suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
+        }
+        if (!db.databaseExists(names[0])) {
+          throw new SemanticException("ERROR: The database " + names[0] + " does not exist.");
+        }
+
+        Warehouse wh = new Warehouse(conf);
+        location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1] + suffix, false);
+      } else {
+        location = new Path(tblDesc.getLocation());
+      }
+
+      // Handle table translation
+      // Property modifications of the table is handled later.
+      // We are interested in the location if it has changed
+      // due to table translation.
+      Table tbl = tblDesc.toTable(conf);
+      tbl = db.getTranslateTableDryrun(tbl.getTTable());

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777301)
Time Spent: 11h 50m  (was: 11h 40m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777298&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777298
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 06:59
Start Date: 02/Jun/22 06:59
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887624982


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   Renamed to createTableOrMVUseSuffix. Updated.





Issue Time Tracking
---

Worklog Id: (was: 777298)
Time Spent: 11h 20m  (was: 11h 10m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777306&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777306
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:03
Start Date: 02/Jun/22 07:03
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887628060


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())

Review Comment:
   Updated and added a new function called "getTableOrMVSuffix"





Issue Time Tracking
---

Worklog Id: (was: 777306)
Time Spent: 12h 20m  (was: 12h 10m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777304
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 07:02
Start Date: 02/Jun/22 07:02
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887627260


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {
+Path location;
+String suffix = "";
+try {
+  // When location is specified, suffix is not added
+  if (tblDesc.getLocation() == null) {

Review Comment:
   The location check now happens at the point where suffixing is decided. Updated.
   
https://github.com/apache/hive/pull/3281/files#diff-d4b1a32bbbd9e283893a6b52854c7aeb3e356a1ba1add2c4107e52901ca268f9R7616-R7618
   



##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777304)
Time Spent: 12h  (was: 11h 50m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-06-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=777299&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-777299
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 02/Jun/22 06:59
Start Date: 02/Jun/22 06:59
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r887625106


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> 
ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);

Review Comment:
   Updated.



##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -8229,9 +8299,17 @@ private void handleLineage(LoadTableDesc ltd, Operator 
output)
   Path tlocation = null;
   String tName = Utilities.getDbTableName(tableDesc.getDbTableName())[1];
   try {
+String suffix = "";
+if (AcidUtils.isTransactionalTable(destinationTable)) {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 777299)
Time Spent: 11.5h  (was: 11h 20m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776329
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 13:50
Start Date: 31/May/22 13:50
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885665688


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -8229,9 +8299,17 @@ private void handleLineage(LoadTableDesc ltd, Operator output)
       Path tlocation = null;
       String tName = Utilities.getDbTableName(tableDesc.getDbTableName())[1];
       try {
+        String suffix = "";
+        if (AcidUtils.isTransactionalTable(destinationTable)) {
+          boolean useSuffix = Boolean.getBoolean(destinationTable.getProperty(SOFT_DELETE_TABLE));

Review Comment:
   Can we get a NullPointerException when SOFT_DELETE_TABLE is not present?
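
A note on the JDK call involved: `Boolean.getBoolean(String)` does not parse its argument. It looks up a JVM system property with that name, and it returns `false` for a `null` name instead of throwing. `Boolean.parseBoolean(String)` parses the value itself and also returns `false` for `null`. The sketch below shows the difference, using `java.util.Properties` as a stand-in for the table properties. The `soft_delete` key and the helper names are hypothetical, and it assumes the property lookup returns `null` for an absent key, as `Properties.getProperty` does.

```java
import java.util.Properties;

class SoftDeleteFlagDemo {
  // Hypothetical property key, for illustration only.
  static final String SOFT_DELETE_KEY = "soft_delete";

  // Same call shape as in the diff: the property VALUE is used as the NAME of a system property.
  static boolean viaGetBoolean(Properties tableProps) {
    return Boolean.getBoolean(tableProps.getProperty(SOFT_DELETE_KEY));
  }

  // Parses the property value itself; a null value parses to false.
  static boolean viaParseBoolean(Properties tableProps) {
    return Boolean.parseBoolean(tableProps.getProperty(SOFT_DELETE_KEY));
  }

  public static void main(String[] args) {
    Properties absent = new Properties();
    Properties present = new Properties();
    present.setProperty(SOFT_DELETE_KEY, "true");

    // Absent key: neither call throws; both report false.
    System.out.println(viaGetBoolean(absent));
    System.out.println(viaParseBoolean(absent));

    // Present key set to "true": only parseBoolean sees it, unless a system
    // property literally named "true" happens to be set on the JVM.
    System.out.println(viaGetBoolean(present));
    System.out.println(viaParseBoolean(present));
  }
}
```

So with this call shape an absent property should not cause a `NullPointerException`, but a property stored as `"true"` would be read as `false`.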





Issue Time Tracking
---

Worklog Id: (was: 776329)
Time Spent: 11h 10m  (was: 11h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776241&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776241
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:58
Start Date: 31/May/22 10:58
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885497242


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())

Review Comment:
   Could we extract this part into a helper method? It is repeated in multiple places.





Issue Time Tracking
---

Worklog Id: (was: 776241)
Time Spent: 11h  (was: 10h 50m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776239&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776239
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:56
Start Date: 31/May/22 10:56
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885496096


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
     try {
       String protoName = null, suffix = "";
       boolean isExternal = false;
-
+      boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+              || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
       if (pCtx.getQueryProperties().isCTAS()) {
         protoName = pCtx.getCreateTable().getDbTableName();
         isExternal = pCtx.getCreateTable().isExternal();
-
+        if (!isExternal && useSuffix) {
+          long txnId = Optional.ofNullable(pCtx.getContext())
+                  .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+          suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
+        }
       } else if (pCtx.getQueryProperties().isMaterializedView()) {
         protoName = pCtx.getCreateViewDesc().getViewName();
-        boolean createMVUseSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-          || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);

-        if (createMVUseSuffix) {
+        if (useSuffix) {
           long txnId = Optional.ofNullable(pCtx.getContext())
             .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
           suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);

Review Comment:
   shouldn't be populating suffix if txnId=0
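
   A minimal sketch of the guard this comment asks for, written as standalone
   Java rather than as Hive code. The two constants are local stand-ins for the
   identifiers of the same name in the diff (their values here are assumptions),
   and `SuffixGuardSketch` and `suffixFor` are hypothetical names.

```java
import java.util.Locale;

// Standalone sketch; not Hive code. Constant values are assumed stand-ins.
final class SuffixGuardSketch {
  static final String SOFT_DELETE_PATH_SUFFIX = ".v"; // assumed value
  static final String DELTA_DIGITS = "%07d";          // assumed value

  // Returns the location suffix, or "" when suffixing is disabled
  // or no transaction is open (txnId of 0 or less).
  static String suffixFor(boolean useSuffix, long txnId) {
    if (!useSuffix || txnId <= 0) {
      return "";
    }
    return SOFT_DELETE_PATH_SUFFIX + String.format(Locale.ROOT, DELTA_DIGITS, txnId);
  }

  public static void main(String[] args) {
    System.out.println("[" + suffixFor(true, 0L) + "]"); // no suffix without a transaction
    System.out.println(suffixFor(true, 42L));            // zero-padded transaction id
  }
}
```

   With this shape, the `orElse(0L)` fallback in the diff would yield an
   unsuffixed location instead of a suffix built from transaction id 0.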





Issue Time Tracking
---

Worklog Id: (was: 776239)
Time Spent: 10h 50m  (was: 10h 40m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776237&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776237
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:53
Start Date: 31/May/22 10:53
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885492461


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   
   createTableUseSuffix &= AcidUtils.isTablePropertyTransactional(pCtx.getCreateViewDesc().getTblProps());
   





Issue Time Tracking
---

Worklog Id: (was: 776237)
Time Spent: 10h 40m  (was: 10.5h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776236&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776236
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:52
Start Date: 31/May/22 10:52
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885492461


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   
   createTableUseSuffix &= AcidUtils.isTablePropertyTransactional(pCtx.getCreateViewDesc().getTblProps());
   





Issue Time Tracking
---

Worklog Id: (was: 776236)
Time Spent: 10.5h  (was: 10h 20m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776235&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776235
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:52
Start Date: 31/May/22 10:52
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885492461


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
+}
   } else if (pCtx.getQueryProperties().isMaterializedView()) {
 protoName = pCtx.getCreateViewDesc().getViewName();
-boolean createMVUseSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
-  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
-if (createMVUseSuffix) {
+if (useSuffix) {

Review Comment:
   createTableUseSuffix &= AcidUtils.isTablePropertyTransactional(pCtx.getCreateViewDesc().getTblProps());





Issue Time Tracking
---

Worklog Id: (was: 776235)
Time Spent: 10h 20m  (was: 10h 10m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776233&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776233
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:50
Start Date: 31/May/22 10:50
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885410160


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {

Review Comment:
   we should exclude managed non-txn tables as well:
   
   createTableUseSuffix &= AcidUtils.isTablePropertyTransactional(pCtx.getCreateTable().getTblProps());
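
   The suggestion above can be sketched in standalone Java as follows. This is
   not Hive code: `isTransactional` is a simplified, hypothetical stand-in for
   `AcidUtils.isTablePropertyTransactional`, the `"transactional"` key is an
   assumption, and `UseSuffixSketch` and `shouldUseSuffix` are made-up names.

```java
import java.util.Collections;
import java.util.Map;

// Standalone sketch; not Hive code.
final class UseSuffixSketch {
  // Simplified, hypothetical stand-in for AcidUtils.isTablePropertyTransactional.
  static boolean isTransactional(Map<String, String> tblProps) {
    return tblProps != null && "true".equalsIgnoreCase(tblProps.get("transactional"));
  }

  // Suffix only managed (non-external) transactional tables,
  // and only when the configuration enables suffixing.
  static boolean shouldUseSuffix(boolean configEnabled, boolean isExternal,
                                 Map<String, String> tblProps) {
    boolean createTableUseSuffix = configEnabled;
    createTableUseSuffix &= isTransactional(tblProps);
    return !isExternal && createTableUseSuffix;
  }

  public static void main(String[] args) {
    Map<String, String> txn = Collections.singletonMap("transactional", "true");
    Map<String, String> plain = Collections.emptyMap();
    System.out.println(shouldUseSuffix(true, false, txn));   // managed transactional table
    System.out.println(shouldUseSuffix(true, false, plain)); // managed non-transactional table
  }
}
```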
   





Issue Time Tracking
---

Worklog Id: (was: 776233)
Time Spent: 10h 10m  (was: 10h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776230&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776230
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:47
Start Date: 31/May/22 10:47
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885488362


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   please rename to `createTableUseSuffix`





Issue Time Tracking
---

Worklog Id: (was: 776230)
Time Spent: 10h  (was: 9h 50m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776229&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776229
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:46
Start Date: 31/May/22 10:46
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885487387


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {
+  long txnId = Optional.ofNullable(pCtx.getContext())
+  .map(ctx -> ctx.getHiveTxnManager().getCurrentTxnId()).orElse(0L);
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);

Review Comment:
   shouldn't be populating suffix if txnId=0





Issue Time Tracking
---

Worklog Id: (was: 776229)
Time Spent: 9h 50m  (was: 9h 40m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776223&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776223
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:42
Start Date: 31/May/22 10:42
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885410160


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {

Review Comment:
   we should exclude managed non-txn tables as well; this should be applied directly to useSuffix:
   
   AcidUtils.isTablePropertyTransactional(pCtx.getCreateTable().getTblProps())
   





Issue Time Tracking
---

Worklog Id: (was: 776223)
Time Spent: 9h 40m  (was: 9.5h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776221&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776221
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:39
Start Date: 31/May/22 10:39
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885472281


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 
   destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+destinationPath = getCTASDestinationTableLocation(tblDesc, enableSuffixing);
+// Setting the location so that metadata transformers
+// does not change the location later while creating the table.
+tblDesc.setLocation(destinationPath.toString());
+// Property SOFT_DELETE_TABLE needs to be added to indicate that suffixing is used.
+if (enableSuffixing && tblDesc.getLocation().matches("(.*)" + SOFT_DELETE_TABLE_PATTERN)) {

Review Comment:
   can't we handle suffix here:
   
   if (createTableUseSuffix) {
     long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
     suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
   
     destinationPath = new Path(destinationPath + suffix);
     tblDesc.getTblProps().put(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
   }
   tblDesc.setLocation(destinationPath.toString());
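
   A standalone Java sketch of the flow proposed above, where the suffix and
   the marker property are set in one place. It uses plain strings and a map in
   place of Hive's `Path` and table descriptor; the constant values, the class
   name `CtasLocationSketch` and the method `resolveLocation` are assumptions
   for illustration only.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Standalone sketch; not Hive code. Constant values are assumed stand-ins.
final class CtasLocationSketch {
  static final String SOFT_DELETE_PATH_SUFFIX = ".v";    // assumed value
  static final String DELTA_DIGITS = "%07d";             // assumed value
  static final String SOFT_DELETE_TABLE = "soft_delete"; // assumed property key

  // Appends the transaction-based suffix and records the marker property together,
  // so the location never needs to be pattern-matched afterwards.
  static String resolveLocation(String basePath, boolean createTableUseSuffix,
                                long txnId, Map<String, String> tblProps) {
    String destinationPath = basePath;
    if (createTableUseSuffix) {
      String suffix = SOFT_DELETE_PATH_SUFFIX + String.format(Locale.ROOT, DELTA_DIGITS, txnId);
      destinationPath = basePath + suffix;
      tblProps.put(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
    }
    return destinationPath;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    System.out.println(resolveLocation("/warehouse/db.db/t", true, 7L, props));
    System.out.println(props);
  }
}
```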
   





Issue Time Tracking
---

Worklog Id: (was: 776221)
Time Spent: 9.5h  (was: 9h 20m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776220&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776220
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:39
Start Date: 31/May/22 10:39
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885472281


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 
   destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+destinationPath = getCTASDestinationTableLocation(tblDesc, enableSuffixing);
+// Setting the location so that metadata transformers
+// does not change the location later while creating the table.
+tblDesc.setLocation(destinationPath.toString());
+// Property SOFT_DELETE_TABLE needs to be added to indicate that suffixing is used.
+if (enableSuffixing && tblDesc.getLocation().matches("(.*)" + SOFT_DELETE_TABLE_PATTERN)) {

Review Comment:
   can't we handle suffix here:
   
   if (createTableUseSuffix) {
     long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
     suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
   
     tblDesc.getTblProps().put(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
     destinationPath = new Path(destinationPath + suffix);
   }
   tblDesc.setLocation(destinationPath.toString());
   





Issue Time Tracking
---

Worklog Id: (was: 776220)
Time Spent: 9h 20m  (was: 9h 10m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776219&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776219
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:36
Start Date: 31/May/22 10:36
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885479005


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -8229,9 +8299,17 @@ private void handleLineage(LoadTableDesc ltd, Operator output)
   Path tlocation = null;
   String tName = Utilities.getDbTableName(tableDesc.getDbTableName())[1];
   try {
+String suffix = "";
+if (AcidUtils.isTransactionalTable(destinationTable)) {

Review Comment:
   use AcidUtils.isTableSoftDeleteEnabled(); it checks the SOFT_DELETE_TABLE property as well
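
   To illustrate the difference between the two checks, here is a standalone
   Java sketch. It assumes the suggested helper requires both a transactional
   table and a true SOFT_DELETE_TABLE property; the real method's signature and
   logic are not shown in this thread, so `isSoftDeleteEnabled`, the property
   keys and the class name below are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

// Standalone sketch; not Hive code. Property keys are assumed.
final class SoftDeleteCheckSketch {
  // Simplified, hypothetical stand-in for a transactional-table check.
  static boolean isTransactional(Map<String, String> props) {
    return props != null && "true".equalsIgnoreCase(props.get("transactional"));
  }

  // Hypothetical stand-in for the suggested helper: transactional AND soft_delete=true.
  static boolean isSoftDeleteEnabled(Map<String, String> props) {
    return isTransactional(props) && Boolean.parseBoolean(props.get("soft_delete"));
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put("transactional", "true");
    // Transactional, but created without a suffixed location.
    System.out.println(isTransactional(props) + " " + isSoftDeleteEnabled(props));
    props.put("soft_delete", "true");
    System.out.println(isTransactional(props) + " " + isSoftDeleteEnabled(props));
  }
}
```

   Under that assumption, a transactional table created without suffixing would
   pass the first check but not the second, which is why the narrower check
   fits code that derives a suffixed location.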





Issue Time Tracking
---

Worklog Id: (was: 776219)
Time Spent: 9h 10m  (was: 9h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776215
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:34
Start Date: 31/May/22 10:34
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885472281


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 
   destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+destinationPath = getCTASDestinationTableLocation(tblDesc, enableSuffixing);
+// Setting the location so that metadata transformers
+// does not change the location later while creating the table.
+tblDesc.setLocation(destinationPath.toString());
+// Property SOFT_DELETE_TABLE needs to be added to indicate that suffixing is used.
+if (enableSuffixing && tblDesc.getLocation().matches("(.*)" + SOFT_DELETE_TABLE_PATTERN)) {

Review Comment:
   can't we handle suffix here:
   
   if (createTableUseSuffix) {
     long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
     suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
   
     destinationPath = new Path(destinationPath + suffix);
     tblDesc.getTblProps().put(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
   }
   tblDesc.setLocation(destinationPath.toString());
   





Issue Time Tracking
---

Worklog Id: (was: 776215)
Time Spent: 9h  (was: 8h 50m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776214
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:33
Start Date: 31/May/22 10:33
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885476172


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 
   destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   please rename to `createTableUseSuffix`





Issue Time Tracking
---

Worklog Id: (was: 776214)
Time Spent: 8h 50m  (was: 8h 40m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776213&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776213
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:31
Start Date: 31/May/22 10:31
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885472281


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 
   destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+destinationPath = getCTASDestinationTableLocation(tblDesc, enableSuffixing);
+// Setting the location so that metadata transformers
+// does not change the location later while creating the table.
+tblDesc.setLocation(destinationPath.toString());
+// Property SOFT_DELETE_TABLE needs to be added to indicate that suffixing is used.
+if (enableSuffixing && tblDesc.getLocation().matches("(.*)" + SOFT_DELETE_TABLE_PATTERN)) {

Review Comment:
   can't we handle suffix here:
   
   if (enableSuffixing) {
     long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
     suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
     destinationPath = new Path(destinationPath + suffix);
     tblDesc.getTblProps().put(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
   }
   tblDesc.setLocation(destinationPath.toString());
   





Issue Time Tracking
---

Worklog Id: (was: 776213)
Time Spent: 8h 40m  (was: 8.5h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776211&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776211
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:29
Start Date: 31/May/22 10:29
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885472281


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7598,6 +7602,26 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = HiveConf.getBoolVar(conf, 
ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+destinationPath = getCTASDestinationTableLocation(tblDesc, 
enableSuffixing);
+// Setting the location so that metadata transformers
+// does not change the location later while creating the table.
+tblDesc.setLocation(destinationPath.toString());
+// Property SOFT_DELETE_TABLE needs to be added to indicate that 
suffixing is used.
+if (enableSuffixing && tblDesc.getLocation().matches("(.*)" + 
SOFT_DELETE_TABLE_PATTERN)) {

Review Comment:
   Can't we handle the suffix here?
   
   if (enableSuffixing) {
     long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
     suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, txnId);
     destinationPath = new Path(destinationPath + suffix);
     tblDesc.getTblProps().put(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
   }
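
For contrast, the check in the diff under review infers suffixing from the location string itself. A minimal sketch of that pattern test follows; the regular expression is an assumed shape for `SOFT_DELETE_TABLE_PATTERN` and the `hasSuffix` helper is hypothetical.

```java
final class SuffixPatternSketch {
  // Assumed shape of SOFT_DELETE_TABLE_PATTERN; the real constant may differ.
  static final String SOFT_DELETE_TABLE_PATTERN = "\\.v\\d+";

  // Mirrors the check in the diff: String.matches requires the whole
  // location to match, so the suffix must be the last path component's ending.
  static boolean hasSuffix(String location) {
    return location.matches("(.*)" + SOFT_DELETE_TABLE_PATTERN);
  }

  public static void main(String[] args) {
    System.out.println(hasSuffix("/warehouse/db.db/t.v0000042"));
    System.out.println(hasSuffix("/warehouse/db.db/t"));
  }
}
```

A string test like this depends on the final location text, whereas setting the property at the point where the suffix is built does not.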
   





Issue Time Tracking
---

Worklog Id: (was: 776211)
Time Spent: 8.5h  (was: 8h 20m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776208
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 10:25
Start Date: 31/May/22 10:25
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885441938


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {
+Path location;
+String suffix = "";
+try {
+  // When location is specified, suffix is not added
+  if (tblDesc.getLocation() == null) {
+String protoName = tblDesc.getDbTableName();
+String[] names = Utilities.getDbTableName(protoName);
+if (enableSuffixing) {
+  long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
+if (!db.databaseExists(names[0])) {
+  throw new SemanticException("ERROR: The database " + names[0] + " 
does not exist.");
+}
+
+Warehouse wh = new Warehouse(conf);
+location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1] + 
suffix, false);
+  } else {
+location = new Path(tblDesc.getLocation());
+  }
+
+  // Handle table translation
+  // Property modifications of the table is handled later.
+  // We are interested in the location if it has changed
+  // due to table translation.
+  Table tbl = tblDesc.toTable(conf);
+  tbl = db.getTranslateTableDryrun(tbl.getTTable());

Review Comment:
   Shouldn't we pass through the transformers first and do the location check afterwards?
   
   tbl = db.getTranslateTableDryrun(tbl.getTTable());
   if (tbl.getSd().getLocation() == null
       || tbl.getSd().getLocation().isEmpty()) {
     location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1], false);
   } else {
     location = wh.getDnsPath(new Path(tbl.getSd().getLocation()));
   }
   tbl.getSd().setLocation(location.toString());
    





Issue Time Tracking
---

Worklog Id: (was: 776208)
Time Spent: 8h 20m  (was: 8h 10m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776195
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 09:59
Start Date: 31/May/22 09:59
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885441938


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {
+Path location;
+String suffix = "";
+try {
+  // When location is specified, suffix is not added
+  if (tblDesc.getLocation() == null) {
+String protoName = tblDesc.getDbTableName();
+String[] names = Utilities.getDbTableName(protoName);
+if (enableSuffixing) {
+  long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
+if (!db.databaseExists(names[0])) {
+  throw new SemanticException("ERROR: The database " + names[0] + " 
does not exist.");
+}
+
+Warehouse wh = new Warehouse(conf);
+location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1] + 
suffix, false);
+  } else {
+location = new Path(tblDesc.getLocation());
+  }
+
+  // Handle table translation
+  // Property modifications of the table is handled later.
+  // We are interested in the location if it has changed
+  // due to table translation.
+  Table tbl = tblDesc.toTable(conf);
+  tbl = db.getTranslateTableDryrun(tbl.getTTable());

Review Comment:
   Shouldn't we pass through the transformers first and do the location check afterwards?
   
   tbl = db.getTranslateTableDryrun(tbl.getTTable());
   if (tbl.getSd().getLocation() == null
       || tbl.getSd().getLocation().isEmpty()) {
     location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1], false);
   } else {
     location = wh.getDnsPath(new Path(tbl.getSd().getLocation()));
   }
   // Add the suffix here.
   location = new Path(location + getTableSuffix(tbl));
   tbl.getSd().setLocation(location.toString());
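
The ordering proposed above (translate first, then resolve the location, then append the suffix) can be sketched without any Hive types. Everything here is hypothetical: `resolveLocation` and its parameters are illustration only, with strings standing in for `Path` and for the result of the dry-run translation.

```java
final class CtasLocationSketch {
  // Hypothetical helper. translatedLocation is whatever the transformers left
  // on the table (possibly null or empty); defaultTablePath is the warehouse default.
  static String resolveLocation(String translatedLocation,
                                String defaultTablePath,
                                String suffix) {
    String base = (translatedLocation == null || translatedLocation.isEmpty())
        ? defaultTablePath
        : translatedLocation;
    // The suffix is appended once, after the base location is settled.
    return base + suffix;
  }

  public static void main(String[] args) {
    System.out.println(resolveLocation(null, "/wh/db.db/t", ".v0000007"));
    System.out.println(resolveLocation("/ext/t", "/wh/db.db/t", ""));
  }
}
```

Appending the suffix in one place avoids repeating it in both branches of the location check.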
    





Issue Time Tracking
---

Worklog Id: (was: 776195)
Time Spent: 8h 10m  (was: 8h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776194&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776194
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 09:58
Start Date: 31/May/22 09:58
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885441938


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {
+Path location;
+String suffix = "";
+try {
+  // When location is specified, suffix is not added
+  if (tblDesc.getLocation() == null) {
+String protoName = tblDesc.getDbTableName();
+String[] names = Utilities.getDbTableName(protoName);
+if (enableSuffixing) {
+  long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
+if (!db.databaseExists(names[0])) {
+  throw new SemanticException("ERROR: The database " + names[0] + " 
does not exist.");
+}
+
+Warehouse wh = new Warehouse(conf);
+location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1] + 
suffix, false);
+  } else {
+location = new Path(tblDesc.getLocation());
+  }
+
+  // Handle table translation
+  // Property modifications of the table is handled later.
+  // We are interested in the location if it has changed
+  // due to table translation.
+  Table tbl = tblDesc.toTable(conf);
+  tbl = db.getTranslateTableDryrun(tbl.getTTable());

Review Comment:
   Shouldn't we pass through the transformers first and do the location check afterwards?
   
   tbl = db.getTranslateTableDryrun(tbl.getTTable());
   if (tbl.getSd().getLocation() == null
       || tbl.getSd().getLocation().isEmpty()) {
     location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1], false);
   } else {
     location = wh.getDnsPath(new Path(tbl.getSd().getLocation()));
   }
   location = new Path(location + getTableSuffix(tbl));
   tbl.getSd().setLocation(location.toString());
    





Issue Time Tracking
---

Worklog Id: (was: 776194)
Time Spent: 8h  (was: 7h 50m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776192&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776192
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 09:55
Start Date: 31/May/22 09:55
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885441938


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {
+Path location;
+String suffix = "";
+try {
+  // When location is specified, suffix is not added
+  if (tblDesc.getLocation() == null) {
+String protoName = tblDesc.getDbTableName();
+String[] names = Utilities.getDbTableName(protoName);
+if (enableSuffixing) {
+  long txnId = ctx.getHiveTxnManager().getCurrentTxnId();
+  suffix = SOFT_DELETE_PATH_SUFFIX + String.format(DELTA_DIGITS, 
txnId);
+}
+if (!db.databaseExists(names[0])) {
+  throw new SemanticException("ERROR: The database " + names[0] + " 
does not exist.");
+}
+
+Warehouse wh = new Warehouse(conf);
+location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1] + 
suffix, false);
+  } else {
+location = new Path(tblDesc.getLocation());
+  }
+
+  // Handle table translation
+  // Property modifications of the table is handled later.
+  // We are interested in the location if it has changed
+  // due to table translation.
+  Table tbl = tblDesc.toTable(conf);
+  tbl = db.getTranslateTableDryrun(tbl.getTTable());

Review Comment:
   Shouldn't we pass through the transformers first and do the location check afterwards?
   
   tbl = db.getTranslateTableDryrun(tbl.getTTable());
   if (tbl.getSd().getLocation() == null
       || tbl.getSd().getLocation().isEmpty()) {
     location = wh.getDefaultTablePath(db.getDatabase(names[0]), names[1] + suffix, false);
   } else {
     location = wh.getDnsPath(new Path(tbl.getSd().getLocation() + suffix));
   }
   tbl.getSd().setLocation(location.toString());
    





Issue Time Tracking
---

Worklog Id: (was: 776192)
Time Spent: 7h 50m  (was: 7h 40m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776186&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776186
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 09:46
Start Date: 31/May/22 09:46
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885432422


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {
+Path location;
+String suffix = "";
+try {
+  // When location is specified, suffix is not added
+  if (tblDesc.getLocation() == null) {

Review Comment:
   Should we check for an empty location as well?
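
A small sketch of why the question matters: a null-only test treats an empty string as a user-specified location. Both helper names are hypothetical and not part of the PR.

```java
final class LocationCheckSketch {
  // The check as written in the diff: only null counts as "not specified".
  static boolean isUnsetNullOnly(String location) {
    return location == null;
  }

  // The stricter variant the comment asks about: null or empty.
  static boolean isUnset(String location) {
    return location == null || location.isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(isUnsetNullOnly(""));
    System.out.println(isUnset(""));
  }
}
```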





Issue Time Tracking
---

Worklog Id: (was: 776186)
Time Spent: 7h 40m  (was: 7.5h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776182&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776182
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 09:41
Start Date: 31/May/22 09:41
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885428121


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7940,6 +7970,46 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, 
boolean enableSuffixing) throws SemanticException {

Review Comment:
   Could we rename it to `getCtasLocation`, to follow the same naming pattern as 
`getDefaultCtasLocation`?





Issue Time Tracking
---

Worklog Id: (was: 776182)
Time Spent: 7.5h  (was: 7h 20m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776181&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776181
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 09:39
Start Date: 31/May/22 09:39
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885410160


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {

Review Comment:
   We should exclude managed non-transactional tables as well; the check should be 
applied directly to `useSuffix`:
   
   AcidUtils.isTransactionalTable(tbl)
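
One way to read this suggestion is to fold the transactional check into the flag itself. The sketch below is an assumption-laden illustration: `isTransactional` is a simplified stand-in for `AcidUtils.isTransactionalTable`, the `transactional` property key is assumed, and `useSuffix` takes the two configuration flags as plain booleans.

```java
import java.util.HashMap;
import java.util.Map;

final class UseSuffixSketch {
  // Simplified stand-in for AcidUtils.isTransactionalTable; property key assumed.
  static boolean isTransactional(Map<String, String> tblProps) {
    return tblProps != null && "true".equalsIgnoreCase(tblProps.get("transactional"));
  }

  // Suffixing applies only when a flag is on AND the table is a managed,
  // transactional table; external and managed non-transactional tables are excluded.
  static boolean useSuffix(boolean createTableUseSuffix, boolean locklessReadsEnabled,
                           boolean isExternal, Map<String, String> tblProps) {
    return (createTableUseSuffix || locklessReadsEnabled)
        && !isExternal
        && isTransactional(tblProps);
  }

  public static void main(String[] args) {
    Map<String, String> acid = new HashMap<>();
    acid.put("transactional", "true");
    System.out.println(useSuffix(true, false, false, acid));
    System.out.println(useSuffix(true, false, false, new HashMap<>()));
  }
}
```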
   





Issue Time Tracking
---

Worklog Id: (was: 776181)
Time Spent: 7h 20m  (was: 7h 10m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=776168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776168
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 31/May/22 09:24
Start Date: 31/May/22 09:24
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r885410160


##
ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java:
##
@@ -517,17 +517,21 @@ private Path getDefaultCtasLocation(final ParseContext 
pCtx) throws SemanticExce
 try {
   String protoName = null, suffix = "";
   boolean isExternal = false;
-  
+  boolean useSuffix = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+
   if (pCtx.getQueryProperties().isCTAS()) {
 protoName = pCtx.getCreateTable().getDbTableName();
 isExternal = pCtx.getCreateTable().isExternal();
-  
+if (!isExternal && useSuffix) {

Review Comment:
   We should exclude managed non-transactional tables as well:
   
   AcidUtils.isTransactionalTable(tbl)
   





Issue Time Tracking
---

Worklog Id: (was: 776168)
Time Spent: 7h 10m  (was: 7h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775071
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 14:15
Start Date: 26/May/22 14:15
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882716593


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,118 @@
+

Issue Time Tracking
---

Worklog Id: (was: 775071)
Time Spent: 7h  (was: 6h 50m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775070
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 14:14
Start Date: 26/May/22 14:14
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882716324


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,118 @@
+

Issue Time Tracking
---

Worklog Id: (was: 775070)
Time Spent: 6h 50m  (was: 6h 40m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775049&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775049
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 13:04
Start Date: 26/May/22 13:04
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882647109


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java:
##
@@ -347,7 +347,13 @@ public static boolean isNonNativeTable(Table table) {
 if (table == null || table.getParameters() == null) {
   return false;
 }
-return 
(table.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE) != null);
+return isNonNativeTable(table.getParameters());
+  }
+
+  public static boolean isNonNativeTable(Map tblProps) {
+return tblProps.get(
+
org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_STORAGE)
+!= null;

Review Comment:
   I missed that, sorry.





Issue Time Tracking
---

Worklog Id: (was: 775049)
Time Spent: 6h 40m  (was: 6.5h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775048
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 13:02
Start Date: 26/May/22 13:02
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882646180


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,118 @@
+

Issue Time Tracking
---

Worklog Id: (was: 775048)
Time Spent: 6.5h  (was: 6h 20m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775047&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775047
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 13:02
Start Date: 26/May/22 13:02
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882645468


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java:
##
@@ -347,7 +347,13 @@ public static boolean isNonNativeTable(Table table) {
 if (table == null || table.getParameters() == null) {
   return false;
 }
-return 
(table.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE) != null);
+return isNonNativeTable(table.getParameters());
+  }
+
+  public static boolean isNonNativeTable(Map tblProps) {
+return tblProps.get(
+
org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_STORAGE)
+!= null;

Review Comment:
   The function is used here:
   
https://github.com/apache/hive/pull/3281/files#diff-d4b1a32bbbd9e283893a6b52854c7aeb3e356a1ba1add2c4107e52901ca268f9R7599
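
The overload pair in the diff above can be sketched on its own to show why a `Map`-based variant is useful to a caller that only has table properties. This is a rough sketch under assumptions: the nested `Table` class is a hypothetical stand-in for the metastore type, the map is typed as `Map<String, String>`, and the `storage_handler` key is an assumed value for `META_TABLE_STORAGE`.

```java
import java.util.HashMap;
import java.util.Map;

final class NonNativeTableSketch {
  // Assumed value of hive_metastoreConstants.META_TABLE_STORAGE.
  static final String META_TABLE_STORAGE = "storage_handler";

  // Hypothetical stand-in for the metastore Table type.
  static final class Table {
    private final Map<String, String> parameters;
    Table(Map<String, String> parameters) { this.parameters = parameters; }
    Map<String, String> getParameters() { return parameters; }
  }

  // Existing entry point: null-safe, then delegates to the Map overload.
  static boolean isNonNativeTable(Table table) {
    if (table == null || table.getParameters() == null) {
      return false;
    }
    return isNonNativeTable(table.getParameters());
  }

  // New overload: usable when only the table properties are available,
  // for example before the table object exists during CTAS planning.
  static boolean isNonNativeTable(Map<String, String> tblProps) {
    return tblProps.get(META_TABLE_STORAGE) != null;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(META_TABLE_STORAGE, "com.example.SomeStorageHandler");
    System.out.println(isNonNativeTable(props));
    System.out.println(isNonNativeTable(new Table(new HashMap<>())));
  }
}
```

Note that the `Map` overload, as in the diff, does not guard against a null map, so callers are expected to check first.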





Issue Time Tracking
---

Worklog Id: (was: 775047)
Time Spent: 6h 20m  (was: 6h 10m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775044&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775044
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 13:01
Start Date: 26/May/22 13:01
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882645082


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,118 @@
+

Issue Time Tracking
---

Worklog Id: (was: 775044)
Time Spent: 6h 10m  (was: 6h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775043&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775043
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 12:59
Start Date: 26/May/22 12:59
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882643432


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,118 @@
+

Issue Time Tracking
---

Worklog Id: (was: 775043)
Time Spent: 6h  (was: 5h 50m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775040&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775040
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 12:58
Start Date: 26/May/22 12:58
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882642371


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java:
##
@@ -347,7 +347,13 @@ public static boolean isNonNativeTable(Table table) {
 if (table == null || table.getParameters() == null) {
   return false;
 }
-return (table.getParameters().get(hive_metastoreConstants.META_TABLE_STORAGE) != null);
+return isNonNativeTable(table.getParameters());
+  }
+
+  public static boolean isNonNativeTable(Map tblProps) {
+return tblProps.get(
+org.apache.hadoop.hive.metastore.api.hive_metastoreConstants.META_TABLE_STORAGE)
+!= null;

Review Comment:
   Why do we need this change?
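
   For context, the refactoring in this hunk can be sketched as a pair of overloads: the `Table`-based check keeps its null guards and delegates to a check that takes only the table properties, which suits callers that hold just a property map (as the `genFileSinkPlan` hunk elsewhere in this thread does with `tblProps`). This is only a sketch under assumptions: the `Table` class below is a simplified stand-in, and the `"storage_handler"` key is an assumed value for `META_TABLE_STORAGE`, which this thread does not show.

```java
import java.util.HashMap;
import java.util.Map;

public class NonNativeCheckSketch {
  // Assumed value of META_TABLE_STORAGE; the constant's value is not shown in this thread.
  static final String META_TABLE_STORAGE = "storage_handler";

  /** Simplified stand-in for the metastore Table object. */
  static final class Table {
    private final Map<String, String> parameters;

    Table(Map<String, String> parameters) {
      this.parameters = parameters;
    }

    Map<String, String> getParameters() {
      return parameters;
    }
  }

  /** Table-based check: keeps the null guards, then delegates. */
  static boolean isNonNativeTable(Table table) {
    if (table == null || table.getParameters() == null) {
      return false;
    }
    return isNonNativeTable(table.getParameters());
  }

  /** Property-based check for callers that have no Table object yet. */
  static boolean isNonNativeTable(Map<String, String> tblProps) {
    return tblProps.get(META_TABLE_STORAGE) != null;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(META_TABLE_STORAGE, "com.example.SomeStorageHandler"); // hypothetical handler
    System.out.println(isNonNativeTable(props));
    System.out.println(isNonNativeTable(new Table(new HashMap<>())));
  }
}
```

   Note that, as in the hunk, the property-based overload itself does not guard against a null map, so callers are expected to check that first.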





Issue Time Tracking
---

Worklog Id: (was: 775040)
Time Spent: 5h 50m  (was: 5h 40m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775038&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775038
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 12:56
Start Date: 26/May/22 12:56
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882641114


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,118 @@
+

Issue Time Tracking
---

Worklog Id: (was: 775038)
Time Spent: 5h 40m  (was: 5.5h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775037&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775037
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 12:56
Start Date: 26/May/22 12:56
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882640778


##
ql/src/test/queries/clientpositive/ctas_direct_with_specified_locations.q:
##
@@ -0,0 +1,116 @@
+

Issue Time Tracking
---

Worklog Id: (was: 775037)
Time Spent: 5.5h  (was: 5h 20m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=775034&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775034
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 12:55
Start Date: 26/May/22 12:55
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882640206


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,118 @@
+

Issue Time Tracking
---

Worklog Id: (was: 775034)
Time Spent: 5h 20m  (was: 5h 10m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=774965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774965
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 26/May/22 10:26
Start Date: 26/May/22 10:26
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r882532032


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 
   destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = conf.getBoolVar(ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || conf.getBoolVar(ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+destinationPath = getCTASDestinationTableLocation(tblDesc, enableSuffixing);
+// Setting the location so that metadata transformers
+// does not change the location later while creating the table.
+tblDesc.setLocation(destinationPath.toString());

Review Comment:
   I have added a check here that sets the SOFT_DELETE_TABLE property when
   the location is suffixed:
   https://github.com/apache/hive/pull/3281/files#diff-d4b1a32bbbd9e283893a6b52854c7aeb3e356a1ba1add2c4107e52901ca268f9R7614-R7615
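
   As a rough illustration of the guard in the hunk and of the suffix handling described in this comment, here is a minimal, self-contained sketch. It is not Hive code: `CtasSuffixSketch`, `writesToFinalLocation`, `resolveLocation`, the property key string and the `.v0000001` suffix are all placeholders assumed for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class CtasSuffixSketch {
  // Placeholder key: the value of Hive's SOFT_DELETE_TABLE constant is not shown in this thread.
  static final String SOFT_DELETE_TABLE = "SOFT_DELETE_TABLE";

  /** Mirrors the guard in the hunk: a native, non-temporary CTAS that is direct insert or insert-only. */
  static boolean writesToFinalLocation(boolean nonNative, boolean temporary, boolean ctas,
                                       boolean directInsert, boolean mmTable) {
    return !nonNative && !temporary && ctas && (directInsert || mmTable);
  }

  /** Hypothetical helper: appends the suffix and records the marker property only when suffixing is on. */
  static String resolveLocation(String base, String suffix, boolean enableSuffixing,
                                Map<String, String> tblProps) {
    if (!enableSuffixing) {
      return base;
    }
    tblProps.put(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
    return base + suffix;
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    if (writesToFinalLocation(false, false, true, true, false)) {
      // Hypothetical warehouse path and suffix.
      String location = resolveLocation("/warehouse/db/t1", ".v0000001", true, props);
      System.out.println(location + " " + props);
    }
  }
}
```

   Keeping the property update next to the suffix decision means the two cannot drift apart, which appears to be the concern raised in the review.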





Issue Time Tracking
---

Worklog Id: (was: 774965)
Time Spent: 5h 10m  (was: 5h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=774417&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774417
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 25/May/22 08:31
Start Date: 25/May/22 08:31
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r881374106


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,94 @@
+

Issue Time Tracking
---

Worklog Id: (was: 774417)
Time Spent: 5h  (was: 4h 50m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=774415&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774415
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 25/May/22 08:31
Start Date: 25/May/22 08:31
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r881373707


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -8223,9 +8286,27 @@ private void handleLineage(LoadTableDesc ltd, Operator output)
   Path tlocation = null;
   String tName = Utilities.getDbTableName(tableDesc.getDbTableName())[1];
   try {
+String suffix = "";
+if (!tableDesc.isExternal()) {
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 774415)
Time Spent: 4h 50m  (was: 4h 40m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=774414&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774414
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 25/May/22 08:30
Start Date: 25/May/22 08:30
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r881372627


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 
   destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = conf.getBoolVar(ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 774414)
Time Spent: 4h 40m  (was: 4.5h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=774363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774363
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 25/May/22 05:03
Start Date: 25/May/22 05:03
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r881219364


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7934,6 +7958,45 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, boolean enableSuffixing) throws SemanticException {

Review Comment:
   AFAIK, in CTAS the create-table flow runs in the DDLTask, which executes
   after the query has been analyzed and compiled and the data has been
   written. Hence this path is set before the create-table flow; the
   FileSinkOperators use it to write the data.
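
   The ordering described above can be sketched roughly as follows. This is a toy model under assumptions, not the actual Hive classes: `TableDesc` stands in for `CreateTableDesc`, and `analyze` and `createTable` are hypothetical names for the analysis phase and the later create-table phase. The point it tries to show is that pinning the location during analysis lets the path the data is written to and the location the table is created with agree.

```java
public class CtasLocationOrderSketch {

  /** Simplified stand-in for CreateTableDesc; only the location matters here. */
  static final class TableDesc {
    private String location;

    void setLocation(String location) {
      this.location = location;
    }

    String getLocation() {
      return location;
    }
  }

  /** Analysis phase: pick the destination the sinks will write to and pin it on the descriptor. */
  static String analyze(TableDesc desc, String resolvedLocation) {
    desc.setLocation(resolvedLocation);
    return resolvedLocation;
  }

  /** Create-table phase (runs last): honours a pinned location, otherwise falls back to a default. */
  static String createTable(TableDesc desc, String defaultLocation) {
    return desc.getLocation() != null ? desc.getLocation() : defaultLocation;
  }

  public static void main(String[] args) {
    TableDesc desc = new TableDesc();
    String sinkPath = analyze(desc, "/warehouse/db/t1");          // hypothetical path
    String tablePath = createTable(desc, "/warehouse/db/other");  // hypothetical default
    System.out.println(sinkPath.equals(tablePath));
  }
}
```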





Issue Time Tracking
---

Worklog Id: (was: 774363)
Time Spent: 4.5h  (was: 4h 20m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773865
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 24/May/22 04:23
Start Date: 24/May/22 04:23
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r880044731


##
ql/src/test/queries/clientpositive/ctas_direct_with_specified_locations.q:
##
@@ -0,0 +1,92 @@
+

Issue Time Tracking
---

Worklog Id: (was: 773865)
Time Spent: 4h 20m  (was: 4h 10m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773864&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773864
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 24/May/22 04:23
Start Date: 24/May/22 04:23
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r880044433


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,94 @@
+

Issue Time Tracking
---

Worklog Id: (was: 773864)
Time Spent: 4h 10m  (was: 4h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773778
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 22:12
Start Date: 23/May/22 22:12
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879904618


##
ql/src/test/queries/clientpositive/ctas_direct_with_specified_locations.q:
##
@@ -0,0 +1,92 @@
+

Issue Time Tracking
---

Worklog Id: (was: 773778)
Time Spent: 4h  (was: 3h 50m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773777&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773777
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 22:11
Start Date: 23/May/22 22:11
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879904223


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,94 @@
+

Issue Time Tracking
---

Worklog Id: (was: 773777)
Time Spent: 3h 50m  (was: 3h 40m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773773
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 22:07
Start Date: 23/May/22 22:07
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879902533


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,94 @@
+

Issue Time Tracking
---

Worklog Id: (was: 773773)
Time Spent: 3h 40m  (was: 3.5h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773768&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773768
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 22:06
Start Date: 23/May/22 22:06
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879902026


##
ql/src/test/queries/clientpositive/ctas_direct.q:
##
@@ -0,0 +1,94 @@
+

Issue Time Tracking
---

Worklog Id: (was: 773768)
Time Spent: 3.5h  (was: 3h 20m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773326
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:46
Start Date: 23/May/22 07:46
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879122398


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7934,6 +7958,45 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 return output;
   }
 
+  private Path getCTASDestinationTableLocation(CreateTableDesc tblDesc, boolean enableSuffixing) throws SemanticException {

Review Comment:
   This part looks similar to what is done in create table. Did we set the
   CTAS path here before, or did we go through the create table flow?





Issue Time Tracking
---

Worklog Id: (was: 773326)
Time Spent: 3h 20m  (was: 3h 10m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773325&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773325
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:43
Start Date: 23/May/22 07:43
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879120122


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -8223,9 +8286,27 @@ private void handleLineage(LoadTableDesc ltd, Operator output)
   Path tlocation = null;
   String tName = Utilities.getDbTableName(tableDesc.getDbTableName())[1];
   try {
+String suffix = "";
+if (!tableDesc.isExternal()) {
+  boolean useSuffix = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   You shouldn't rely on configs here; use the SOFT_DELETE_TABLE prop
   value instead.
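
   A minimal sketch of what that suggestion could look like, assuming the decision is read from the table's own properties rather than from the session configuration. The key string and the `useSuffix` helper are placeholders for illustration; the real constant's value and the final code in the PR are not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

public class SoftDeletePropSketch {
  // Placeholder key: the value of Hive's SOFT_DELETE_TABLE constant is not shown in this thread.
  static final String SOFT_DELETE_TABLE = "SOFT_DELETE_TABLE";

  /** Decides on suffixing from the table properties alone; a missing map or key means no suffix. */
  static boolean useSuffix(Map<String, String> tblProps) {
    // Boolean.parseBoolean returns false for null and for anything other than "true" (case-insensitive).
    return tblProps != null && Boolean.parseBoolean(tblProps.get(SOFT_DELETE_TABLE));
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    System.out.println(useSuffix(props));
    props.put(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
    System.out.println(useSuffix(props));
  }
}
```

   Reading the property keeps the result stable for a given table even if the configuration differs between the session that created it and the one inspecting it.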





Issue Time Tracking
---

Worklog Id: (was: 773325)
Time Spent: 3h 10m  (was: 3h)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773322
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:32
Start Date: 23/May/22 07:32
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879110909


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7601,13 +7619,14 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 } catch (LockException ex) {
   throw new SemanticException("Failed to allocate write Id", ex);
 }
-if (AcidUtils.isInsertOnlyTable(tblProps, true)) {
-  isMmTable = isMmCreate = true;
+if (isMmTable) {
   if (tblDesc != null) {
-tblDesc.setInitialMmWriteId(writeId);
+tblDesc.setInitialWriteId(writeId);
   } else {
 viewDesc.setInitialMmWriteId(writeId);
   }
+} else if (isDirectInsert) {
+  tblDesc.setInitialWriteId(writeId);

Review Comment:
   Can we enter this branch when creating a view?





Issue Time Tracking
---

Worklog Id: (was: 773322)
Time Spent: 3h  (was: 2h 50m)



[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773321&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773321
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:29
Start Date: 23/May/22 07:29
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879107738


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, Operator input)
 
   destTableIsTransactional = tblProps != null && AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = conf.getBoolVar(ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || conf.getBoolVar(ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+destinationPath = getCTASDestinationTableLocation(tblDesc, enableSuffixing);
+// Setting the location so that metadata transformers
+// does not change the location later while creating the table.
+tblDesc.setLocation(destinationPath.toString());

Review Comment:
   Please check that the SOFT_DELETE_TABLE prop is set during create; there
   is an `if` that skips this setter when the location is set manually:

   if (createTableUseSuffix) {
     tbl.setProperty(SOFT_DELETE_TABLE, Boolean.TRUE.toString());
   }





Issue Time Tracking
---

Worklog Id: (was: 773321)
Time Spent: 2h 50m  (was: 2h 40m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773317&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773317
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:25
Start Date: 23/May/22 07:25
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879104158


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = 
conf.getBoolVar(ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   nit, please use HiveConf getter:
   
   HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
   





Issue Time Tracking
---

Worklog Id: (was: 773317)
Time Spent: 2h 40m  (was: 2.5h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=773316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773316
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 23/May/22 07:25
Start Date: 23/May/22 07:25
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r879104158


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = MetaStoreUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+isMmTable = isMmCreate = AcidUtils.isInsertOnlyTable(tblProps);
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = 
conf.getBoolVar(ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)

Review Comment:
   nit, please use HiveConf getter:
   
   HiveConf.getBoolVar(conf, ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
   





Issue Time Tracking
---

Worklog Id: (was: 773316)
Time Spent: 2.5h  (was: 2h 20m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=772403&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772403
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 19/May/22 11:38
Start Date: 19/May/22 11:38
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r876944639


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = AcidUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+if (AcidUtils.isInsertOnlyTable(tblProps, true)) {
+  isMmTable = isMmCreate = true;
+}
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = 
conf.getBoolVar(ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || 
conf.getBoolVar(ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+String location = tblDesc.getLocation();
+destinationPath = location == null ? 
getCTASDestinationTableLocation(tblDesc, enableSuffixing) : new Path(location);

Review Comment:
   @pvary Thanks for pointing this out. I have updated the patch to handle the 
use of MetadataTransformer. Please check and let me know if there are any 
issues with it.





Issue Time Tracking
---

Worklog Id: (was: 772403)
Time Spent: 2h 20m  (was: 2h 10m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=772401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772401
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 19/May/22 11:37
Start Date: 19/May/22 11:37
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r876944852


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = AcidUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+if (AcidUtils.isInsertOnlyTable(tblProps, true)) {

Review Comment:
   Updated.



##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -2876,6 +2876,12 @@ private static boolean isLockableTable(Table t) {
 }
   }
 
+  public static boolean isNonNativeTable(Map tblProps) {

Review Comment:
   Updated.





Issue Time Tracking
---

Worklog Id: (was: 772401)
Time Spent: 2h 10m  (was: 2h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=772397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772397
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 19/May/22 11:34
Start Date: 19/May/22 11:34
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r876944639


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = AcidUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+if (AcidUtils.isInsertOnlyTable(tblProps, true)) {
+  isMmTable = isMmCreate = true;
+}
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = 
conf.getBoolVar(ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || 
conf.getBoolVar(ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+String location = tblDesc.getLocation();
+destinationPath = location == null ? 
getCTASDestinationTableLocation(tblDesc, enableSuffixing) : new Path(location);

Review Comment:
   @pvary Thanks for pointing this out. I have edited the patch to handle the 
use of MetadataTransformer. Please check and let me know if there are any 
issues with it.





Issue Time Tracking
---

Worklog Id: (was: 772397)
Time Spent: 1h 40m  (was: 1.5h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=772398&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772398
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 19/May/22 11:34
Start Date: 19/May/22 11:34
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r876944852


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = AcidUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+if (AcidUtils.isInsertOnlyTable(tblProps, true)) {

Review Comment:
   Done.





Issue Time Tracking
---

Worklog Id: (was: 772398)
Time Spent: 1h 50m  (was: 1h 40m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=772400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772400
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 19/May/22 11:34
Start Date: 19/May/22 11:34
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r876943222


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -2876,6 +2876,12 @@ private static boolean isLockableTable(Table t) {
 }
   }
 
+  public static boolean isNonNativeTable(Map tblProps) {

Review Comment:
   Done.





Issue Time Tracking
---

Worklog Id: (was: 772400)
Time Spent: 2h  (was: 1h 50m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=772396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-772396
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 19/May/22 11:32
Start Date: 19/May/22 11:32
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r876943222


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -2876,6 +2876,12 @@ private static boolean isLockableTable(Table t) {
 }
   }
 
+  public static boolean isNonNativeTable(Map tblProps) {

Review Comment:
   Done





Issue Time Tracking
---

Worklog Id: (was: 772396)
Time Spent: 1.5h  (was: 1h 20m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=771428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771428
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 17/May/22 15:20
Start Date: 17/May/22 15:20
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r874957184


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = AcidUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+if (AcidUtils.isInsertOnlyTable(tblProps, true)) {

Review Comment:
   Can't we simply use the following?
   
   isMmTable = AcidUtils.isInsertOnlyTable(destinationTable.getParameters());
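
To see when the direct assignment is a fair replacement for the `if` block, the two forms can be compared side by side. This is a minimal sketch with hypothetical names; it assumes nothing about Hive beyond the shape of the two statements in the diff.

```java
// Hypothetical comparison of the two assignment styles; not Hive code.
final class FlagAssignmentSketch {
  /** Conditional form: the flags change only when the predicate holds. */
  static boolean[] conditional(boolean insertOnly, boolean initial) {
    boolean isMmTable = initial;
    boolean isMmCreate = initial;
    if (insertOnly) {
      isMmTable = isMmCreate = true;
    }
    return new boolean[] {isMmTable, isMmCreate};
  }

  /** Direct form: the flags always take the predicate's value. */
  static boolean[] direct(boolean insertOnly) {
    boolean isMmTable;
    boolean isMmCreate;
    isMmTable = isMmCreate = insertOnly;
    return new boolean[] {isMmTable, isMmCreate};
  }

  public static void main(String[] args) {
    // Same result when the flags start out false.
    System.out.println(conditional(true, false)[0] == direct(true)[0]);
    System.out.println(conditional(false, false)[0] == direct(false)[0]);
    // Different result when a flag was already true and the predicate is false.
    System.out.println(conditional(false, true)[0] == direct(false)[0]);
  }
}
```

The two forms agree for every input only if both flags are still false when the block is reached; otherwise the direct form also resets them.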
   





Issue Time Tracking
---

Worklog Id: (was: 771428)
Time Spent: 1h 20m  (was: 1h 10m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=771425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-771425
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 17/May/22 15:18
Start Date: 17/May/22 15:18
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r874954703


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -2876,6 +2876,12 @@ private static boolean isLockableTable(Table t) {
 }
   }
 
+  public static boolean isNonNativeTable(Map tblProps) {

Review Comment:
   Could we create a method in MetaStoreUtils that accepts Map tblProps and 
reuse it in the original one?
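
The suggested shape can be sketched as a pair of overloads, with the object-based one delegating to the map-based one. This is a minimal sketch under stated assumptions: `TableStub` is a hypothetical stand-in for the real table class, and the `storage_handler` key is assumed to be the property that marks a non-native table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; not the actual MetaStoreUtils implementation.
final class NonNativeTableSketch {
  // Assumed property key that marks a table backed by a storage handler.
  static final String STORAGE_HANDLER_KEY = "storage_handler";

  /** Minimal stand-in for a table object that carries a parameter map. */
  static final class TableStub {
    private final Map<String, String> parameters;

    TableStub(Map<String, String> parameters) {
      this.parameters = parameters;
    }

    Map<String, String> getParameters() {
      return parameters;
    }
  }

  /** Map-based check: usable before any table object exists, as in CTAS. */
  static boolean isNonNativeTable(Map<String, String> tblProps) {
    return tblProps != null && tblProps.get(STORAGE_HANDLER_KEY) != null;
  }

  /** Object-based overload that reuses the map-based check. */
  static boolean isNonNativeTable(TableStub table) {
    return table != null && isNonNativeTable(table.getParameters());
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    System.out.println(isNonNativeTable(props));
    props.put(STORAGE_HANDLER_KEY, "com.example.SomeStorageHandler");
    System.out.println(isNonNativeTable(props));
    System.out.println(isNonNativeTable(new TableStub(props)));
  }
}
```

Keeping the logic in one place means the two entry points cannot drift apart.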






Issue Time Tracking
---

Worklog Id: (was: 771425)
Time Spent: 1h 10m  (was: 1h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=770733&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-770733
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 16/May/22 09:39
Start Date: 16/May/22 09:39
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r873524614


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -7592,6 +7594,22 @@ protected Operator genFileSinkPlan(String dest, QB qb, 
Operator input)
 
   destTableIsTransactional = tblProps != null && 
AcidUtils.isTablePropertyTransactional(tblProps);
   if (destTableIsTransactional) {
+isNonNativeTable = AcidUtils.isNonNativeTable(tblProps);
+boolean isCtas = tblDesc != null && tblDesc.isCTAS();
+if (AcidUtils.isInsertOnlyTable(tblProps, true)) {
+  isMmTable = isMmCreate = true;
+}
+if (!isNonNativeTable && !destTableIsTemporary && isCtas) {
+  destTableIsFullAcid = AcidUtils.isFullAcidTable(tblProps);
+  acidOperation = getAcidType(dest);
+  isDirectInsert = isDirectInsert(destTableIsFullAcid, acidOperation);
+  boolean enableSuffixing = 
conf.getBoolVar(ConfVars.HIVE_ACID_CREATE_TABLE_USE_SUFFIX)
+  || 
conf.getBoolVar(ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
+  if (isDirectInsert || isMmTable) {
+String location = tblDesc.getLocation();
+destinationPath = location == null ? 
getCTASDestinationTableLocation(tblDesc, enableSuffixing) : new Path(location);

Review Comment:
   How would this handle the `MetadataTransformer`s? These transformers could 
alter the table location (external/managed directory changes, etc.).





Issue Time Tracking
---

Worklog Id: (was: 770733)
Time Spent: 1h  (was: 50m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> CTAS on transactional tables currently does a copy from staging location to 
> table location. This can be avoided by using Direct Insert semantics. Added 
> support for suffixed table locations as well.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=769080&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769080
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 11/May/22 13:53
Start Date: 11/May/22 13:53
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r870324421


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -2876,6 +2876,12 @@ private static boolean isLockableTable(Table t) {
 }
   }
 
+  public static boolean isNonNativeTable(Map tblProps) {

Review Comment:
   Yes correct. I will use it.





Issue Time Tracking
---

Worklog Id: (was: 769080)
Time Spent: 50m  (was: 40m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> CTAS currently does a copy from staging location to table location. This can 
> be avoided by using Direct Insert semantics. However the table location must 
> be suffixed with the transaction identifier so that if the data is not 
> committed then this location will not be used by other tables.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=769079&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769079
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 11/May/22 13:52
Start Date: 11/May/22 13:52
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r870330328


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -2876,6 +2876,12 @@ private static boolean isLockableTable(Table t) {
 }
   }
 
+  public static boolean isNonNativeTable(Map tblProps) {

Review Comment:
   MetaStoreUtils.isNonNativeTable() takes a Table object as its argument. I 
needed a utility method that works on the table properties instead, because in 
CTAS the properties are available before the Table object is created.
   





Issue Time Tracking
---

Worklog Id: (was: 769079)
Time Spent: 40m  (was: 0.5h)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> CTAS currently does a copy from staging location to table location. This can 
> be avoided by using Direct Insert semantics. However the table location must 
> be suffixed with the transaction identifier so that if the data is not 
> committed then this location will not be used by other tables.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=769072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769072
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 11/May/22 13:47
Start Date: 11/May/22 13:47
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r870324421


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -2876,6 +2876,12 @@ private static boolean isLockableTable(Table t) {
 }
   }
 
+  public static boolean isNonNativeTable(Map tblProps) {

Review Comment:
   Yes correct. I will use it.





Issue Time Tracking
---

Worklog Id: (was: 769072)
Time Spent: 0.5h  (was: 20m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> CTAS currently does a copy from staging location to table location. This can 
> be avoided by using Direct Insert semantics. However the table location must 
> be suffixed with the transaction identifier so that if the data is not 
> committed then this location will not be used by other tables.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=769070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769070
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 11/May/22 13:41
Start Date: 11/May/22 13:41
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3281:
URL: https://github.com/apache/hive/pull/3281#discussion_r870317389


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -2876,6 +2876,12 @@ private static boolean isLockableTable(Table t) {
 }
   }
 
+  public static boolean isNonNativeTable(Map tblProps) {

Review Comment:
   The same function is already available: MetaStoreUtils.isNonNativeTable(). 
Can't we use it?





Issue Time Tracking
---

Worklog Id: (was: 769070)
Time Spent: 20m  (was: 10m)

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CTAS currently does a copy from staging location to table location. This can 
> be avoided by using Direct Insert semantics. However the table location must 
> be suffixed with the transaction identifier so that if the data is not 
> committed then this location will not be used by other tables.





[jira] [Work logged] (HIVE-26217) Make CTAS use Direct Insert Semantics

2022-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26217?focusedWorklogId=769058&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769058
 ]

ASF GitHub Bot logged work on HIVE-26217:
-

Author: ASF GitHub Bot
Created on: 11/May/22 13:18
Start Date: 11/May/22 13:18
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya opened a new pull request, #3281:
URL: https://github.com/apache/hive/pull/3281

   
   
   ### What changes were proposed in this pull request?
   Making CTAS use Direct Insert semantics to a suffixed table location.
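
As a rough illustration of what a suffixed table location could look like, here is a minimal sketch that derives the suffix from the id of the transaction creating the table. The class name, the `_v` prefix, and the zero-padded width are illustrative assumptions, not taken from this patch.

```java
// Hypothetical helper; not Hive code.
final class SuffixedLocationSketch {
  /**
   * Returns the directory a CTAS attempt would write to. The "_v" prefix and
   * the 7-digit padding are assumptions made for this example.
   */
  static String suffixedLocation(String baseLocation, long txnId, boolean enableSuffixing) {
    if (!enableSuffixing) {
      return baseLocation;
    }
    // Each attempt writes under a directory unique to its transaction, so an
    // uncommitted attempt never shares a path with another table.
    return baseLocation + "_v" + String.format("%07d", txnId);
  }

  public static void main(String[] args) {
    System.out.println(suffixedLocation("/warehouse/db.db/t1", 42L, true));
    System.out.println(suffixedLocation("/warehouse/db.db/t1", 42L, false));
  }
}
```

With direct insert the query writes straight into this directory, so no staging-to-table copy is needed at the end.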
   
   
   
   
   ### Why are the changes needed?
   To improve the performance of CTAS queries by avoiding the copy from the 
staging location to the table location.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   
   ### How was this patch tested?
   QTests
   
   




Issue Time Tracking
---

Worklog Id: (was: 769058)
Remaining Estimate: 0h
Time Spent: 10m

> Make CTAS use Direct Insert Semantics
> -
>
> Key: HIVE-26217
> URL: https://issues.apache.org/jira/browse/HIVE-26217
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CTAS currently does a copy from staging location to table location. This can 
> be avoided by using Direct Insert semantics. However the table location must 
> be suffixed with the transaction identifier so that if the data is not 
> committed then this location will not be used by other tables.


