[ https://issues.apache.org/jira/browse/HIVE-24855?focusedWorklogId=570100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-570100 ]
ASF GitHub Bot logged work on HIVE-24855:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Mar/21 22:19
            Start Date: 22/Mar/21 22:19
    Worklog Time Spent: 10m
      Work Description: jcamachor commented on a change in pull request #2046:
URL: https://github.com/apache/hive/pull/2046#discussion_r599094572


##########
File path: llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapInputFormat.java
##########
@@ -182,11 +190,11 @@ static VectorizedRowBatchCtx createFakeVrbCtx(MapWork mapWork) throws HiveExcept
     RowSchema rowSchema = findTsOp(mapWork).getSchema();
     final List<String> colNames = new ArrayList<String>(rowSchema.getSignature().size());
     final List<TypeInfo> colTypes = new ArrayList<TypeInfo>(rowSchema.getSignature().size());
-    boolean hasRowId = false;
+    ArrayList<VirtualColumn> virtualColumnList = new ArrayList<>(2);
     for (ColumnInfo c : rowSchema.getSignature()) {
       String columnName = c.getInternalName();
-      if (VirtualColumn.ROWID.getName().equals(columnName)) {
-        hasRowId = true;
+      if (ALLOWED_VIRTUAL_COLUMNS.containsKey(columnName)) {
+        virtualColumnList.add(ALLOWED_VIRTUAL_COLUMNS.get(columnName));
       } else {
         if (VirtualColumn.VIRTUAL_COLUMN_NAMES.contains(columnName)) continue;

Review comment:
       Not really part of this patch, but this line can be merged with the previous one.


##########
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##########
@@ -4450,6 +4450,8 @@ private static void populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
     HIVE_ACID_DIRECT_INSERT_ENABLED("hive.acid.direct.insert.enabled", true,
         "Enable writing the data files directly to the table's final destination instead of the staging directory."
         + "This optimization only applies on INSERT operations on ACID tables."),
+    HIVE_ACID_FETCH_DELETED_ROWS("hive.acid.fetch.deleted.rows", false,

Review comment:
       I am not sure this should be part of HiveConf.java, since it is not a global/session config per se?
Have you considered introducing it in `org.apache.hadoop.hive.conf.Constants`? We have introduced some of these config variables there in the past.


##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##########
@@ -1970,8 +1970,22 @@ private int processQueryHint(ASTNode ast, QBParseInfo qbp, int posn) throws Sema
     String queryHintStr = ast.getText();
     LOG.debug("QUERY HINT: {} ", queryHintStr);
     try {
-      ASTNode hintNode = pd.parseHint(queryHintStr);
-      qbp.setHints(hintNode);
+      ASTNode hintListNode = pd.parseHint(queryHintStr);
+      qbp.setHints(hintListNode);
+      for (int i = 0; i < hintListNode.getChildCount(); ++i) {
+        ASTNode hintNode = (ASTNode) hintListNode.getChild(i);
+        if (hintNode.getChild(0).getType() != HintParser.TOK_FETCH_DELETED_ROWS) {

Review comment:
       Should this be a table property rather than a hint, i.e., the value for `hive.acid.fetch.deleted.rows` passed as a property of the table scan? That would follow a pattern similar to what we do in other code paths in Hive, e.g., to push computation for Druid/JDBC. It also means we would not need to introduce a new hint, which would simplify the patch.


##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##########
@@ -1943,15 +1943,17 @@ public static boolean isFullAcidScan(Configuration conf) {
    * we assume this is a full transactional table.
    */
   public static void setAcidOperationalProperties(
-      Configuration conf, boolean isTxnTable, AcidOperationalProperties properties) {
+      Configuration conf, boolean isTxnTable, AcidOperationalProperties properties, boolean fetchDeletedRows) {

Review comment:
       Since in most cases we would pass false for the last parameter, should we keep the old method signature too, and have it call this method with `false` for the last parameter?
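The backward-compatible overload the reviewer suggests is a common Java pattern: keep the old signature and have it delegate to the new one with the default value. A minimal standalone sketch of that idea (class, parameter types, and the `lastCall` field are illustrative stand-ins, not Hive's actual `AcidUtils` code):

```java
// Sketch of keeping an old method signature as a delegating overload,
// as suggested in the review. Names here are hypothetical examples.
public class AcidConfigExample {
    static String lastCall; // records the last invocation, for demonstration only

    // New signature with the extra flag.
    public static void setAcidOperationalProperties(
            String conf, boolean isTxnTable, boolean fetchDeletedRows) {
        lastCall = conf + "|txn=" + isTxnTable + "|fetchDeleted=" + fetchDeletedRows;
    }

    // Old signature, kept so existing callers stay source- and binary-compatible;
    // it delegates with the common default value for the new parameter.
    public static void setAcidOperationalProperties(String conf, boolean isTxnTable) {
        setAcidOperationalProperties(conf, isTxnTable, false);
    }

    public static void main(String[] args) {
        setAcidOperationalProperties("job-conf", true); // old call sites unchanged
        System.out.println(lastCall); // prints job-conf|txn=true|fetchDeleted=false
    }
}
```

This way only the (presumably few) call sites that actually fetch deleted rows need to pass the extra argument.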
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/BatchToRowReader.java
##########
@@ -108,25 +108,41 @@ public BatchToRowReader(RecordReader<NullWritable, VectorizedRowBatch> vrbReader
     } else {
       Arrays.fill(included, true);
     }
-    // Create struct for ROW__ID virtual column and extract index
-    this.rowIdIdx = vrbCtx.findVirtualColumnNum(VirtualColumn.ROWID);
-    if (this.rowIdIdx >= 0) {
-      included[rowIdIdx] = true;
+
+    virtualColumnHandlers = requestedVirtualColumns();
+    for (VirtualColumnHandler handler : virtualColumnHandlers) {
+      int idx = vrbCtx.findVirtualColumnNum(handler.virtualColumn);
+      if (idx >= 0) {
+        included[idx] = true;
+        handler.indexInSchema = idx;
+      }
    }
+
     if (LOG.isDebugEnabled()) {
       LOG.debug("Including the columns " + DebugUtils.toString(included));
     }
     this.included = included;
   }
+
+  public static class VirtualColumnHandler {

Review comment:
       Can you add a short comment for the inner class?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 570100)
            Time Spent: 20m  (was: 10m)

> Introduce virtual column ROW__IS__DELETED
> -----------------------------------------
>
>                 Key: HIVE-24855
>                 URL: https://issues.apache.org/jira/browse/HIVE-24855
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
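For the `BatchToRowReader` change above, the reviewer asks for a short comment on the `VirtualColumnHandler` inner class. A self-contained sketch of what a documented version of that class and the lookup loop from the patch could look like (types simplified to plain strings; `markIncluded` is a hypothetical stand-in for the constructor logic, not Hive's actual API):

```java
import java.util.Arrays;
import java.util.List;

public class VirtualColumnExample {

    /**
     * Pairs a requested virtual column with the index where it was found in
     * the row schema; indexInSchema stays -1 when the column is absent.
     */
    public static class VirtualColumnHandler {
        final String virtualColumn;
        int indexInSchema = -1;

        VirtualColumnHandler(String virtualColumn) {
            this.virtualColumn = virtualColumn;
        }
    }

    // Mirrors the patch's loop: mark each requested virtual column as
    // included and remember where it sits in the schema.
    static boolean[] markIncluded(List<String> schema, List<VirtualColumnHandler> handlers) {
        boolean[] included = new boolean[schema.size()];
        for (VirtualColumnHandler handler : handlers) {
            int idx = schema.indexOf(handler.virtualColumn);
            if (idx >= 0) {
                included[idx] = true;
                handler.indexInSchema = idx;
            }
        }
        return included;
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("id", "ROW__ID", "ROW__IS__DELETED");
        VirtualColumnHandler h = new VirtualColumnHandler("ROW__IS__DELETED");
        boolean[] included = markIncluded(schema, Arrays.asList(h));
        System.out.println(Arrays.toString(included) + " idx=" + h.indexInSchema);
        // prints [false, false, true] idx=2
    }
}
```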