[GitHub] [hive] kgyrtkirk commented on a change in pull request #1341: HIVE-23959 bulk stat removal
kgyrtkirk commented on a change in pull request #1341: URL: https://github.com/apache/hive/pull/1341#discussion_r463435574 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java ## @@ -9483,6 +9483,81 @@ private void dropPartitionColumnStatisticsNoTxn( queryWithParams.getLeft().closeAll(); } + @Override + public void deleteAllPartitionColumnStatistics(TableName tn, String writeIdList) { + +String catName = tn.getCat(); +String dbName = tn.getDb(); +String tableName = tn.getTable(); + +Query query = null; +dbName = org.apache.commons.lang3.StringUtils.defaultString(dbName, Warehouse.DEFAULT_DATABASE_NAME); +catName = normalizeIdentifier(catName); +if (tableName == null) { + throw new RuntimeException("Table name is null."); +} +boolean ret = false; +try { + openTransaction(); + MTable mTable = getMTable(catName, dbName, tableName); + + query = pm.newQuery(MPartitionColumnStatistics.class); + + Object engine = null; Review comment: oh..sorry; I plan to disable this option by default - and leave only a targeted test to cover for it but I wanted to make a sanity check that this will work in case all tests are executed (and it does) I should have marked it as wip or something..I'll clean up things like this in the next version. thank you for taking a look! I was testing with psql so far but I'll check mysql as well! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kasakrisz commented on a change in pull request #1324: HIVE-23939: SharedWorkOptimizer: take the union of columns in mergeable TableScans
kasakrisz commented on a change in pull request #1324: URL: https://github.com/apache/hive/pull/1324#discussion_r463430942 ## File path: ql/src/test/results/clientpositive/llap/auto_join_reordering_values.q.out ## @@ -144,122 +144,30 @@ STAGE PLANS: tag: 0 value expressions: _col0 (type: int), _col2 (type: int), _col3 (type: int) auto parallelism: true -Execution mode: vectorized, llap -LLAP IO: no inputs -Path -> Alias: - A masked pattern was here -Path -> Partition: - A masked pattern was here -Partition - base file name: orderpayment_small - input format: org.apache.hadoop.mapred.TextInputFormat - output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat - properties: -bucket_count -1 -bucketing_version 2 -column.name.delimiter , -columns dealid,date,time,cityid,userid -columns.types int:string:string:int:int - A masked pattern was here -name default.orderpayment_small -serialization.format 1 -serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - -input format: org.apache.hadoop.mapred.TextInputFormat -output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat -properties: - bucketing_version 2 - column.name.delimiter , - columns dealid,date,time,cityid,userid - columns.comments - columns.types int:string:string:int:int - A masked pattern was here - name default.orderpayment_small - serialization.format 1 - serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe -serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe -name: default.orderpayment_small - name: default.orderpayment_small -Truncated Path -> Alias: - /orderpayment_small [orderpayment] -Map 6 -Map Operator Tree: -TableScan - alias: dim_pay_date - filterExpr: date is not null (type: boolean) - Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE - GatherStats: false Filter Operator isSamplingPred: false -predicate: date is not null (type: boolean) -Statistics: Num rows: 1 Data size: 94 Basic stats: COMPLETE Column stats: COMPLETE +predicate: dealid is not null (type: boolean) +Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: COMPLETE Select Operator - expressions: date (type: string) Review comment: The TS pulls the column `date` only was not merged due to `validPreConditions` failed: ``` 2020-07-30T23:20:32,796 DEBUG [a51ac124-5fb2-43bc-a0fc-f27099762584 main] optimizer.SharedWorkOptimizer: After SharedWorkSJOptimizer: TS[0]-FIL[44]-SEL[2]-RS[15]-MERGEJOIN[89]-RS[18]-MERGEJOIN[90]-RS[21]-MERGEJOIN[91]-RS[24]-MERGEJOIN[92]-SEL[27]-LIM[28]-FS[29] TS[3]-FIL[45]-SEL[5]-RS[16]-MERGEJOIN[89] TS[6]-FIL[46]-SEL[8]-RS[19]-MERGEJOIN[90] TS[9]-FIL[47]-SEL[11]-RS[22]-MERGEJOIN[91] TS[12]-FIL[48]-SEL[14]-RS[25]-MERGEJOIN[92] ``` Both has the same output works: ``` TS[0] alias = "orderpayment" dbName = "default" tableName = "orderpayment_small" neededColumns = {ArrayList@24085} size = 4 0 = "dealid" 1 = "date" 2 = "cityid" 3 = "userid" outputWorksOps1 = {HashSet@24017} size = 2 0 = {ReduceSinkOperator@24028} "RS[18]" 1 = {CommonMergeJoinOperator@24029} "MERGEJOIN[89]" ``` ``` TS[3] alias = "dim_pay_date" dbName = "default" tableName = "orderpayment_small" neededColumns = {ArrayList@23791} size = 1 0 = "date" outputWorksOps2 = {HashSet@24022} size = 2 0 = {ReduceSinkOperator@24028} "RS[18]" 1 = {CommonMergeJoinOperator@24029} "MERGEJOIN[89]" ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hive] kasakrisz commented on a change in pull request #1324: HIVE-23939: SharedWorkOptimizer: take the union of columns in mergeable TableScans
kasakrisz commented on a change in pull request #1324: URL: https://github.com/apache/hive/pull/1324#discussion_r463415834 ## File path: ql/src/test/results/clientpositive/llap/annotate_stats_join_pkfk.q.out ## @@ -1191,14 +1191,6 @@ STAGE PLANS: sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE -Execution mode: vectorized, llap Review comment: I checked this with the debugger: ``` TS[6] dbName = "default" tableName = "store_n0" neededColumns = {ArrayList@24024} size = 1 0 = "s_store_sk" TS[3] dbName = "default" tableName = "store_n0" neededColumns = {ArrayList@24053} size = 2 0 = "s_store_sk" 1 = "s_floor_space" ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463406580 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java ## @@ -1574,45 +1576,46 @@ public int compareTo(CompressedOwid other) { this.orcSplit = orcSplit; try { -final Path[] deleteDeltaDirs = getDeleteDeltaDirsFromSplit(orcSplit); -if (deleteDeltaDirs.length > 0) { +if (orcSplit.getDeltas().size() > 0) { AcidOutputFormat.Options orcSplitMinMaxWriteIds = AcidUtils.parseBaseOrDeltaBucketFilename(orcSplit.getPath(), conf); int totalDeleteEventCount = 0; - for (Path deleteDeltaDir : deleteDeltaDirs) { -if (!isQualifiedDeleteDeltaForSplit(orcSplitMinMaxWriteIds, deleteDeltaDir)) { - continue; -} -Path[] deleteDeltaFiles = OrcRawRecordMerger.getDeltaFiles(deleteDeltaDir, bucket, -new OrcRawRecordMerger.Options().isCompacting(false), null); -for (Path deleteDeltaFile : deleteDeltaFiles) { - try { -ReaderData readerData = getOrcTail(deleteDeltaFile, conf, cacheTag); -OrcTail orcTail = readerData.orcTail; -if (orcTail.getFooter().getNumberOfRows() <= 0) { - continue; // just a safe check to ensure that we are not reading empty delete files. -} -OrcRawRecordMerger.KeyInterval deleteKeyInterval = findDeleteMinMaxKeys(orcTail, deleteDeltaFile); -if (!deleteKeyInterval.isIntersects(keyInterval)) { - // If there is no intersection between data and delete delta, do not read delete file - continue; -} -// Reader can be reused if it was created before for getting orcTail: mostly for non-LLAP cache cases. -// For LLAP cases we need to create it here. -Reader deleteDeltaReader = readerData.reader != null ? readerData.reader : -OrcFile.createReader(deleteDeltaFile, OrcFile.readerOptions(conf)); -totalDeleteEventCount += deleteDeltaReader.getNumberOfRows(); -DeleteReaderValue deleteReaderValue = new DeleteReaderValue(deleteDeltaReader, -deleteDeltaFile, readerOptions, bucket, validWriteIdList, isBucketedTable, conf, -keyInterval, orcSplit); -DeleteRecordKey deleteRecordKey = new DeleteRecordKey(); -if (deleteReaderValue.next(deleteRecordKey)) { - sortMerger.put(deleteRecordKey, deleteReaderValue); -} else { - deleteReaderValue.close(); + for (AcidInputFormat.DeltaMetaData deltaMetaData : orcSplit.getDeltas()) { +for (Path deleteDeltaDir : deltaMetaData.getPaths(orcSplit.getRootDir())) { + if (!isQualifiedDeleteDeltaForSplit(orcSplitMinMaxWriteIds, deleteDeltaDir)) { +LOG.debug("Skipping delete delta dir {}", deleteDeltaDir); +continue; + } + for (AcidInputFormat.DeltaFileMetaData fileMetaData : deltaMetaData.getDeltaFiles()) { +Path deleteDeltaFile = fileMetaData.getPath(deleteDeltaDir, bucket); +try { + ReaderData readerData = getOrcTail(deleteDeltaFile, conf, cacheTag, fileMetaData.getFileId(deleteDeltaDir, bucket)); + OrcTail orcTail = readerData.orcTail; + if (orcTail.getFooter().getNumberOfRows() <= 0) { +continue; // just a safe check to ensure that we are not reading empty delete files. + } + OrcRawRecordMerger.KeyInterval deleteKeyInterval = findDeleteMinMaxKeys(orcTail, deleteDeltaFile); + if (!deleteKeyInterval.isIntersects(keyInterval)) { +// If there is no intersection between data and delete delta, do not read delete file +continue; + } + // Reader can be reused if it was created before for getting orcTail: mostly for non-LLAP cache cases. + // For LLAP cases we need to create it here. + Reader deleteDeltaReader = readerData.reader != null ? readerData.reader : OrcFile + .createReader(deleteDeltaFile, OrcFile.readerOptions(conf)); + totalDeleteEventCount += deleteDeltaReader.getNumberOfRows(); + DeleteReaderValue deleteReaderValue = + new DeleteReaderValue(deleteDeltaReader, deleteDeltaFile, readerOptions, bucket, validWriteIdList, + isBucketedTable, conf, keyInterval, orcSplit); + DeleteRecordKey deleteRecordKey = new DeleteRecordKey(); + if (deleteReaderValue.next(deleteRecordKey)
[GitHub] [hive] pvary commented on a change in pull request #1341: HIVE-23959 bulk stat removal
pvary commented on a change in pull request #1341: URL: https://github.com/apache/hive/pull/1341#discussion_r463405230 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java ## @@ -9483,6 +9483,81 @@ private void dropPartitionColumnStatisticsNoTxn( queryWithParams.getLeft().closeAll(); } + @Override + public void deleteAllPartitionColumnStatistics(TableName tn, String writeIdList) { + +String catName = tn.getCat(); +String dbName = tn.getDb(); +String tableName = tn.getTable(); + +Query query = null; +dbName = org.apache.commons.lang3.StringUtils.defaultString(dbName, Warehouse.DEFAULT_DATABASE_NAME); +catName = normalizeIdentifier(catName); +if (tableName == null) { + throw new RuntimeException("Table name is null."); +} +boolean ret = false; +try { + openTransaction(); + MTable mTable = getMTable(catName, dbName, tableName); + + query = pm.newQuery(MPartitionColumnStatistics.class); + + Object engine = null; Review comment: This seems strange. Setting this to null and one line below checking This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1341: HIVE-23959 bulk stat removal
pvary commented on a change in pull request #1341: URL: https://github.com/apache/hive/pull/1341#discussion_r463404741 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java ## @@ -2960,4 +2977,22 @@ public void lockDbTable(String tableName) throws MetaException { throw new MetaException("Error while locking table " + tableName + ": " + sqle.getMessage()); } } + + public void deleteColumnStatsState(long tbl_id) throws MetaException { +// @formatter:off +String queryText = "" ++ "delete from " + PARTITION_PARAMS + " pp" ++ " using " + PARTITIONS + " p " ++ " where " ++ " p.\"PART_ID = pp.\"PART_ID\"" Review comment: Please carefully check every ". Postgres and MySql are very problematic This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1341: HIVE-23959 bulk stat removal
pvary commented on a change in pull request #1341: URL: https://github.com/apache/hive/pull/1341#discussion_r463404254 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java ## @@ -984,16 +984,23 @@ private boolean isViewTable(String catName, String dbName, String tblName) throw part.setCatName(catName); part.setDbName(dbName); part.setTableName(tblName); - if (fields[4] != null) part.setCreateTime(MetastoreDirectSqlUtils.extractSqlInt(fields[4])); - if (fields[5] != null) part.setLastAccessTime(MetastoreDirectSqlUtils.extractSqlInt(fields[5])); + if (fields[4] != null) { +part.setCreateTime(MetastoreDirectSqlUtils.extractSqlInt(fields[4])); + } + if (fields[5] != null) { + part.setLastAccessTime(MetastoreDirectSqlUtils.extractSqlInt(fields[5])); + } Long writeId = MetastoreDirectSqlUtils.extractSqlLong(fields[14]); if (writeId != null) { part.setWriteId(writeId); } partitions.put(partitionId, part); - if (sdId == null) continue; // Probably a view. + if (sdId == null) Review comment: Nit: formatting This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] aasha opened a new pull request #1345: HIVE-23961 : Enable external table replication by default
aasha opened a new pull request #1345: URL: https://github.com/apache/hive/pull/1345 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] xiaomengzhang commented on pull request #1344: HIVE-23962 Make bin/hive pick user defined jdbc url
xiaomengzhang commented on pull request #1344: URL: https://github.com/apache/hive/pull/1344#issuecomment-666894541 @nrg4878 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] xiaomengzhang opened a new pull request #1344: HIVE-23962 Make bin/hive pick user defined jdbc url
xiaomengzhang opened a new pull request #1344: URL: https://github.com/apache/hive/pull/1344 Add an env variable BEELINE_URL_LEGACY, when this value is not empty, run "beeline -c $BEELINE_URL_LEGACY". Change-Id: I9bd7f34612e2bd246d5cba37ddfd8a582038bbdd ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] jcamachor commented on a change in pull request #1324: HIVE-23939: SharedWorkOptimizer: take the union of columns in mergeable TableScans
jcamachor commented on a change in pull request #1324: URL: https://github.com/apache/hive/pull/1324#discussion_r463040481 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java ## @@ -247,6 +249,18 @@ public ParseContext transform(ParseContext pctx) throws SemanticException { } } +if (LOG.isDebugEnabled()) { Review comment: Can we move the new call before the `if(pctx.getConf().getBoolVar(ConfVars.HIVE_SHARED_WORK_REUSE_MAPJOIN_CACHE)) {` block? It makes sense to trigger that block at the very end in case we continue adding phases. ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java ## @@ -247,6 +249,18 @@ public ParseContext transform(ParseContext pctx) throws SemanticException { } } +if (LOG.isDebugEnabled()) { + LOG.debug("Before SharedWorkOptimizer #2:\n" + Operator.toString(pctx.getTopOps().values())); +} + +// Execute shared work optimization +new BaseSharedWorkOptimizer().sharedWorkOptimization( Review comment: Can we put this additional step under a new flag (`true` by default)? ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java ## @@ -273,258 +287,332 @@ public ParseContext transform(ParseContext pctx) throws SemanticException { return pctx; } - private static boolean sharedWorkOptimization(ParseContext pctx, SharedWorkOptimizerCache optimizerCache, - ArrayListMultimap tableNameToOps, List> sortedTables, - boolean removeSemijoin) throws SemanticException { -// Boolean to keep track of whether this method actually merged any TS operators -boolean mergedExecuted = false; - -Multimap existingOps = ArrayListMultimap.create(); -Set> removedOps = new HashSet<>(); -for (Entry tablePair : sortedTables) { - String tableName = tablePair.getKey(); - for (TableScanOperator discardableTsOp : tableNameToOps.get(tableName)) { -if (removedOps.contains(discardableTsOp)) { - LOG.debug("Skip {} as it has already been removed", discardableTsOp); - continue; -} -Collection prevTsOps = existingOps.get(tableName); -for (TableScanOperator retainableTsOp : prevTsOps) { - if (removedOps.contains(retainableTsOp)) { -LOG.debug("Skip {} as it has already been removed", retainableTsOp); + private static class BaseSharedWorkOptimizer { Review comment: Can we add a clarifying comment to the new internal classes with the difference between both of them? ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java ## @@ -247,6 +249,18 @@ public ParseContext transform(ParseContext pctx) throws SemanticException { } } +if (LOG.isDebugEnabled()) { + LOG.debug("Before SharedWorkOptimizer #2:\n" + Operator.toString(pctx.getTopOps().values())); Review comment: This is the same as `After SharedWorkSJOptimizer`, no need to print it again. ## File path: ql/src/test/results/clientpositive/llap/annotate_stats_join_pkfk.q.out ## @@ -1191,14 +1191,6 @@ STAGE PLANS: sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE -Execution mode: vectorized, llap Review comment: I do not see projected columns, so it is difficult to confirm whether this is because of the schema. Is it? ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java ## @@ -247,6 +249,18 @@ public ParseContext transform(ParseContext pctx) throws SemanticException { } } +if (LOG.isDebugEnabled()) { + LOG.debug("Before SharedWorkOptimizer #2:\n" + Operator.toString(pctx.getTopOps().values())); +} + +// Execute shared work optimization +new BaseSharedWorkOptimizer().sharedWorkOptimization( +pctx, optimizerCache, tableNameToOps, sortedTables, false); + +if (LOG.isDebugEnabled()) { + LOG.debug("After SharedWorkOptimizer #2:\n" + Operator.toString(pctx.getTopOps().values())); Review comment: `After SharedWorkOptimizer merging TS schema`? ## File path: ql/src/test/results/clientpositive/llap/auto_join_reordering_values.q.out ## @@ -144,122 +144,30 @@ STAGE PLANS: tag: 0 value expressions: _col0 (type: int), _col2 (type: int), _col3 (type: int) auto parallelism: true -Execution mode: vectorized, llap -LLAP IO: no inputs -Path -> Alias: - A masked pattern was here -Path -> Partition: - A masked pattern was here -Partition - base file name: order
[GitHub] [hive] jcamachor merged pull request #1310: HIVE-23911: CBO fails when query has distinct in function and having clause
jcamachor merged pull request #1310: URL: https://github.com/apache/hive/pull/1310 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] dengzhhu653 edited a comment on pull request #1322: HIVE-23893/HIVE-14661,split filter into deterministic filter which will be push down and nondeterministic filter which keeps current
dengzhhu653 edited a comment on pull request #1322: URL: https://github.com/apache/hive/pull/1322#issuecomment-666864106 ... you can give a view on "Understanding Hive Branches" on the above link, and add some tests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] dengzhhu653 commented on pull request #1322: HIVE-23893/HIVE-14661,split filter into deterministic filter which will be push down and nondeterministic filter which keeps current positi
dengzhhu653 commented on pull request #1322: URL: https://github.com/apache/hive/pull/1322#issuecomment-666864106 ... you can give a view on "Understanding Hive Branches" on the above link This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] jcamachor commented on a change in pull request #1326: HIVE-23941: Refactor TypeCheckProcFactory to be database agnostic
jcamachor commented on a change in pull request #1326: URL: https://github.com/apache/hive/pull/1326#discussion_r463353432 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java ## @@ -868,17 +849,15 @@ protected T getXpathOrFuncExprNodeDesc(ASTNode node, String funcText = getFunctionText(node, isFunction); T expr; if (funcText.equals(".")) { -// "." : FIELD Expression - assert (children.size() == 2); // Only allow constant field name for now assert (exprFactory.isConstantExpr(children.get(1))); T object = children.get(0); - + Review comment: nit. whitespace (there are a few below too) ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/type/TypeCheckProcFactory.java ## @@ -1063,19 +1007,20 @@ protected T getXpathOrFuncExprNodeDesc(ASTNode node, if (numEntries == 1) { children.addAll(expressions.asMap().values().iterator().next()); funcText = "in"; -genericUDF = new GenericUDFIn(); +fi = exprFactory.getFunctionInfo("in"); } else { +FunctionInfo inFunctionInfo = exprFactory.getFunctionInfo("in"); for (Collection c : expressions.asMap().values()) { - newExprs.add( - exprFactory.createFuncCallExpr( - new GenericUDFIn(), "in", (List) c)); + newExprs.add(exprFactory.createFuncCallExpr(null, inFunctionInfo, + "in", (List) c)); } children.addAll(newExprs); funcText = "or"; -genericUDF = new GenericUDFOPOr(); +fi = exprFactory.getFunctionInfo("or"); + functionInfoChangedFromIn = true; Review comment: nit. indentation off here too ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/type/HiveFunctionHelper.java ## @@ -545,4 +551,67 @@ public RexNode foldExpression(RexNode expr) { return result.get(0); } + /** + * {@inheritDoc} + */ + @Override + public boolean isAndFunction(FunctionInfo fi) { +return fi.getGenericUDF() instanceof GenericUDFOPAnd; + } + + /** + * {@inheritDoc} + */ + @Override + public boolean isOrFunction(FunctionInfo fi) { +return fi.getGenericUDF() instanceof GenericUDFOPOr; + } + + /** + * {@inheritDoc} + */ + @Override + public boolean isInFunction(FunctionInfo fi) { +return fi.getGenericUDF() instanceof GenericUDFIn; + } + + /** + * {@inheritDoc} + */ + @Override + public boolean isCompareFunction(FunctionInfo fi) { +return fi.getGenericUDF() instanceof GenericUDFBaseCompare; + } + + /** + * {@inheritDoc} + */ + @Override + public boolean isEqualFunction(FunctionInfo fi) { +return fi.getGenericUDF() instanceof GenericUDFOPEqual +&& !(fi.getGenericUDF() instanceof GenericUDFOPEqualNS); + } + + /** + * {@inheritDoc} + */ + @Override + public boolean isConsistentWithinQuery(FunctionInfo fi) { +//TODO: don't getGenericUDF +return FunctionRegistry.isConsistentWithinQuery(fi.getGenericUDF()); + } + + /** + * {@inheritDoc} + */ + @Override + public boolean isStateful(FunctionInfo fi) { +//TODO: don't getGenericUDF Review comment: This TODO? Is there a follow-up? ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/type/FunctionHelper.java ## @@ -82,6 +82,20 @@ default RexNode foldExpression(RexNode expr) { return expr; } + boolean isAndFunction(FunctionInfo fi); Review comment: Would you mind to add a short comment to all these methods, even if may seem obvious for some of them? ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/HiveFunctionInfo.java ## @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hive.ql.exec; + +import org.apache.hadoop.hive.ql.udf.generic.GenericUDF; +import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF; +import org.apache.hadoop.hive.ql.udf.ptf.TableFunctionResolver; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * HiveFunctionInfo. Revie
[GitHub] [hive] pkumarsinha opened a new pull request #1343: HIVE-23960 : Partition with no column statistics leads to unbalanced …
pkumarsinha opened a new pull request #1343: URL: https://github.com/apache/hive/pull/1343 …calls to openTransaction/commitTransaction error during get_partitions_by_names ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvargacl commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvargacl commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463249932 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java ## @@ -1574,45 +1576,46 @@ public int compareTo(CompressedOwid other) { this.orcSplit = orcSplit; try { -final Path[] deleteDeltaDirs = getDeleteDeltaDirsFromSplit(orcSplit); -if (deleteDeltaDirs.length > 0) { +if (orcSplit.getDeltas().size() > 0) { AcidOutputFormat.Options orcSplitMinMaxWriteIds = AcidUtils.parseBaseOrDeltaBucketFilename(orcSplit.getPath(), conf); int totalDeleteEventCount = 0; - for (Path deleteDeltaDir : deleteDeltaDirs) { -if (!isQualifiedDeleteDeltaForSplit(orcSplitMinMaxWriteIds, deleteDeltaDir)) { - continue; -} -Path[] deleteDeltaFiles = OrcRawRecordMerger.getDeltaFiles(deleteDeltaDir, bucket, -new OrcRawRecordMerger.Options().isCompacting(false), null); -for (Path deleteDeltaFile : deleteDeltaFiles) { - try { -ReaderData readerData = getOrcTail(deleteDeltaFile, conf, cacheTag); -OrcTail orcTail = readerData.orcTail; -if (orcTail.getFooter().getNumberOfRows() <= 0) { - continue; // just a safe check to ensure that we are not reading empty delete files. -} -OrcRawRecordMerger.KeyInterval deleteKeyInterval = findDeleteMinMaxKeys(orcTail, deleteDeltaFile); -if (!deleteKeyInterval.isIntersects(keyInterval)) { - // If there is no intersection between data and delete delta, do not read delete file - continue; -} -// Reader can be reused if it was created before for getting orcTail: mostly for non-LLAP cache cases. -// For LLAP cases we need to create it here. -Reader deleteDeltaReader = readerData.reader != null ? readerData.reader : -OrcFile.createReader(deleteDeltaFile, OrcFile.readerOptions(conf)); -totalDeleteEventCount += deleteDeltaReader.getNumberOfRows(); -DeleteReaderValue deleteReaderValue = new DeleteReaderValue(deleteDeltaReader, -deleteDeltaFile, readerOptions, bucket, validWriteIdList, isBucketedTable, conf, -keyInterval, orcSplit); -DeleteRecordKey deleteRecordKey = new DeleteRecordKey(); -if (deleteReaderValue.next(deleteRecordKey)) { - sortMerger.put(deleteRecordKey, deleteReaderValue); -} else { - deleteReaderValue.close(); + for (AcidInputFormat.DeltaMetaData deltaMetaData : orcSplit.getDeltas()) { +for (Path deleteDeltaDir : deltaMetaData.getPaths(orcSplit.getRootDir())) { + if (!isQualifiedDeleteDeltaForSplit(orcSplitMinMaxWriteIds, deleteDeltaDir)) { +LOG.debug("Skipping delete delta dir {}", deleteDeltaDir); +continue; + } + for (AcidInputFormat.DeltaFileMetaData fileMetaData : deltaMetaData.getDeltaFiles()) { +Path deleteDeltaFile = fileMetaData.getPath(deleteDeltaDir, bucket); +try { + ReaderData readerData = getOrcTail(deleteDeltaFile, conf, cacheTag, fileMetaData.getFileId(deleteDeltaDir, bucket)); + OrcTail orcTail = readerData.orcTail; + if (orcTail.getFooter().getNumberOfRows() <= 0) { +continue; // just a safe check to ensure that we are not reading empty delete files. + } + OrcRawRecordMerger.KeyInterval deleteKeyInterval = findDeleteMinMaxKeys(orcTail, deleteDeltaFile); + if (!deleteKeyInterval.isIntersects(keyInterval)) { +// If there is no intersection between data and delete delta, do not read delete file +continue; + } + // Reader can be reused if it was created before for getting orcTail: mostly for non-LLAP cache cases. + // For LLAP cases we need to create it here. + Reader deleteDeltaReader = readerData.reader != null ? readerData.reader : OrcFile + .createReader(deleteDeltaFile, OrcFile.readerOptions(conf)); + totalDeleteEventCount += deleteDeltaReader.getNumberOfRows(); + DeleteReaderValue deleteReaderValue = + new DeleteReaderValue(deleteDeltaReader, deleteDeltaFile, readerOptions, bucket, validWriteIdList, + isBucketedTable, conf, keyInterval, orcSplit); + DeleteRecordKey deleteRecordKey = new DeleteRecordKey(); + if (deleteReaderValue.next(deleteRecordK
[GitHub] [hive] pvargacl commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvargacl commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463237211 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java ## @@ -680,14 +681,15 @@ public void setBaseAndInnerReader( * @param path The Orc file path we want to get the OrcTail for * @param conf The Configuration to access LLAP * @param cacheTag The cacheTag needed to get OrcTail from LLAP IO cache + * @param fileKey fileId of the Orc file (either the Long fileId of HDFS or the SyntheticFileId) * @return ReaderData object where the orcTail is not null. Reader can be null, but if we had to create * one we return that as well for further reuse. */ - private static ReaderData getOrcTail(Path path, Configuration conf, CacheTag cacheTag) throws IOException { + private static ReaderData getOrcTail(Path path, Configuration conf, CacheTag cacheTag, Object fileKey) throws IOException { Review comment: I am not sure, we will ever want to use it. without fileId. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463171159 ## File path: ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidInputFormat.java ## @@ -83,6 +94,34 @@ public void testDeltaMetaConstructWithState() throws Exception { assertThat(deltaMetaData.getStmtIds().get(2), is(99)); } + @Test + public void testDeltaMetaWithFile() throws Exception { Review comment: Test for all of the different serialization options This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463169065 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java ## @@ -1574,45 +1576,46 @@ public int compareTo(CompressedOwid other) { this.orcSplit = orcSplit; try { -final Path[] deleteDeltaDirs = getDeleteDeltaDirsFromSplit(orcSplit); -if (deleteDeltaDirs.length > 0) { +if (orcSplit.getDeltas().size() > 0) { AcidOutputFormat.Options orcSplitMinMaxWriteIds = AcidUtils.parseBaseOrDeltaBucketFilename(orcSplit.getPath(), conf); int totalDeleteEventCount = 0; - for (Path deleteDeltaDir : deleteDeltaDirs) { -if (!isQualifiedDeleteDeltaForSplit(orcSplitMinMaxWriteIds, deleteDeltaDir)) { - continue; -} -Path[] deleteDeltaFiles = OrcRawRecordMerger.getDeltaFiles(deleteDeltaDir, bucket, -new OrcRawRecordMerger.Options().isCompacting(false), null); -for (Path deleteDeltaFile : deleteDeltaFiles) { - try { -ReaderData readerData = getOrcTail(deleteDeltaFile, conf, cacheTag); -OrcTail orcTail = readerData.orcTail; -if (orcTail.getFooter().getNumberOfRows() <= 0) { - continue; // just a safe check to ensure that we are not reading empty delete files. -} -OrcRawRecordMerger.KeyInterval deleteKeyInterval = findDeleteMinMaxKeys(orcTail, deleteDeltaFile); -if (!deleteKeyInterval.isIntersects(keyInterval)) { - // If there is no intersection between data and delete delta, do not read delete file - continue; -} -// Reader can be reused if it was created before for getting orcTail: mostly for non-LLAP cache cases. -// For LLAP cases we need to create it here. -Reader deleteDeltaReader = readerData.reader != null ? readerData.reader : -OrcFile.createReader(deleteDeltaFile, OrcFile.readerOptions(conf)); -totalDeleteEventCount += deleteDeltaReader.getNumberOfRows(); -DeleteReaderValue deleteReaderValue = new DeleteReaderValue(deleteDeltaReader, -deleteDeltaFile, readerOptions, bucket, validWriteIdList, isBucketedTable, conf, -keyInterval, orcSplit); -DeleteRecordKey deleteRecordKey = new DeleteRecordKey(); -if (deleteReaderValue.next(deleteRecordKey)) { - sortMerger.put(deleteRecordKey, deleteReaderValue); -} else { - deleteReaderValue.close(); + for (AcidInputFormat.DeltaMetaData deltaMetaData : orcSplit.getDeltas()) { +for (Path deleteDeltaDir : deltaMetaData.getPaths(orcSplit.getRootDir())) { + if (!isQualifiedDeleteDeltaForSplit(orcSplitMinMaxWriteIds, deleteDeltaDir)) { +LOG.debug("Skipping delete delta dir {}", deleteDeltaDir); +continue; + } + for (AcidInputFormat.DeltaFileMetaData fileMetaData : deltaMetaData.getDeltaFiles()) { +Path deleteDeltaFile = fileMetaData.getPath(deleteDeltaDir, bucket); +try { + ReaderData readerData = getOrcTail(deleteDeltaFile, conf, cacheTag, fileMetaData.getFileId(deleteDeltaDir, bucket)); + OrcTail orcTail = readerData.orcTail; + if (orcTail.getFooter().getNumberOfRows() <= 0) { +continue; // just a safe check to ensure that we are not reading empty delete files. + } + OrcRawRecordMerger.KeyInterval deleteKeyInterval = findDeleteMinMaxKeys(orcTail, deleteDeltaFile); + if (!deleteKeyInterval.isIntersects(keyInterval)) { +// If there is no intersection between data and delete delta, do not read delete file +continue; + } + // Reader can be reused if it was created before for getting orcTail: mostly for non-LLAP cache cases. + // For LLAP cases we need to create it here. + Reader deleteDeltaReader = readerData.reader != null ? readerData.reader : OrcFile + .createReader(deleteDeltaFile, OrcFile.readerOptions(conf)); + totalDeleteEventCount += deleteDeltaReader.getNumberOfRows(); + DeleteReaderValue deleteReaderValue = + new DeleteReaderValue(deleteDeltaReader, deleteDeltaFile, readerOptions, bucket, validWriteIdList, + isBucketedTable, conf, keyInterval, orcSplit); + DeleteRecordKey deleteRecordKey = new DeleteRecordKey(); + if (deleteReaderValue.next(deleteRecordKey)
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463168246 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java ## @@ -1574,45 +1576,46 @@ public int compareTo(CompressedOwid other) { this.orcSplit = orcSplit; try { -final Path[] deleteDeltaDirs = getDeleteDeltaDirsFromSplit(orcSplit); -if (deleteDeltaDirs.length > 0) { +if (orcSplit.getDeltas().size() > 0) { AcidOutputFormat.Options orcSplitMinMaxWriteIds = AcidUtils.parseBaseOrDeltaBucketFilename(orcSplit.getPath(), conf); int totalDeleteEventCount = 0; - for (Path deleteDeltaDir : deleteDeltaDirs) { -if (!isQualifiedDeleteDeltaForSplit(orcSplitMinMaxWriteIds, deleteDeltaDir)) { - continue; -} -Path[] deleteDeltaFiles = OrcRawRecordMerger.getDeltaFiles(deleteDeltaDir, bucket, -new OrcRawRecordMerger.Options().isCompacting(false), null); -for (Path deleteDeltaFile : deleteDeltaFiles) { - try { -ReaderData readerData = getOrcTail(deleteDeltaFile, conf, cacheTag); -OrcTail orcTail = readerData.orcTail; -if (orcTail.getFooter().getNumberOfRows() <= 0) { - continue; // just a safe check to ensure that we are not reading empty delete files. -} -OrcRawRecordMerger.KeyInterval deleteKeyInterval = findDeleteMinMaxKeys(orcTail, deleteDeltaFile); -if (!deleteKeyInterval.isIntersects(keyInterval)) { - // If there is no intersection between data and delete delta, do not read delete file - continue; -} -// Reader can be reused if it was created before for getting orcTail: mostly for non-LLAP cache cases. -// For LLAP cases we need to create it here. -Reader deleteDeltaReader = readerData.reader != null ? readerData.reader : -OrcFile.createReader(deleteDeltaFile, OrcFile.readerOptions(conf)); -totalDeleteEventCount += deleteDeltaReader.getNumberOfRows(); -DeleteReaderValue deleteReaderValue = new DeleteReaderValue(deleteDeltaReader, -deleteDeltaFile, readerOptions, bucket, validWriteIdList, isBucketedTable, conf, -keyInterval, orcSplit); -DeleteRecordKey deleteRecordKey = new DeleteRecordKey(); -if (deleteReaderValue.next(deleteRecordKey)) { - sortMerger.put(deleteRecordKey, deleteReaderValue); -} else { - deleteReaderValue.close(); + for (AcidInputFormat.DeltaMetaData deltaMetaData : orcSplit.getDeltas()) { +for (Path deleteDeltaDir : deltaMetaData.getPaths(orcSplit.getRootDir())) { + if (!isQualifiedDeleteDeltaForSplit(orcSplitMinMaxWriteIds, deleteDeltaDir)) { Review comment: isQualifiedDeleteDelta basically reparses the delta dir. Can we prevent this? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463166514 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java ## @@ -680,14 +681,15 @@ public void setBaseAndInnerReader( * @param path The Orc file path we want to get the OrcTail for * @param conf The Configuration to access LLAP * @param cacheTag The cacheTag needed to get OrcTail from LLAP IO cache + * @param fileKey fileId of the Orc file (either the Long fileId of HDFS or the SyntheticFileId) * @return ReaderData object where the orcTail is not null. Reader can be null, but if we had to create * one we return that as well for further reuse. */ - private static ReaderData getOrcTail(Path path, Configuration conf, CacheTag cacheTag) throws IOException { + private static ReaderData getOrcTail(Path path, Configuration conf, CacheTag cacheTag, Object fileKey) throws IOException { Review comment: Maybe 2 different getOrcTail method on the LLAP IO interface? @szlta? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463163936 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java ## @@ -1165,8 +1168,14 @@ private static ParsedDelta parseDelta(Path path, String deltaPrefix, FileSystem throws IOException { ParsedDelta p = parsedDelta(path, deltaPrefix, fs, dirSnapshot); boolean isDeleteDelta = deltaPrefix.equals(DELETE_DELTA_PREFIX); +List files = new ArrayList<>(); +for (FileStatus fileStatus : dirSnapshot.getFiles()) { Review comment: Nit: maybe do it in java8 way? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463163160 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java ## @@ -1113,10 +1119,13 @@ else if(statementId != parsedDelta.statementId) { && (last.getMinWriteId() == parsedDelta.getMinWriteId()) && (last.getMaxWriteId() == parsedDelta.getMaxWriteId())) { last.getStmtIds().add(parsedDelta.getStatementId()); +for(HadoopShims.HdfsFileStatusWithId fileStatus : parsedDelta.getFiles()) { Review comment: Nit: space before for This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463159912 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java ## @@ -118,70 +123,183 @@ */ private long visibilityTxnId; +private List deltaFiles; + public DeltaMetaData() { - this(0,0,new ArrayList(), 0); + this(0, 0, new ArrayList<>(), 0, new ArrayList<>()); } /** + * @param minWriteId min writeId of the delta directory + * @param maxWriteId max writeId of the delta directory * @param stmtIds delta dir suffixes when a single txn writes > 1 delta in the same partition * @param visibilityTxnId maybe 0, if the dir name didn't have it. txnid:0 is always visible + * @param deltaFileStatuses bucketFiles in the directory */ -DeltaMetaData(long minWriteId, long maxWriteId, List stmtIds, long visibilityTxnId) { +DeltaMetaData(long minWriteId, long maxWriteId, List stmtIds, long visibilityTxnId, +List deltaFileStatuses) { this.minWriteId = minWriteId; this.maxWriteId = maxWriteId; if (stmtIds == null) { throw new IllegalArgumentException("stmtIds == null"); } this.stmtIds = stmtIds; this.visibilityTxnId = visibilityTxnId; + this.deltaFiles = new ArrayList<>(); + for(HadoopShims.HdfsFileStatusWithId fileStatus : deltaFileStatuses) { +deltaFiles.add(new DeltaFileMetaData(fileStatus)); + } } + long getMinWriteId() { return minWriteId; } + long getMaxWriteId() { return maxWriteId; } + List getStmtIds() { return stmtIds; } + long getVisibilityTxnId() { return visibilityTxnId; } + +public List getDeltaFiles() { + return deltaFiles; +} + @Override public void write(DataOutput out) throws IOException { out.writeLong(minWriteId); out.writeLong(maxWriteId); out.writeInt(stmtIds.size()); - for(Integer id : stmtIds) { + for (Integer id : stmtIds) { out.writeInt(id); } out.writeLong(visibilityTxnId); + out.writeInt(deltaFiles.size()); + for (DeltaFileMetaData fileMeta : deltaFiles) { +fileMeta.write(out); + } } + @Override public void readFields(DataInput in) throws IOException { minWriteId = in.readLong(); maxWriteId = in.readLong(); stmtIds.clear(); int numStatements = in.readInt(); - for(int i = 0; i < numStatements; i++) { + for (int i = 0; i < numStatements; i++) { stmtIds.add(in.readInt()); } visibilityTxnId = in.readLong(); + + deltaFiles.clear(); + int numFiles = in.readInt(); + for(int i = 0; i< numFiles; i++) { +DeltaFileMetaData file = new DeltaFileMetaData(); +file.readFields(in); +deltaFiles.add(file); + } } -String getName() { +private String getName() { assert stmtIds.isEmpty() : "use getName(int)"; return AcidUtils.addVisibilitySuffix(AcidUtils .deleteDeltaSubdir(minWriteId, maxWriteId), visibilityTxnId); } -String getName(int stmtId) { +private String getName(int stmtId) { assert !stmtIds.isEmpty() : "use getName()"; return AcidUtils.addVisibilitySuffix(AcidUtils .deleteDeltaSubdir(minWriteId, maxWriteId, stmtId), visibilityTxnId); } + +public List getPaths(Path root) { + if (stmtIds.isEmpty()) { +return Collections.singletonList(new Path(root, getName())); + } else { +// To support multistatement transactions we may have multiple directories corresponding to one DeltaMetaData +return getStmtIds().stream().map(stmtId -> new Path(root, getName(stmtId))).collect(Collectors.toList()); + } +} @Override public String toString() { return "Delta(?," + minWriteId + "," + maxWriteId + "," + stmtIds + "," + visibilityTxnId + ")"; } } + final class DeltaFileMetaData implements Writable { +private static final int HAS_LONG_FILEID_FLAG = 1; +private static final int HAS_ATTEMPTID_FLAG = 2; + +private long modTime; +private long length; +// Optional +private Integer attemptId; +// Optional +private Long fileId; + +public DeltaFileMetaData() { +} + +public DeltaFileMetaData(HadoopShims.HdfsFileStatusWithId fileStatus) { + modTime = fileStatus.getFileStatus().getModificationTime(); + length = fileStatus.getFileStatus().getLen(); + String attempt = AcidUtils.parseAttemptId(fileStatus.getFileStatus().getPath()); + attemptId = StringUtils.isEmpty(attempt) ? null : Integer.parseInt(attempt); + fileId = fileStatus.getFileId(); +} + +public DeltaFileMetaData(long modTime, long length, @Nullable Integer attemptId, @Nullable Long fileId) { + this.modTime = modTime; + this.length = length;
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463156735 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java ## @@ -118,70 +123,183 @@ */ private long visibilityTxnId; +private List deltaFiles; + public DeltaMetaData() { - this(0,0,new ArrayList(), 0); + this(0, 0, new ArrayList<>(), 0, new ArrayList<>()); } /** + * @param minWriteId min writeId of the delta directory + * @param maxWriteId max writeId of the delta directory * @param stmtIds delta dir suffixes when a single txn writes > 1 delta in the same partition * @param visibilityTxnId maybe 0, if the dir name didn't have it. txnid:0 is always visible + * @param deltaFileStatuses bucketFiles in the directory */ -DeltaMetaData(long minWriteId, long maxWriteId, List stmtIds, long visibilityTxnId) { +DeltaMetaData(long minWriteId, long maxWriteId, List stmtIds, long visibilityTxnId, +List deltaFileStatuses) { this.minWriteId = minWriteId; this.maxWriteId = maxWriteId; if (stmtIds == null) { throw new IllegalArgumentException("stmtIds == null"); } this.stmtIds = stmtIds; this.visibilityTxnId = visibilityTxnId; + this.deltaFiles = new ArrayList<>(); + for(HadoopShims.HdfsFileStatusWithId fileStatus : deltaFileStatuses) { +deltaFiles.add(new DeltaFileMetaData(fileStatus)); + } } + long getMinWriteId() { return minWriteId; } + long getMaxWriteId() { return maxWriteId; } + List getStmtIds() { return stmtIds; } + long getVisibilityTxnId() { return visibilityTxnId; } + +public List getDeltaFiles() { + return deltaFiles; +} + @Override public void write(DataOutput out) throws IOException { out.writeLong(minWriteId); out.writeLong(maxWriteId); out.writeInt(stmtIds.size()); - for(Integer id : stmtIds) { + for (Integer id : stmtIds) { out.writeInt(id); } out.writeLong(visibilityTxnId); + out.writeInt(deltaFiles.size()); + for (DeltaFileMetaData fileMeta : deltaFiles) { +fileMeta.write(out); + } } + @Override public void readFields(DataInput in) throws IOException { minWriteId = in.readLong(); maxWriteId = in.readLong(); stmtIds.clear(); int numStatements = in.readInt(); - for(int i = 0; i < numStatements; i++) { + for (int i = 0; i < numStatements; i++) { stmtIds.add(in.readInt()); } visibilityTxnId = in.readLong(); + + deltaFiles.clear(); + int numFiles = in.readInt(); + for(int i = 0; i< numFiles; i++) { +DeltaFileMetaData file = new DeltaFileMetaData(); +file.readFields(in); +deltaFiles.add(file); + } } -String getName() { +private String getName() { assert stmtIds.isEmpty() : "use getName(int)"; return AcidUtils.addVisibilitySuffix(AcidUtils .deleteDeltaSubdir(minWriteId, maxWriteId), visibilityTxnId); } -String getName(int stmtId) { +private String getName(int stmtId) { assert !stmtIds.isEmpty() : "use getName()"; return AcidUtils.addVisibilitySuffix(AcidUtils .deleteDeltaSubdir(minWriteId, maxWriteId, stmtId), visibilityTxnId); } + +public List getPaths(Path root) { + if (stmtIds.isEmpty()) { +return Collections.singletonList(new Path(root, getName())); + } else { +// To support multistatement transactions we may have multiple directories corresponding to one DeltaMetaData +return getStmtIds().stream().map(stmtId -> new Path(root, getName(stmtId))).collect(Collectors.toList()); + } +} @Override public String toString() { return "Delta(?," + minWriteId + "," + maxWriteId + "," + stmtIds + "," + visibilityTxnId + ")"; } } + final class DeltaFileMetaData implements Writable { Review comment: Nit: newline This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463156327 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java ## @@ -118,70 +123,183 @@ */ private long visibilityTxnId; +private List deltaFiles; + public DeltaMetaData() { - this(0,0,new ArrayList(), 0); + this(0, 0, new ArrayList<>(), 0, new ArrayList<>()); } /** Review comment: Nit: newline This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463151233 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java ## @@ -118,70 +123,183 @@ */ private long visibilityTxnId; +private List deltaFiles; + public DeltaMetaData() { - this(0,0,new ArrayList(), 0); + this(0, 0, new ArrayList<>(), 0, new ArrayList<>()); } /** + * @param minWriteId min writeId of the delta directory + * @param maxWriteId max writeId of the delta directory * @param stmtIds delta dir suffixes when a single txn writes > 1 delta in the same partition * @param visibilityTxnId maybe 0, if the dir name didn't have it. txnid:0 is always visible + * @param deltaFileStatuses bucketFiles in the directory */ -DeltaMetaData(long minWriteId, long maxWriteId, List stmtIds, long visibilityTxnId) { +DeltaMetaData(long minWriteId, long maxWriteId, List stmtIds, long visibilityTxnId, +List deltaFileStatuses) { this.minWriteId = minWriteId; this.maxWriteId = maxWriteId; if (stmtIds == null) { throw new IllegalArgumentException("stmtIds == null"); } this.stmtIds = stmtIds; this.visibilityTxnId = visibilityTxnId; + this.deltaFiles = new ArrayList<>(); + for(HadoopShims.HdfsFileStatusWithId fileStatus : deltaFileStatuses) { +deltaFiles.add(new DeltaFileMetaData(fileStatus)); + } } + long getMinWriteId() { return minWriteId; } + long getMaxWriteId() { return maxWriteId; } + List getStmtIds() { return stmtIds; } + long getVisibilityTxnId() { return visibilityTxnId; } + +public List getDeltaFiles() { + return deltaFiles; +} + @Override public void write(DataOutput out) throws IOException { out.writeLong(minWriteId); out.writeLong(maxWriteId); out.writeInt(stmtIds.size()); - for(Integer id : stmtIds) { + for (Integer id : stmtIds) { out.writeInt(id); } out.writeLong(visibilityTxnId); + out.writeInt(deltaFiles.size()); + for (DeltaFileMetaData fileMeta : deltaFiles) { +fileMeta.write(out); + } } + @Override public void readFields(DataInput in) throws IOException { minWriteId = in.readLong(); maxWriteId = in.readLong(); stmtIds.clear(); int numStatements = in.readInt(); - for(int i = 0; i < numStatements; i++) { + for (int i = 0; i < numStatements; i++) { stmtIds.add(in.readInt()); } visibilityTxnId = in.readLong(); + + deltaFiles.clear(); + int numFiles = in.readInt(); + for(int i = 0; i< numFiles; i++) { +DeltaFileMetaData file = new DeltaFileMetaData(); +file.readFields(in); +deltaFiles.add(file); + } } -String getName() { +private String getName() { assert stmtIds.isEmpty() : "use getName(int)"; return AcidUtils.addVisibilitySuffix(AcidUtils .deleteDeltaSubdir(minWriteId, maxWriteId), visibilityTxnId); } -String getName(int stmtId) { +private String getName(int stmtId) { assert !stmtIds.isEmpty() : "use getName()"; return AcidUtils.addVisibilitySuffix(AcidUtils .deleteDeltaSubdir(minWriteId, maxWriteId, stmtId), visibilityTxnId); } + +public List getPaths(Path root) { + if (stmtIds.isEmpty()) { +return Collections.singletonList(new Path(root, getName())); + } else { +// To support multistatement transactions we may have multiple directories corresponding to one DeltaMetaData +return getStmtIds().stream().map(stmtId -> new Path(root, getName(stmtId))).collect(Collectors.toList()); + } +} @Override Review comment: Nit: newline This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvary commented on a change in pull request #1339: URL: https://github.com/apache/hive/pull/1339#discussion_r463150602 ## File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java ## @@ -118,70 +123,183 @@ */ private long visibilityTxnId; +private List deltaFiles; + public DeltaMetaData() { - this(0,0,new ArrayList(), 0); + this(0, 0, new ArrayList<>(), 0, new ArrayList<>()); } /** + * @param minWriteId min writeId of the delta directory + * @param maxWriteId max writeId of the delta directory * @param stmtIds delta dir suffixes when a single txn writes > 1 delta in the same partition * @param visibilityTxnId maybe 0, if the dir name didn't have it. txnid:0 is always visible + * @param deltaFileStatuses bucketFiles in the directory */ -DeltaMetaData(long minWriteId, long maxWriteId, List stmtIds, long visibilityTxnId) { +DeltaMetaData(long minWriteId, long maxWriteId, List stmtIds, long visibilityTxnId, +List deltaFileStatuses) { this.minWriteId = minWriteId; this.maxWriteId = maxWriteId; if (stmtIds == null) { throw new IllegalArgumentException("stmtIds == null"); } this.stmtIds = stmtIds; this.visibilityTxnId = visibilityTxnId; + this.deltaFiles = new ArrayList<>(); + for(HadoopShims.HdfsFileStatusWithId fileStatus : deltaFileStatuses) { Review comment: Nit: maybe a space after the for keyword? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] zeroflag commented on a change in pull request #1323: HIVE-23723. Limit operator pushdown through LOJ (amagyar)
zeroflag commented on a change in pull request #1323: URL: https://github.com/apache/hive/pull/1323#discussion_r463125357 ## File path: ql/src/test/results/clientpositive/llap/limit_join_transpose.q.out ## @@ -108,7 +111,7 @@ limit 1 POSTHOOK: type: QUERY POSTHOOK: Input: default@src A masked pattern was here -0 val_0 0 val_0 +238val_238 238 val_238 Review comment: I'll try but I don't think it will help in this case. My understanding is that SORT_QUERY_RESULTS sorts the rows on the standard output on the client/driver side. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kgyrtkirk merged pull request #1303: HIVE-23915: Improve Github PR Template
kgyrtkirk merged pull request #1303: URL: https://github.com/apache/hive/pull/1303 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] risdenk opened a new pull request #1342: HIVE-23958: HiveServer2 should support additional keystore/truststore types besides JKS
risdenk opened a new pull request #1342: URL: https://github.com/apache/hive/pull/1342 Ensures that HS2 uses the default JDK keystore/truststore type when interacting with Thrift and Jetty. It isn't possible to add a test for this since FIPS compatible keystore/truststore needs special TLS provider and isn't shipped w/ the JDK. Using JKS or PKCS12 would work in either case since the JDK supports both in JDK 8 and 11. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kgyrtkirk opened a new pull request #1341: HIVE-23959 bulk stat removal
kgyrtkirk opened a new pull request #1341: URL: https://github.com/apache/hive/pull/1341 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] zeroflag commented on a change in pull request #1323: HIVE-23723. Limit operator pushdown through LOJ (amagyar)
zeroflag commented on a change in pull request #1323: URL: https://github.com/apache/hive/pull/1323#discussion_r463115806 ## File path: ql/src/test/results/clientpositive/llap/limit_join_transpose.q.out ## @@ -1207,7 +1223,6 @@ limit 1 offset 1 POSTHOOK: type: QUERY POSTHOOK: Input: default@src A masked pattern was here -0 val_0 0 val_0 Review comment: Good catch, it might be because there is an offset=1 belongs to the limit. The Limit is pushed through the LOJ, but the original one is also kept. The first limit emits 1 row, but the second emits 0 because of offset=1. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] maheshk114 opened a new pull request #1340: HIVE-23953 : Use task counter information to compute keycount during hashtable loading
maheshk114 opened a new pull request #1340: URL: https://github.com/apache/hive/pull/1340 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] zeroflag commented on a change in pull request #1323: HIVE-23723. Limit operator pushdown through LOJ (amagyar)
zeroflag commented on a change in pull request #1323: URL: https://github.com/apache/hive/pull/1323#discussion_r463107179 ## File path: ql/src/test/results/clientpositive/llap/input14_limit.q.out ## @@ -75,19 +75,19 @@ STAGE PLANS: Reducer 2 Execution mode: vectorized, llap Reduce Operator Tree: - Select Operator -expressions: VALUE._col0 (type: string), VALUE._col1 (type: string) -outputColumnNames: _col0, _col1 -Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE -Limit - Number of rows: 20 - Statistics: Num rows: 20 Data size: 3560 Basic stats: COMPLETE Column stats: COMPLETE - Top N Key Operator -sort order: + -keys: _col0 (type: string) -null sort order: a + Limit Review comment: I agree, opened HIVE-23957 about it.. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] maheshk114 merged pull request #1293: HIVE-23894: SubmitDag should not be retried incase of query cancel
maheshk114 merged pull request #1293: URL: https://github.com/apache/hive/pull/1293 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvargacl opened a new pull request #1339: HIVE-23956: Delete delta fileIds should be pushed execution
pvargacl opened a new pull request #1339: URL: https://github.com/apache/hive/pull/1339 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kgyrtkirk commented on pull request #1252: HIVE-23847: Extracting hive-parser module broke exec jar upload in tez
kgyrtkirk commented on pull request #1252: URL: https://github.com/apache/hive/pull/1252#issuecomment-666403793 we should get this in sooner than later - I've also bumped into this issue; and had to figure it out again...I didn't connected the dots This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kgyrtkirk merged pull request #1272: HIVE-23727: Improve SQLOperation log handling when canceling background
kgyrtkirk merged pull request #1272: URL: https://github.com/apache/hive/pull/1272 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] soumyakanti3578 commented on a change in pull request #1317: HIVE-23949: Introduce caching layer in HS2 to accelerate query compilation
soumyakanti3578 commented on a change in pull request #1317: URL: https://github.com/apache/hive/pull/1317#discussion_r463036109 ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -3665,33 +3678,39 @@ public boolean dropPartition(String dbName, String tableName, List parti * @return list of partition objects */ public List getPartitions(Table tbl) throws HiveException { -if (tbl.isPartitioned()) { - List tParts; - try { -GetPartitionsPsWithAuthRequest req = new GetPartitionsPsWithAuthRequest(); -req.setTblName(tbl.getTableName()); -req.setDbName(tbl.getDbName()); -req.setUserName(getUserName()); -req.setMaxParts((short) -1); -req.setGroupNames(getGroupNames()); -if (AcidUtils.isTransactionalTable(tbl)) { - ValidWriteIdList validWriteIdList = getValidWriteIdList(tbl.getDbName(), tbl.getTableName()); - req.setValidWriteIdList(validWriteIdList != null ? validWriteIdList.toString() : null); -} -GetPartitionsPsWithAuthResponse res = getMSC().listPartitionsWithAuthInfoRequest(req); -tParts = res.getPartitions(); +long t1 = System.nanoTime(); +try { + if (tbl.isPartitioned()) { +List tParts; +try { + GetPartitionsPsWithAuthRequest req = new GetPartitionsPsWithAuthRequest(); + req.setTblName(tbl.getTableName()); + req.setDbName(tbl.getDbName()); + req.setUserName(getUserName()); + req.setMaxParts((short) -1); + req.setGroupNames(getGroupNames()); + if (AcidUtils.isTransactionalTable(tbl)) { +ValidWriteIdList validWriteIdList = getValidWriteIdList(tbl.getDbName(), tbl.getTableName()); +req.setValidWriteIdList(validWriteIdList != null ? validWriteIdList.toString() : null); + } + GetPartitionsPsWithAuthResponse res = getMSC().listPartitionsWithAuthInfoRequest(req); + tParts = res.getPartitions(); - } catch (Exception e) { -LOG.error(StringUtils.stringifyException(e)); -throw new HiveException(e); - } - List parts = new ArrayList<>(tParts.size()); - for (org.apache.hadoop.hive.metastore.api.Partition tpart : tParts) { -parts.add(new Partition(tbl, tpart)); +} catch (Exception e) { + LOG.error(StringUtils.stringifyException(e)); + throw new HiveException(e); +} +List parts = new ArrayList<>(tParts.size()); +for (org.apache.hadoop.hive.metastore.api.Partition tpart : tParts) { + parts.add(new Partition(tbl, tpart)); +} +return parts; + } else { +return Collections.singletonList(new Partition(tbl)); } - return parts; -} else { - return Collections.singletonList(new Partition(tbl)); +} finally { Review comment: Since there are many instances for 2., I am moving all `PerfLogEnd` to `finally` block for now, which will probably be easier to maintain in the future, because of uniformity. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] soumyakanti3578 commented on a change in pull request #1317: HIVE-23949: Introduce caching layer in HS2 to accelerate query compilation
soumyakanti3578 commented on a change in pull request #1317: URL: https://github.com/apache/hive/pull/1317#discussion_r463036109 ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -3665,33 +3678,39 @@ public boolean dropPartition(String dbName, String tableName, List parti * @return list of partition objects */ public List getPartitions(Table tbl) throws HiveException { -if (tbl.isPartitioned()) { - List tParts; - try { -GetPartitionsPsWithAuthRequest req = new GetPartitionsPsWithAuthRequest(); -req.setTblName(tbl.getTableName()); -req.setDbName(tbl.getDbName()); -req.setUserName(getUserName()); -req.setMaxParts((short) -1); -req.setGroupNames(getGroupNames()); -if (AcidUtils.isTransactionalTable(tbl)) { - ValidWriteIdList validWriteIdList = getValidWriteIdList(tbl.getDbName(), tbl.getTableName()); - req.setValidWriteIdList(validWriteIdList != null ? validWriteIdList.toString() : null); -} -GetPartitionsPsWithAuthResponse res = getMSC().listPartitionsWithAuthInfoRequest(req); -tParts = res.getPartitions(); +long t1 = System.nanoTime(); +try { + if (tbl.isPartitioned()) { +List tParts; +try { + GetPartitionsPsWithAuthRequest req = new GetPartitionsPsWithAuthRequest(); + req.setTblName(tbl.getTableName()); + req.setDbName(tbl.getDbName()); + req.setUserName(getUserName()); + req.setMaxParts((short) -1); + req.setGroupNames(getGroupNames()); + if (AcidUtils.isTransactionalTable(tbl)) { +ValidWriteIdList validWriteIdList = getValidWriteIdList(tbl.getDbName(), tbl.getTableName()); +req.setValidWriteIdList(validWriteIdList != null ? validWriteIdList.toString() : null); + } + GetPartitionsPsWithAuthResponse res = getMSC().listPartitionsWithAuthInfoRequest(req); + tParts = res.getPartitions(); - } catch (Exception e) { -LOG.error(StringUtils.stringifyException(e)); -throw new HiveException(e); - } - List parts = new ArrayList<>(tParts.size()); - for (org.apache.hadoop.hive.metastore.api.Partition tpart : tParts) { -parts.add(new Partition(tbl, tpart)); +} catch (Exception e) { + LOG.error(StringUtils.stringifyException(e)); + throw new HiveException(e); +} +List parts = new ArrayList<>(tParts.size()); +for (org.apache.hadoop.hive.metastore.api.Partition tpart : tParts) { + parts.add(new Partition(tbl, tpart)); +} +return parts; + } else { +return Collections.singletonList(new Partition(tbl)); } - return parts; -} else { - return Collections.singletonList(new Partition(tbl)); +} finally { Review comment: Since there are many instances for 2., I am moving all `PerfLogEnd` to `finally` block for now, which will probably be easier to maintain in the future. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] zeroflag opened a new pull request #1338: HIVE-23937. Take null ordering into consideration when pushing TNK through inner joins (amagyar)
zeroflag opened a new pull request #1338: URL: https://github.com/apache/hive/pull/1338 Work in progress ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] jcamachor merged pull request #1305: HIVE-23892: Remove interpretation for character RexLiteral
jcamachor merged pull request #1305: URL: https://github.com/apache/hive/pull/1305 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] letsflyinthesky commented on pull request #1322: HIVE-23893/HIVE-14661,split filter into deterministic filter which will be push down and nondeterministic filter which keeps current po
letsflyinthesky commented on pull request #1322: URL: https://github.com/apache/hive/pull/1322#issuecomment-666349043 May you give me some advice about why to remove. My commit could don't resolve the problem or the way I the commi dose not conform to regulation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] dengzhhu653 commented on pull request #1322: HIVE-23893/HIVE-14661,split filter into deterministic filter which will be push down and nondeterministic filter which keeps current positi
dengzhhu653 commented on pull request #1322: URL: https://github.com/apache/hive/pull/1322#issuecomment-666340785 Hey @letsflyinthesky, you can move your fixes upon the master branch. There are some detailed useful guidelines of how to contribute on https://cwiki.apache.org/confluence/display/Hive/HowToContribute. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] zabetak commented on pull request #1315: [HIVE-23951] Support parameterized queries in WHERE/HAVING clause
zabetak commented on pull request #1315: URL: https://github.com/apache/hive/pull/1315#issuecomment-666334140 > @jcamachor @zabetak I was thinking of changing all the existing tpcds queries to replace literals with `?` and run explain on tpcds queries (using TestTezPerfCliDriver). This should provide us with some good test coverage. What I am undecided on is if it worth pushing it to hive repo and have a separate clidriver testing execute/prepare for tpcds. What do you suggest? If we want to avoid regressions I guess having the queries in the repo makes sense. Apart from that I guess that not all of the tpcds queries are needed to ensure code coverage. Possibly after testing you will end-up with a subset that is sufficient. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] kasakrisz commented on a change in pull request #1323: HIVE-23723. Limit operator pushdown through LOJ (amagyar)
kasakrisz commented on a change in pull request #1323: URL: https://github.com/apache/hive/pull/1323#discussion_r462852316 ## File path: ql/src/test/results/clientpositive/llap/explainuser_1.q.out ## @@ -124,27 +124,25 @@ Stage-3 <-Reducer 2 [SIMPLE_EDGE] llap File Output Operator [FS_7] table:{"name:":"default.src_orc_merge_test_part_n1"} - Select Operator [SEL_6] (rows=100 width=95) + Select Operator [SEL_4] (rows=100 width=178) Output:["_col0","_col1"] Limit [LIM_5] (rows=100 width=178) Number of rows:100 - Select Operator [SEL_4] (rows=100 width=178) -Output:["_col0","_col1"] - <-Map 1 [CUSTOM_SIMPLE_EDGE] llap -PARTITION_ONLY_SHUFFLE [RS_3] - Limit [LIM_2] (rows=100 width=178) -Number of rows:100 -Select Operator [SEL_1] (rows=500 width=178) - Output:["_col0","_col1"] - TableScan [TS_0] (rows=500 width=178) - default@src,src,Tbl:COMPLETE,Col:COMPLETE,Output:["key","value"] +<-Map 1 [CUSTOM_SIMPLE_EDGE] llap + PARTITION_ONLY_SHUFFLE [RS_3] +Limit [LIM_2] (rows=100 width=178) Review comment: Why couldn't be pushed through`[SEL_1]` ? ## File path: ql/src/test/results/clientpositive/llap/limit_join_transpose.q.out ## @@ -108,7 +111,7 @@ limit 1 POSTHOOK: type: QUERY POSTHOOK: Input: default@src A masked pattern was here -0 val_0 0 val_0 +238val_238 238 val_238 Review comment: Does `-- SORT_QUERY_RESULTS` help avoid resultset change? ## File path: ql/src/test/results/clientpositive/llap/input14_limit.q.out ## @@ -75,19 +75,19 @@ STAGE PLANS: Reducer 2 Execution mode: vectorized, llap Reduce Operator Tree: - Select Operator -expressions: VALUE._col0 (type: string), VALUE._col1 (type: string) -outputColumnNames: _col0, _col1 -Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE -Limit - Number of rows: 20 - Statistics: Num rows: 20 Data size: 3560 Basic stats: COMPLETE Column stats: COMPLETE - Top N Key Operator -sort order: + -keys: _col0 (type: string) -null sort order: a + Limit Review comment: It is worth investigating why we end up with a plan where the parent of a `TNK` is a `Limit` but that can be a follow-up patch since it is out of scope of this patch. ## File path: ql/src/test/results/clientpositive/llap/limit_join_transpose.q.out ## @@ -1207,7 +1223,6 @@ limit 1 offset 1 POSTHOOK: type: QUERY POSTHOOK: Input: default@src A masked pattern was here -0 val_0 0 val_0 Review comment: Result set disappeared. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] aasha commented on a change in pull request #1329: HIVE-23897 : Create a common Retry Interface for replication
aasha commented on a change in pull request #1329: URL: https://github.com/apache/hive/pull/1329#discussion_r462835685 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/util/Retryable.java ## @@ -0,0 +1,212 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hive.ql.exec.util; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.utils.SecurityUtils; +import org.apache.hadoop.security.UserGroupInformation; + +import java.security.PrivilegedExceptionAction; +import java.util.ArrayList; +import java.util.List; +import java.util.Random; +import java.util.concurrent.Callable; +import java.util.concurrent.TimeUnit; + +/** + * Class to implement any retry logic in case of exceptions. + */ +public class Retryable { + private static long MINIMUM_DELAY_IN_SEC = 1; + + private long totalDurationInSeconds; + private List> retryOn; + private List> failOn; + private long initialDelayInSeconds; + private long maxRetryDelayInSeconds; + private double backOff; + private int maxJitterInSeconds; + + private Retryable() { +this.retryOn = new ArrayList<>(); +this.failOn = new ArrayList<>(); +this.initialDelayInSeconds = HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_INTIAL_DELAY.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_INTIAL_DELAY), TimeUnit.SECONDS); +this.maxRetryDelayInSeconds = HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_MAX_DELAY_BETWEEN_RETRIES.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_MAX_DELAY_BETWEEN_RETRIES), TimeUnit.SECONDS); +this.backOff = HiveConf.ConfVars.REPL_RETRY_BACKOFF_COEFFICIENT.defaultFloatVal; +this.maxJitterInSeconds = (int) HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_JITTER.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_JITTER), TimeUnit.SECONDS); +this.totalDurationInSeconds = HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_TOTAL_DURATION.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_TOTAL_DURATION), TimeUnit.SECONDS);; + } + + public static Builder builder() { +return new Builder(); + } + + public T executeCallable(Callable callable) throws Exception { +long startTime = System.currentTimeMillis(); +long delay = this.initialDelayInSeconds; +Exception currentCapturedException = null; +while(true) { + try { +if (UserGroupInformation.isSecurityEnabled()) { + SecurityUtils.reloginExpiringKeytabUser(); + return UserGroupInformation.getLoginUser().doAs((PrivilegedExceptionAction) () -> callable.call()); +} else { + return callable.call(); +} + } catch (Exception e) { +if (this.failOn.stream().noneMatch(k -> e.getClass().equals(k)) + && this.retryOn.stream().anyMatch(k -> e.getClass().isAssignableFrom(k))) { + if (elapsedTimeInSeconds(startTime) + delay > this.totalDurationInSeconds) { +// case where waiting would go beyond max duration. So throw exception and return +throw e; + } + sleep(delay); + //retry case. compute next sleep time + delay = getNextDelay(delay, currentCapturedException, e); + // reset current captured exception. + currentCapturedException = e; +} else { + // Exception cannot be retried on. Throw exception and return + throw e; +} + } +} + } + + private void sleep(long seconds) { +try { + Thread.sleep(seconds * 1000); +} catch (InterruptedException e) { + // no-op.. just proceed +} + } + + private long getNextDelay(long currentDelay, final Exception previousException, final Exception currentException) { +if (previousException != null && !previousException.getClass().equals(currentException.getClass())) { + // New exception encountered. Returning initial delay for next retry. + return this.initialDelayInSeconds; +} + +if (currentDelay <= 0) { // in case initial delay was set to 0. + currentDelay = MINIMUM_DELAY_IN_SEC; +} + +currentDelay *= this.backOff; +
[GitHub] [hive] aasha commented on a change in pull request #1329: HIVE-23897 : Create a common Retry Interface for replication
aasha commented on a change in pull request #1329: URL: https://github.com/apache/hive/pull/1329#discussion_r462835685 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/util/Retryable.java ## @@ -0,0 +1,212 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hive.ql.exec.util; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.utils.SecurityUtils; +import org.apache.hadoop.security.UserGroupInformation; + +import java.security.PrivilegedExceptionAction; +import java.util.ArrayList; +import java.util.List; +import java.util.Random; +import java.util.concurrent.Callable; +import java.util.concurrent.TimeUnit; + +/** + * Class to implement any retry logic in case of exceptions. + */ +public class Retryable { + private static long MINIMUM_DELAY_IN_SEC = 1; + + private long totalDurationInSeconds; + private List> retryOn; + private List> failOn; + private long initialDelayInSeconds; + private long maxRetryDelayInSeconds; + private double backOff; + private int maxJitterInSeconds; + + private Retryable() { +this.retryOn = new ArrayList<>(); +this.failOn = new ArrayList<>(); +this.initialDelayInSeconds = HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_INTIAL_DELAY.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_INTIAL_DELAY), TimeUnit.SECONDS); +this.maxRetryDelayInSeconds = HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_MAX_DELAY_BETWEEN_RETRIES.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_MAX_DELAY_BETWEEN_RETRIES), TimeUnit.SECONDS); +this.backOff = HiveConf.ConfVars.REPL_RETRY_BACKOFF_COEFFICIENT.defaultFloatVal; +this.maxJitterInSeconds = (int) HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_JITTER.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_JITTER), TimeUnit.SECONDS); +this.totalDurationInSeconds = HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_TOTAL_DURATION.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_TOTAL_DURATION), TimeUnit.SECONDS);; + } + + public static Builder builder() { +return new Builder(); + } + + public T executeCallable(Callable callable) throws Exception { +long startTime = System.currentTimeMillis(); +long delay = this.initialDelayInSeconds; +Exception currentCapturedException = null; +while(true) { + try { +if (UserGroupInformation.isSecurityEnabled()) { + SecurityUtils.reloginExpiringKeytabUser(); + return UserGroupInformation.getLoginUser().doAs((PrivilegedExceptionAction) () -> callable.call()); +} else { + return callable.call(); +} + } catch (Exception e) { +if (this.failOn.stream().noneMatch(k -> e.getClass().equals(k)) + && this.retryOn.stream().anyMatch(k -> e.getClass().isAssignableFrom(k))) { + if (elapsedTimeInSeconds(startTime) + delay > this.totalDurationInSeconds) { +// case where waiting would go beyond max duration. So throw exception and return +throw e; + } + sleep(delay); + //retry case. compute next sleep time + delay = getNextDelay(delay, currentCapturedException, e); + // reset current captured exception. + currentCapturedException = e; +} else { + // Exception cannot be retried on. Throw exception and return + throw e; +} + } +} + } + + private void sleep(long seconds) { +try { + Thread.sleep(seconds * 1000); +} catch (InterruptedException e) { + // no-op.. just proceed +} + } + + private long getNextDelay(long currentDelay, final Exception previousException, final Exception currentException) { +if (previousException != null && !previousException.getClass().equals(currentException.getClass())) { + // New exception encountered. Returning initial delay for next retry. + return this.initialDelayInSeconds; +} + +if (currentDelay <= 0) { // in case initial delay was set to 0. + currentDelay = MINIMUM_DELAY_IN_SEC; +} + +currentDelay *= this.backOff; +
[GitHub] [hive] vineetgarg02 commented on pull request #1315: [HIVE-23951] Support parameterized queries in WHERE/HAVING clause
vineetgarg02 commented on pull request #1315: URL: https://github.com/apache/hive/pull/1315#issuecomment-665947086 @jcamachor @zabetak I was thinking of changing all the existing tpcds queries to replace literals with `?` and run explain on tpcds queries (using TestTezPerfCliDriver). This should provide us with some good test coverage. What I am undecided on is if it worth pushing it to hive repo and have a separate clidriver testing execute/prepare for tpcds. What do you suggest? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] sam-an-cloudera commented on a change in pull request #1221: HIVE-23786: HMS server side filter with Ranger
sam-an-cloudera commented on a change in pull request #1221: URL: https://github.com/apache/hive/pull/1221#discussion_r462496195 ## File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java ## @@ -312,5 +558,9 @@ private String getCurrentUser() { private String getCurrentUser(HiveMetaStoreAuthorizableEvent authorizableEvent) { return authorizableEvent.getAuthzContext().getUGI().getShortUserName(); } + + private UserGroupInformation getUGI() throws IOException { + return UserGroupInformation.getCurrentUser(); Review comment: all corrected. How did you spot them so quickly? I wish I have a tool to find out before I submit. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] jcamachor commented on a change in pull request #1317: HIVE-23949: Introduce caching layer in HS2 to accelerate query compilation
jcamachor commented on a change in pull request #1317: URL: https://github.com/apache/hive/pull/1317#discussion_r462584582 ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -3665,33 +3678,39 @@ public boolean dropPartition(String dbName, String tableName, List parti * @return list of partition objects */ public List getPartitions(Table tbl) throws HiveException { -if (tbl.isPartitioned()) { - List tParts; - try { -GetPartitionsPsWithAuthRequest req = new GetPartitionsPsWithAuthRequest(); -req.setTblName(tbl.getTableName()); -req.setDbName(tbl.getDbName()); -req.setUserName(getUserName()); -req.setMaxParts((short) -1); -req.setGroupNames(getGroupNames()); -if (AcidUtils.isTransactionalTable(tbl)) { - ValidWriteIdList validWriteIdList = getValidWriteIdList(tbl.getDbName(), tbl.getTableName()); - req.setValidWriteIdList(validWriteIdList != null ? validWriteIdList.toString() : null); -} -GetPartitionsPsWithAuthResponse res = getMSC().listPartitionsWithAuthInfoRequest(req); -tParts = res.getPartitions(); +long t1 = System.nanoTime(); +try { + if (tbl.isPartitioned()) { +List tParts; +try { + GetPartitionsPsWithAuthRequest req = new GetPartitionsPsWithAuthRequest(); + req.setTblName(tbl.getTableName()); + req.setDbName(tbl.getDbName()); + req.setUserName(getUserName()); + req.setMaxParts((short) -1); + req.setGroupNames(getGroupNames()); + if (AcidUtils.isTransactionalTable(tbl)) { +ValidWriteIdList validWriteIdList = getValidWriteIdList(tbl.getDbName(), tbl.getTableName()); +req.setValidWriteIdList(validWriteIdList != null ? validWriteIdList.toString() : null); + } + GetPartitionsPsWithAuthResponse res = getMSC().listPartitionsWithAuthInfoRequest(req); + tParts = res.getPartitions(); - } catch (Exception e) { -LOG.error(StringUtils.stringifyException(e)); -throw new HiveException(e); - } - List parts = new ArrayList<>(tParts.size()); - for (org.apache.hadoop.hive.metastore.api.Partition tpart : tParts) { -parts.add(new Partition(tbl, tpart)); +} catch (Exception e) { + LOG.error(StringUtils.stringifyException(e)); + throw new HiveException(e); +} +List parts = new ArrayList<>(tParts.size()); +for (org.apache.hadoop.hive.metastore.api.Partition tpart : tParts) { + parts.add(new Partition(tbl, tpart)); +} +return parts; + } else { +return Collections.singletonList(new Partition(tbl)); } - return parts; -} else { - return Collections.singletonList(new Partition(tbl)); +} finally { Review comment: I did not catch this comment in the review. I was taking a look at the code and I think there are two problems with current approach (sorry about the back and forth). 1) Currently we are logging information twice (PerfLogger already has its own logging mechanism when logging is set to debug level). We should remove the logging lines outside of it. 2) I think we should have left the PerfLogEnd in the `finally` block. I saw in some methods that we catch an exception and ignore it, which could lead to leaks since we do not call PerfLogEnd. Even if currently this may not be a problem, it would be difficult to verify that nothing changes in the future there. I think the safest path is just to make that call in the `finally` block to make sure it is always called. ## File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ## @@ -3665,33 +3678,39 @@ public boolean dropPartition(String dbName, String tableName, List parti * @return list of partition objects */ public List getPartitions(Table tbl) throws HiveException { -if (tbl.isPartitioned()) { - List tParts; - try { -GetPartitionsPsWithAuthRequest req = new GetPartitionsPsWithAuthRequest(); -req.setTblName(tbl.getTableName()); -req.setDbName(tbl.getDbName()); -req.setUserName(getUserName()); -req.setMaxParts((short) -1); -req.setGroupNames(getGroupNames()); -if (AcidUtils.isTransactionalTable(tbl)) { - ValidWriteIdList validWriteIdList = getValidWriteIdList(tbl.getDbName(), tbl.getTableName()); - req.setValidWriteIdList(validWriteIdList != null ? validWriteIdList.toString() : null); -} -GetPartitionsPsWithAuthResponse res = getMSC().listPartitionsWithAuthInfoRequest(req); -tParts = res.getPartitions(); +long t1 = System.nanoTime(); +try { + if (tbl.isPartitioned()) { +List tParts; +try { + GetPartitionsPsWithAuthRequest req = new GetPartitionsPsWithAuthRequest(); + req.setTblName(
[GitHub] [hive] aasha commented on a change in pull request #1329: HIVE-23897 : Create a common Retry Interface for replication
aasha commented on a change in pull request #1329: URL: https://github.com/apache/hive/pull/1329#discussion_r462697552 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/util/Retryable.java ## @@ -0,0 +1,212 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hive.ql.exec.util; + +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.utils.SecurityUtils; +import org.apache.hadoop.security.UserGroupInformation; + +import java.security.PrivilegedExceptionAction; +import java.util.ArrayList; +import java.util.List; +import java.util.Random; +import java.util.concurrent.Callable; +import java.util.concurrent.TimeUnit; + +/** + * Class to implement any retry logic in case of exceptions. + */ +public class Retryable { + private static long MINIMUM_DELAY_IN_SEC = 1; + + private long totalDurationInSeconds; + private List> retryOn; + private List> failOn; + private long initialDelayInSeconds; + private long maxRetryDelayInSeconds; + private double backOff; + private int maxJitterInSeconds; + + private Retryable() { +this.retryOn = new ArrayList<>(); +this.failOn = new ArrayList<>(); +this.initialDelayInSeconds = HiveConf.toTime(HiveConf.ConfVars.REPL_RETRY_INTIAL_DELAY.defaultStrVal, + HiveConf.getDefaultTimeUnit(HiveConf.ConfVars.REPL_RETRY_INTIAL_DELAY), TimeUnit.SECONDS); Review comment: Didn't get this comment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] HunterL edited a comment on pull request #1324: HIVE-23939: SharedWorkOptimizer: take the union of columns in mergeable TableScans
HunterL edited a comment on pull request #1324: URL: https://github.com/apache/hive/pull/1324#issuecomment-665824148 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] soumyakanti3578 commented on a change in pull request #1317: HIVE-23949: Introduce caching layer in HS2 to accelerate query compilation
soumyakanti3578 commented on a change in pull request #1317: URL: https://github.com/apache/hive/pull/1317#discussion_r462602014 ## File path: standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientWithLocalCache.java ## @@ -0,0 +1,264 @@ +package org.apache.hadoop.hive.metastore; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hive.metastore.api.MetaException; +import org.apache.hadoop.hive.metastore.api.PartitionsByExprRequest; +import org.apache.hadoop.hive.metastore.api.PartitionsByExprResult; +import org.apache.hadoop.hive.metastore.api.PartitionsSpecByExprResult; +import org.apache.hadoop.hive.metastore.conf.MetastoreConf; +import org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator; +import org.apache.hadoop.hive.ql.util.IncrementalObjectSizeEstimator.ObjectEstimator; +import org.apache.thrift.TException; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.Objects; + +public class HiveMetaStoreClientWithLocalCache extends HiveMetaStoreClient { + + private static Cache mscLocalCache = null; + //TODO: initialize in the init method + private static boolean IS_CACHE_ENABLED; + private static long MAX_SIZE; + private static boolean RECORD_STATS; + private static HashMap, ObjectEstimator> sizeEstimator = null; + private static String cacheObjName = null; + + public static synchronized void init() { +if (mscLocalCache != null) return; // init cache only once +Configuration metaConf = MetastoreConf.newMetastoreConf(); +LOG.debug("Initializing local cache in HiveMetaStoreClient..."); +MAX_SIZE = MetastoreConf.getSizeVar(metaConf, MetastoreConf.ConfVars.MSC_CACHE_MAX_SIZE); +IS_CACHE_ENABLED = MetastoreConf.getBoolVar(metaConf, MetastoreConf.ConfVars.MSC_CACHE_ENABLED); +RECORD_STATS = MetastoreConf.getBoolVar(metaConf, MetastoreConf.ConfVars.MSC_CACHE_RECORD_STATS); +initSizeEstimator(); +initCache(); +LOG.debug("Local cache initialized in HiveMetaStoreClient: " + mscLocalCache); + } + + public HiveMetaStoreClientWithLocalCache(Configuration conf) throws MetaException { +this(conf, null, true); + } + + public HiveMetaStoreClientWithLocalCache(Configuration conf, HiveMetaHookLoader hookLoader) throws MetaException { +this(conf, hookLoader, true); + } + + public HiveMetaStoreClientWithLocalCache(Configuration conf, HiveMetaHookLoader hookLoader, Boolean allowEmbedded) throws MetaException { +super(conf, hookLoader, allowEmbedded); + } + + private static void initSizeEstimator() { +sizeEstimator = new HashMap<>(); +IncrementalObjectSizeEstimator.createEstimators(CacheKey.class, sizeEstimator); +Arrays.stream(KeyType.values()).forEach(e -> { + IncrementalObjectSizeEstimator.createEstimators(e.keyClass, sizeEstimator); + IncrementalObjectSizeEstimator.createEstimators(e.valueClass, sizeEstimator);} +); + } + + /** + * KeyType is used to differentiate the request types. More types can be added in future. + */ + public enum KeyType { +PARTITIONS_BY_EXPR(PartitionsByExprRequest.class, PartitionsByExprResult.class), +PARTITIONS_SPEC_BY_EXPR(PartitionsByExprRequest.class, PartitionsSpecByExprResult.class); + +private final Class keyClass; +private final Class valueClass; + +KeyType(Class keyClass, Class valueClass) { + this.keyClass = keyClass; + this.valueClass = valueClass; +} + } + + /** + * CacheKey objects are used as key for the cache. + */ + public static class CacheKey{ +KeyType IDENTIFIER; +Object obj; + +public CacheKey(KeyType IDENTIFIER, Object obj) { + this.IDENTIFIER = IDENTIFIER; + this.obj = obj; +} + +@Override +public boolean equals(Object o) { + if (this == o) { +return true; + } + if (o == null || getClass() != o.getClass()) { +return false; + } + CacheKey cacheKey = (CacheKey) o; + return IDENTIFIER == cacheKey.IDENTIFIER && + Objects.equals(obj, cacheKey.obj); +} + +@Override +public int hashCode() { + return Objects.hash(IDENTIFIER, obj); +} + } + + private static int getWeight(CacheKey key, Object val) { +ObjectEstimator keySizeEstimator = sizeEstimator.get(key.getClass()); +ObjectEstimator valSizeEstimator = sizeEstimator.get(key.IDENTIFIER.valueClass); +int keySize = keySizeEstimator.estimate(key, sizeEstimator); +int valSize = valSizeEstimator.estimate(val, sizeEstimator); +if (LOG.isDebugEnabled()) { + LOG.debug("Cache entry weight - key: {}, value: {}, total: {}", keySize, valSize, keySize + valSize); +} +return keySize + valSize; + } + + private Object load(CacheKey key) { +try { + return getResultObject(key); +} catch (TException e) { + throw ne
[GitHub] [hive] HunterL commented on pull request #1324: HIVE-23939: SharedWorkOptimizer: take the union of columns in mergeable TableScans
HunterL commented on pull request #1324: URL: https://github.com/apache/hive/pull/1324#issuecomment-665824148 So looking through the Q tests (specifically `auto_join_reordering_values.q.out`) we have `explain select s.s_store_sk from store_n0 s join store_sales_n0 ss on (s.s_store_sk = ss.ss_store_sk) join store_n0 s1 on (s1.s_store_sk = ss.ss_store_sk) where s.s_floor_space > 1000` In the explain plan this produces `filterExpr: (((s_floor_space > 1000) and s_store_sk is not null) or s_store_sk is not null)` This filter expression can be reduced down to `s_store_sk is not null` e.g. ([P && Q] || Q) => Q This isn't correct since its going to ignore the `where s.s_floor_space > 1000` It seems to me that the correct behavior would be to either... 1. Continue to combine them naively but with an `and` instead of an `or` and allow something downstream to optimize it out. e.g. ([P && Q] && Q) which will get reduced down to (P && Q) 2. Optimize the statement here so that when they are combined you just get (P && Q) There are similar issues in other Q tests but I chose this one for simplicity, if this isn't clear I'd be happy to clarify more. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pkumarsinha commented on a change in pull request #1329: HIVE-23897 : Create a common Retry Interface for replication
pkumarsinha commented on a change in pull request #1329: URL: https://github.com/apache/hive/pull/1329#discussion_r462125753 ## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -617,6 +617,22 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal "Name of the source cluster for the replication."), REPL_TARGET_CLUSTER_NAME("hive.repl.target.cluster.name", null, "Name of the target cluster for the replication."), +REPL_RETRY_INTIAL_DELAY("hive.repl.retry.initial.delay", "60s", + new TimeValidator(TimeUnit.SECONDS), + "Initial Delay before retry starts."), +REPL_RETRY_BACKOFF_COEFFICIENT("hive.repl.retry.backoff.coefficient", 1.2f, + "The backoff coefficient for exponential retry delay between retries. " + +"Previous Delay * Backoff Coefficient will determine the next retry interval"), +REPL_RETRY_JITTER("hive.repl.retry.jitter", "30s", new TimeValidator(TimeUnit.SECONDS), + "A random jitter to be applied to avoid all retries happening at the same time."), + REPL_RETRY_MAX_DELAY_BETWEEN_RETRIES("hive.repl.retry.max.delay.between.retries", "60m", + new TimeValidator(TimeUnit.MINUTES), + "Maximum allowed retry delay in seconds after including exponential backoff. " + Review comment: Default is in minutes but the comment is in second. Should we change this? ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/atlas/RetryingClientTimeBased.java ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hive.ql.exec.repl.atlas; + +import com.sun.jersey.api.client.UniformInterfaceException; +import org.apache.atlas.AtlasServiceException; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Random; +import java.util.concurrent.Callable; + +/** + * Implement retry logic for service calls. + */ +public class RetryingClientTimeBased { + private static long MINIMUM_DELAY_IN_SEC = 1; + private static final Logger LOG = LoggerFactory.getLogger(RetryingClientTimeBased.class); + private static final String ERROR_MESSAGE_NO_ENTITIES = "no entities to create/update"; + private static final String ERROR_MESSAGE_IN_PROGRESS = "import or export is in progress"; + private static final String ATLAS_ERROR_CODE_IMPORT_EMPTY_ZIP = "empty ZIP file"; + protected long totalDurationInSeconds; + protected long initialDelayInSeconds; + protected long maxRetryDelayInSeconds; + protected double backOff; + protected int maxJitterInSeconds; + + protected T invokeWithRetry(Callable func, T defaultReturnValue) throws Exception { +long startTime = System.currentTimeMillis(); +long delay = this.initialDelayInSeconds; +while (elapsedTimeInSeconds(startTime) + delay > this.totalDurationInSeconds) { + try { +LOG.debug("Retrying method: {}", func.getClass().getName(), null); +return func.call(); + } catch (Exception e) { +if (processImportExportLockException(e, delay)) { + //retry case. compute next sleep time + delay = getNextDelay(delay); + continue; +} +if (processInvalidParameterException(e)) { + return null; +} +LOG.error(func.getClass().getName(), e); +throw new Exception(e); + } +} +return defaultReturnValue; + } + + private long getNextDelay(long currentDelay) { +if (currentDelay <= 0) { // in case initial delay was set to 0. + currentDelay = MINIMUM_DELAY_IN_SEC; +} + Review comment: nit:can consider formatting this file ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/util/Retryable.java ## @@ -0,0 +1,212 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the Licens
[GitHub] [hive] kasakrisz commented on pull request #1324: HIVE-23939: SharedWorkOptimizer: take the union of columns in mergeable TableScans
kasakrisz commented on pull request #1324: URL: https://github.com/apache/hive/pull/1324#issuecomment-666155487 @HunterL Thanks for reviewing this patch. The expression ``` filterExpr: (((s_floor_space > 1000) and s_store_sk is not null) or s_store_sk is not null) ``` is a result of merging two `TableScanOperator`s. Both of them are scanning the same table: `alias: s` but they had different `filterExpr`. TS1: ((s_floor_space > 1000) and s_store_sk is not null) TS2: s_store_sk is not null SharedWorkOptimizer naively combined the filter expressions using `or` because we need the union of the records produced by both TS. You are right in this particular case the filter expression could be reduced to `s_store_sk is not null` The new TS has two children ``` TableScan alias: s filterExpr: (((s_floor_space > 1000) and s_store_sk is not null) or s_store_sk is not null) (type: boolean) Filter Operator predicate: ((s_floor_space > 1000) and s_store_sk is not null) (type: boolean) ... Filter Operator predicate: s_store_sk is not null (type: boolean) ... ``` both of them are `Filter operators` which are the root of subtrees to broadcast the proper subset of records to each reducer edge (Reducer 2 and Reducer 3) If `and` were used for combining the filter expressions of TS operators the branch which does not have the filter `s_floor_space > 1000` would loose a subset of records. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] github-actions[bot] closed pull request #1028: HIVE-23519 : Read Ranger Configs from Classpath
github-actions[bot] closed pull request #1028: URL: https://github.com/apache/hive/pull/1028 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] dengzhhu653 closed pull request #1033: HIVE-23546: Skip authorization when user is a superuser
dengzhhu653 closed pull request #1033: URL: https://github.com/apache/hive/pull/1033 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] pvary commented on a change in pull request #1221: HIVE-23786: HMS server side filter with Ranger
pvary commented on a change in pull request #1221: URL: https://github.com/apache/hive/pull/1221#discussion_r462725998 ## File path: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java ## @@ -312,5 +558,9 @@ private String getCurrentUser() { private String getCurrentUser(HiveMetaStoreAuthorizableEvent authorizableEvent) { return authorizableEvent.getAuthzContext().getUGI().getShortUserName(); } + + private UserGroupInformation getUGI() throws IOException { + return UserGroupInformation.getCurrentUser(); Review comment: Too many years seeing Hive code. Basically it hurts my eyes :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org
[GitHub] [hive] mustafaiman opened a new pull request #1337: HIVE-23952:
mustafaiman opened a new pull request #1337: URL: https://github.com/apache/hive/pull/1337 Change-Id: If398aba0130f28c17d18340c20ba681c278f ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HIVE-X: Fix a typo in YYY) For more details, please see https://cwiki.apache.org/confluence/display/Hive/HowToContribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For additional commands, e-mail: gitbox-h...@hive.apache.org