[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527477#comment-17527477 ] Raymond Xu commented on HUDI-1015: -- This is for taking 1 more pass on the code paths. cc [~guoyihua] [~shivnarayan] > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, writer-core >Reporter: Vinoth Chandar >Assignee: Ethan Guo >Priority: Critical > Fix For: 0.10.0, 0.12.0 > > Original Estimate: 4h > Remaining Estimate: 4h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438126#comment-17438126 ] sivabalan narayanan commented on HUDI-1015: --- This is being tracked via https://issues.apache.org/jira/browse/HUDI-2005 > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: sivabalan narayanan >Priority: Major > Fix For: 0.10.0 > > Original Estimate: 4h > Remaining Estimate: 4h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17423970#comment-17423970 ] sivabalan narayanan commented on HUDI-1015: --- Sure. I will take a look at all the sub-tasks on this and either close them out or assign them to folks. > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: sivabalan narayanan >Priority: Blocker > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413794#comment-17413794 ] Vinoth Chandar commented on HUDI-1015: -- [~shivnarayan] please always assign the ticket to yourself if you are working on it, and close/resolve it as needed. > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Blocker > Fix For: 0.10.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384614#comment-17384614 ] sivabalan narayanan commented on HUDI-1015: ---
{code:java}
grep -irl "fs.listFiles" hudi-*/* | grep -v Test
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
{code}
> Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Blocker > Fix For: 0.9.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384616#comment-17384616 ] sivabalan narayanan commented on HUDI-1015: ---
{code:java}
grep -irl "fs.getFileStatus" hudi-*/* | grep -v Test
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/heartbeat/HoodieHeartbeatClient.java
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/io/FlinkMergeHandle.java
hudi-common/src/main/java/org/apache/hudi/common/util/TablePathUtils.java
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java
hudi-common/src/main/java/org/apache/hudi/common/fs/inline/InLineFileSystem.java
hudi-common/src/main/java/org/apache/hudi/common/fs/FailSafeConsistencyGuard.java
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
hudi-common/src/main/java/org/apache/hudi/exception/TableNotFoundException.java
hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/profile/WriteProfiles.java
hudi-flink/src/main/java/org/apache/hudi/table/format/FilePathUtils.java
hudi-flink/src/main/java/org/apache/hudi/table/format/cow/CopyOnWriteInputFormat.java
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieInputFormatUtils.java
hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java
{code}
> Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Blocker > Fix For: 0.9.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384613#comment-17384613 ] sivabalan narayanan commented on HUDI-1015: ---
{code:java}
grep -irl "fs.listStatus" hudi-*/* | grep -v Test
hudi-cli/src/main/java/org/apache/hudi/cli/commands/MetadataCommand.java
hudi-cli/src/main/java/org/apache/hudi/cli/commands/HoodieLogFileCommand.java
hudi-cli/src/main/scala/org/apache/hudi/cli/DedupeSparkJob.scala
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/MarkerFiles.java
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/bootstrap/BootstrapUtils.java
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/heartbeat/HoodieHeartbeatClient.java
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/table/action/rollback/ListingBasedRollbackHelper.java
hudi-client/hudi-java-client/src/main/java/org/apache/hudi/table/action/rollback/JavaListingBasedRollbackHelper.java
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/rollback/ListingBasedRollbackHelper.java
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
hudi-common/src/main/java/org/apache/hudi/common/fs/FailSafeConsistencyGuard.java
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
hudi-flink/src/main/java/org/apache/hudi/table/format/FilePathUtils.java
hudi-flink/src/main/java/org/apache/hudi/table/format/cow/CopyOnWriteInputFormat.java
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateAsyncOperations.java
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateDatasetNode.java
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieSnapshotCopier.java
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/HiveIncrPullSource.java
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DFSPathSelector.java
hudi-utilities/src/main/java/org/apache/hudi/utilities/checkpointing/KafkaConnectHdfsProvider.java
{code}
> Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Blocker > Fix For: 0.9.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
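The three grep passes in these comments (fs.listFiles, fs.getFileStatus, fs.listStatus) could be folded into one scan. The following is only a rough sketch of that idea: the class name ListingCallAudit and its method are hypothetical, and it matches literal, case-insensitive substrings, whereas grep treats the "." in each pattern as a regex wildcard. As with `grep -v Test`, files whose relative path contains "Test" are skipped.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical audit helper; not part of Hudi. */
public class ListingCallAudit {

    // Lower-cased call patterns taken from the grep commands in this thread.
    static final List<String> PATTERNS =
        Arrays.asList("fs.listfiles", "fs.getfilestatus", "fs.liststatus");

    /** Maps each pattern to the non-test files under root that contain it. */
    static Map<String, List<Path>> audit(Path root) throws IOException {
        Map<String, List<Path>> hits = new TreeMap<>();
        for (String pattern : PATTERNS) {
            hits.put(pattern, new ArrayList<>());
        }
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(root)) {
            sources = walk.filter(Files::isRegularFile)
                // Same effect as "grep -v Test" on the relative path.
                .filter(f -> !root.relativize(f).toString().contains("Test"))
                .sorted()
                .collect(Collectors.toList());
        }
        for (Path file : sources) {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
                .toLowerCase(Locale.ROOT);
            for (String pattern : PATTERNS) {
                if (text.contains(pattern)) {
                    hits.get(pattern).add(file);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        Path root = java.nio.file.Paths.get(args.length > 0 ? args[0] : ".");
        audit(root).forEach((pattern, files) -> {
            System.out.println(pattern + ": " + files.size() + " file(s)");
            files.forEach(f -> System.out.println("  " + root.relativize(f)));
        });
    }
}
```

Running it from the repository root with no arguments would scan the whole checkout rather than only the hudi-* modules, so the counts may differ from the grep output above.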
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375242#comment-17375242 ] Vinoth Chandar commented on HUDI-1015: -- [~codope] this is also something worth looking at? > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: Vinoth Chandar >Priority: Blocker > Fix For: 0.9.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173242#comment-17173242 ] Balaji Varadarajan commented on HUDI-1015: -- Subtasks added to track all locations where we list all partitions. https://issues.apache.org/jira/browse/HUDI-1170 tracks the log file listing case above. > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: Balaji Varadarajan >Priority: Major > Fix For: 0.6.1 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163151#comment-17163151 ] Vinoth Chandar commented on HUDI-1015: -- let's create sub tasks here? > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: Balaji Varadarajan >Priority: Blocker > Fix For: 0.6.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163018#comment-17163018 ] Balaji Varadarajan commented on HUDI-1015: -- Another place where we do listing in the executor (source: [https://github.com/apache/hudi/issues/1852]):
sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsHttpOperation.processResponse(AbfsHttpOperation.java:259)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:167)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:124)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:180)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listFiles(AzureBlobFileSystemStore.java:549)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:628)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:532)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:344)
org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1557)
org.apache.hudi.common.fs.HoodieWrapperFileSystem.listStatus(HoodieWrapperFileSystem.java:487)
org.apache.hudi.common.fs.FSUtils.getAllLogFiles(FSUtils.java:409)
org.apache.hudi.common.fs.FSUtils.getLatestLogVersion(FSUtils.java:420)
org.apache.hudi.common.fs.FSUtils.computeNextLogVersion(FSUtils.java:434)
org.apache.hudi.common.model.HoodieLogFile.rollOver(HoodieLogFile.java:115)
org.apache.hudi.common.table.log.HoodieLogFormatWriter.<init>(HoodieLogFormatWriter.java:101)
org.apache.hudi.common.table.log.HoodieLogFormat$WriterBuilder.build(HoodieLogFormat.java:249)
org.apache.hudi.io.HoodieAppendHandle.createLogWriter(HoodieAppendHandle.java:291)
org.apache.hudi.io.HoodieAppendHandle.init(HoodieAppendHandle.java:141)
org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:197)
org.apache.hudi.table.action.deltacommit.DeltaCommitActionExecutor.handleUpdate(DeltaCommitActionExecutor.java:77)
org.apache.hudi.table.action.commit.BaseCommitActionExecutor.handleUpsertPartition(BaseCommitActionExecutor.java:246)
org.apache.hudi.table.action.commit.BaseCommitActionExecutor.lambda$execute$caffe4c4$1(BaseCommitActionExecutor.java:102)
org.apache.hudi.table.action.commit.BaseCommitActionExecutor$$Lambda$192/1449069739.call(Unknown Source)
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:105)
> Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Assignee: Balaji Varadarajan >Priority: Blocker > Fix For: 0.6.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
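In this trace, computing the next log version ends in a listStatus call against storage each time an append handle opens a log writer. One possible way to avoid that, sketched here purely as an illustration, is to derive the next version from log file names the caller already holds in memory. Everything in the sketch is an assumption: the class NextLogVersion is hypothetical, and the name format `<fileId>.log.<version>` is a deliberate simplification, not Hudi's actual log file naming.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not Hudi's implementation. */
public class NextLogVersion {

    // Assumed, simplified name format: <fileId>.log.<version>
    private static final Pattern LOG_NAME = Pattern.compile("(.+)\\.log\\.(\\d+)");

    // Assumed version given to the first log file of a file group.
    static final int BASE_VERSION = 1;

    /**
     * Returns the version after the highest one seen for fileId among names
     * that are already in memory, so no storage listing is issued here.
     */
    static int nextVersion(Collection<String> knownLogFileNames, String fileId) {
        int latest = BASE_VERSION - 1;
        for (String name : knownLogFileNames) {
            Matcher m = LOG_NAME.matcher(name);
            if (m.matches() && m.group(1).equals(fileId)) {
                latest = Math.max(latest, Integer.parseInt(m.group(2)));
            }
        }
        return latest + 1;
    }

    public static void main(String[] args) {
        Collection<String> known = Arrays.asList("f1.log.1", "f1.log.3", "f2.log.7");
        System.out.println(nextVersion(known, "f1"));                      // prints 4
        System.out.println(nextVersion(Collections.emptyList(), "f1"));   // prints 1
    }
}
```

The trade-off is staleness: the in-memory names must reflect every log file written so far for the file group, otherwise two writers could pick the same version.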
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141271#comment-17141271 ] Vinoth Chandar commented on HUDI-1015: -- Hi renyi, the goal here is to audit all the calls that may be listing the entire table (outside of cleaning and rollback, which have their own Jiras addressing this) and see if we can make them more intelligent by listing only some partitions. > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Priority: Blocker > Fix For: 0.6.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
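To make the difference between listing the entire table and listing only some partitions concrete, here is a minimal sketch on a local directory tree using java.nio. It is not Hudi code: the class TargetedListing, both method names, and the date-style partition layout are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration; none of these names are Hudi APIs. */
public class TargetedListing {

    /** Full-table listing: walks every partition directory under basePath. */
    static List<Path> listAllFiles(Path basePath) throws IOException {
        try (Stream<Path> walk = Files.walk(basePath)) {
            return walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
    }

    /** Targeted listing: touches only the partitions named by the caller. */
    static List<Path> listFilesIn(Path basePath, Collection<String> partitions)
            throws IOException {
        List<Path> files = new ArrayList<>();
        for (String partition : partitions) {
            Path dir = basePath.resolve(partition);
            if (!Files.isDirectory(dir)) {
                continue; // unknown partition: nothing to list
            }
            try (Stream<Path> children = Files.list(dir)) {
                children.filter(Files::isRegularFile).forEach(files::add);
            }
        }
        Collections.sort(files);
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("table");
        for (String p : Arrays.asList("2020/06/01", "2020/06/02", "2020/06/03")) {
            Path dir = Files.createDirectories(base.resolve(p));
            Files.createFile(dir.resolve("data.parquet"));
        }
        // The full walk visits all three partitions.
        System.out.println(listAllFiles(base).size());                             // prints 3
        // A write that touches one partition only needs that partition listed.
        System.out.println(listFilesIn(base, Arrays.asList("2020/06/03")).size()); // prints 1
    }
}
```

On a cloud object store, each listed directory typically costs at least one remote call, so the cost of the full walk grows with the number of partitions in the table while the targeted version grows only with the partitions the operation actually touches.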
[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path
[ https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139179#comment-17139179 ] renyi.bao commented on HUDI-1015: - Hi [~vinoth], is there a more detailed description of the issue? I have reviewed the implementation of getAllPartitionPaths and all of its callers. Do you mean to ensure that all the parameters passed in are valid, or do you want a more reasonable implementation? > Audit all getAllPartitionPaths() calls and keep em out of fast path > --- > > Key: HUDI-1015 > URL: https://issues.apache.org/jira/browse/HUDI-1015 > Project: Apache Hudi > Issue Type: Improvement > Components: Common Core, Writer Core >Reporter: Vinoth Chandar >Priority: Blocker > Fix For: 0.6.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)