[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2022-04-25 Thread Raymond Xu (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527477#comment-17527477
 ] 

Raymond Xu commented on HUDI-1015:
--

This is for taking 1 more pass on the code paths. cc [~guoyihua] [~shivnarayan]

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, writer-core
>Reporter: Vinoth Chandar
>Assignee: Ethan Guo
>Priority: Critical
> Fix For: 0.10.0, 0.12.0
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2021-11-03 Thread sivabalan narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438126#comment-17438126
 ] 

sivabalan narayanan commented on HUDI-1015:
---

This is being tracked via https://issues.apache.org/jira/browse/HUDI-2005

 

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: sivabalan narayanan
>Priority: Major
> Fix For: 0.10.0
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2021-10-04 Thread sivabalan narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17423970#comment-17423970
 ] 

sivabalan narayanan commented on HUDI-1015:
---

Sure. I will take a look at all the sub-tasks on this and either close them out 
or assign them to folks. 

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: sivabalan narayanan
>Priority: Blocker
> Fix For: 0.10.0
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2021-09-12 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413794#comment-17413794
 ] 

Vinoth Chandar commented on HUDI-1015:
--

[~shivnarayan] please always assign the ticket to yourself if you are working 
on it, and close/resolve it as needed.

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Blocker
> Fix For: 0.10.0
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2021-07-20 Thread sivabalan narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384614#comment-17384614
 ] 

sivabalan narayanan commented on HUDI-1015:
---

{code:bash}
grep -irl "fs.listFiles" hudi-*/* | grep -v Test
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
{code}

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Blocker
> Fix For: 0.9.0
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2021-07-20 Thread sivabalan narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384616#comment-17384616
 ] 

sivabalan narayanan commented on HUDI-1015:
---

{code:bash}
grep -irl "fs.getFileStatus" hudi-*/* | grep -v Test
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/heartbeat/HoodieHeartbeatClient.java
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/io/FlinkMergeHandle.java
hudi-common/src/main/java/org/apache/hudi/common/util/TablePathUtils.java
hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java
hudi-common/src/main/java/org/apache/hudi/common/fs/inline/InLineFileSystem.java
hudi-common/src/main/java/org/apache/hudi/common/fs/FailSafeConsistencyGuard.java
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
hudi-common/src/main/java/org/apache/hudi/exception/TableNotFoundException.java
hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/profile/WriteProfiles.java
hudi-flink/src/main/java/org/apache/hudi/table/format/FilePathUtils.java
hudi-flink/src/main/java/org/apache/hudi/table/format/cow/CopyOnWriteInputFormat.java
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieInputFormatUtils.java
hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java
{code}

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Blocker
> Fix For: 0.9.0
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2021-07-20 Thread sivabalan narayanan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384613#comment-17384613
 ] 

sivabalan narayanan commented on HUDI-1015:
---

{code:bash}
grep -irl "fs.listStatus" hudi-*/* | grep -v Test
hudi-cli/src/main/java/org/apache/hudi/cli/commands/MetadataCommand.java
hudi-cli/src/main/java/org/apache/hudi/cli/commands/HoodieLogFileCommand.java
hudi-cli/src/main/scala/org/apache/hudi/cli/DedupeSparkJob.scala
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/MarkerFiles.java
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/bootstrap/BootstrapUtils.java
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/heartbeat/HoodieHeartbeatClient.java
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/table/action/rollback/ListingBasedRollbackHelper.java
hudi-client/hudi-java-client/src/main/java/org/apache/hudi/table/action/rollback/JavaListingBasedRollbackHelper.java
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/rollback/ListingBasedRollbackHelper.java
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java
hudi-common/src/main/java/org/apache/hudi/common/fs/FailSafeConsistencyGuard.java
hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
hudi-flink/src/main/java/org/apache/hudi/table/format/FilePathUtils.java
hudi-flink/src/main/java/org/apache/hudi/table/format/cow/CopyOnWriteInputFormat.java
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateAsyncOperations.java
hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateDatasetNode.java
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieSnapshotCopier.java
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/HiveIncrPullSource.java
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DFSPathSelector.java
hudi-utilities/src/main/java/org/apache/hudi/utilities/checkpointing/KafkaConnectHdfsProvider.java


{code}

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Blocker
> Fix For: 0.9.0
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2021-07-05 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375242#comment-17375242
 ] 

Vinoth Chandar commented on HUDI-1015:
--

[~codope] this is also something worth looking at?

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: Vinoth Chandar
>Priority: Blocker
> Fix For: 0.9.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-08-07 Thread Balaji Varadarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173242#comment-17173242
 ] 

Balaji Varadarajan commented on HUDI-1015:
--

Subtasks added to track all locations where we list all partitions. 
https://issues.apache.org/jira/browse/HUDI-1170 tracks the above log file 
listing case.

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: Balaji Varadarajan
>Priority: Major
> Fix For: 0.6.1
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-07-22 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163151#comment-17163151
 ] 

Vinoth Chandar commented on HUDI-1015:
--

let's create sub tasks here?

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: Balaji Varadarajan
>Priority: Blocker
> Fix For: 0.6.0
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-07-22 Thread Balaji Varadarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163018#comment-17163018
 ] 

Balaji Varadarajan commented on HUDI-1015:
--

Another place where we do listing in the executor:

(Source: [https://github.com/apache/hudi/issues/1852])

sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:352)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsHttpOperation.processResponse(AbfsHttpOperation.java:259)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:167)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:124)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:180)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listFiles(AzureBlobFileSystemStore.java:549)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:628)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:532)
shaded.databricks.v20180920_b33d810.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:344)
org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1517)
org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1557)
org.apache.hudi.common.fs.HoodieWrapperFileSystem.listStatus(HoodieWrapperFileSystem.java:487)
org.apache.hudi.common.fs.FSUtils.getAllLogFiles(FSUtils.java:409)
org.apache.hudi.common.fs.FSUtils.getLatestLogVersion(FSUtils.java:420)
org.apache.hudi.common.fs.FSUtils.computeNextLogVersion(FSUtils.java:434)
org.apache.hudi.common.model.HoodieLogFile.rollOver(HoodieLogFile.java:115)
org.apache.hudi.common.table.log.HoodieLogFormatWriter.<init>(HoodieLogFormatWriter.java:101)
org.apache.hudi.common.table.log.HoodieLogFormat$WriterBuilder.build(HoodieLogFormat.java:249)
org.apache.hudi.io.HoodieAppendHandle.createLogWriter(HoodieAppendHandle.java:291)
org.apache.hudi.io.HoodieAppendHandle.init(HoodieAppendHandle.java:141)
org.apache.hudi.io.HoodieAppendHandle.doAppend(HoodieAppendHandle.java:197)
org.apache.hudi.table.action.deltacommit.DeltaCommitActionExecutor.handleUpdate(DeltaCommitActionExecutor.java:77)
org.apache.hudi.table.action.commit.BaseCommitActionExecutor.handleUpsertPartition(BaseCommitActionExecutor.java:246)
org.apache.hudi.table.action.commit.BaseCommitActionExecutor.lambda$execute$caffe4c4$1(BaseCommitActionExecutor.java:102)
org.apache.hudi.table.action.commit.BaseCommitActionExecutor$$Lambda$192/1449069739.call(Unknown Source)
org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:105)
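
The trace shows why this path lists: the next log version is derived from a directory listing (listStatus, then getAllLogFiles, getLatestLogVersion, computeNextLogVersion), and this happens once per append handle inside the executor. The following is only a simplified model of that dependency, not Hudi's code. The class name, the method names and the "<fileId>.log.<version>" naming scheme are all assumptions made for illustration.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogVersionSketch {

    // Hypothetical naming scheme, for illustration only: "<fileId>.log.<version>".
    private static final Pattern LOG_FILE = Pattern.compile("(.+)\\.log\\.(\\d+)");

    /**
     * Returns the highest version among the listed names that belong to fileId,
     * or 0 when no listed name matches. The input models the result of one
     * directory listing, which is the expensive step on a cloud store.
     */
    public static int latestLogVersion(List<String> listedNames, String fileId) {
        int latest = 0;
        for (String name : listedNames) {
            Matcher m = LOG_FILE.matcher(name);
            if (m.matches() && m.group(1).equals(fileId)) {
                latest = Math.max(latest, Integer.parseInt(m.group(2)));
            }
        }
        return latest;
    }

    /** The next version to write is one past the latest version seen in the listing. */
    public static int nextLogVersion(List<String> listedNames, String fileId) {
        return latestLogVersion(listedNames, fileId) + 1;
    }
}
```

Because the result depends on the full listing of the directory, every handle that rolls over a log file pays for one listing call in this model.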

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Assignee: Balaji Varadarajan
>Priority: Blocker
> Fix For: 0.6.0
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-06-20 Thread Vinoth Chandar (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141271#comment-17141271
 ] 

Vinoth Chandar commented on HUDI-1015:
--

Hi renyi, the goal here is to audit all these calls that may be listing the 
entire table (outside of cleaning and rollback, which have their own Jiras 
addressing this) and see if we can make them more intelligent by only listing 
some partitions.
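
To make the distinction concrete, here is a minimal sketch that contrasts listing every partition with listing only the partitions a caller needs. It is not Hudi code: CountingFs is a hypothetical in-memory stand-in for a file system that counts listing calls, and all class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class PartitionListingSketch {

    /** Hypothetical in-memory stand-in for a file system that counts list calls. */
    public static class CountingFs {
        private final Map<String, List<String>> filesByPartition;
        private int listCalls = 0;

        public CountingFs(Map<String, List<String>> filesByPartition) {
            this.filesByPartition = filesByPartition;
        }

        /** Lists one partition; each call stands for one remote listing request. */
        public List<String> list(String partition) {
            listCalls++;
            return filesByPartition.getOrDefault(partition, Collections.emptyList());
        }

        /** Partition discovery is free here; on a real table it is itself a listing. */
        public Collection<String> allPartitions() {
            return filesByPartition.keySet();
        }

        public int listCalls() {
            return listCalls;
        }
    }

    /** Full-table approach: one listing per partition in the table. */
    public static List<String> listWholeTable(CountingFs fs) {
        return listPartitions(fs, fs.allPartitions());
    }

    /** Targeted approach: list only the partitions the caller actually needs. */
    public static List<String> listPartitions(CountingFs fs, Collection<String> partitions) {
        List<String> files = new ArrayList<>();
        for (String partition : partitions) {
            files.addAll(fs.list(partition));
        }
        return files;
    }
}
```

In this model, a write that touches one partition of a three-partition table issues one listing call with listPartitions and three with listWholeTable. The gap grows with the number of partitions in the table.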

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Priority: Blocker
> Fix For: 0.6.0
>
>






[jira] [Commented] (HUDI-1015) Audit all getAllPartitionPaths() calls and keep em out of fast path

2020-06-18 Thread renyi.bao (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-1015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139179#comment-17139179
 ] 

renyi.bao commented on HUDI-1015:
-

Hi [~vinoth], is there a more detailed description of the issue? I have reviewed 
the implementation of getAllPartitionPaths and all of its callers. Do you mean 
to ensure that all the parameters passed in are valid, or do you want a more 
reasonable implementation?

> Audit all getAllPartitionPaths() calls and keep em out of fast path
> ---
>
> Key: HUDI-1015
> URL: https://issues.apache.org/jira/browse/HUDI-1015
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: Common Core, Writer Core
>Reporter: Vinoth Chandar
>Priority: Blocker
> Fix For: 0.6.0
>
>



