[GitHub] [hudi] codecov-commenter edited a comment on pull request #1699: [HUDI-989]Support long options for prepare_integration_suite
codecov-commenter edited a comment on pull request #1699:
URL: https://github.com/apache/hudi/pull/1699#issuecomment-638613017

# [Codecov](https://codecov.io/gh/apache/hudi/pull/1699?src=pr&el=h1) Report
> Merging [#1699](https://codecov.io/gh/apache/hudi/pull/1699?src=pr&el=desc) into [hudi_test_suite_refactor](https://codecov.io/gh/apache/hudi/commit/6a0f4191ac34fc393d72f36530c8573273d4d045&el=desc) will **decrease** coverage by `0.01%`.
> The diff coverage is `n/a`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1699/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1699?src=pr&el=tree)

```diff
@@                 Coverage Diff                  @@
##   hudi_test_suite_refactor    #1699      +/-   ##
====================================================
- Coverage         18.23%      18.21%    -0.02%
+ Complexity          857         856        -1
====================================================
  Files               348         348
  Lines             15346       15346
  Branches           1524        1524
====================================================
- Hits               2798        2796        -2
- Misses            12191       12193        +2
  Partials            357         357
```

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1699?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...apache/hudi/common/fs/HoodieWrapperFileSystem.java](https://codecov.io/gh/apache/hudi/pull/1699/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0hvb2RpZVdyYXBwZXJGaWxlU3lzdGVtLmphdmE=) | `21.98% <0.00%> (-0.71%)` | `28.00% <0.00%> (-1.00%)` | |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1699?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1699?src=pr&el=footer). Last update [6a0f419...3340a10](https://codecov.io/gh/apache/hudi/pull/1699?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #1697: [WIP][HUDI-988] Fix issues causing Unit Test Flakiness
codecov-commenter edited a comment on pull request #1697:
URL: https://github.com/apache/hudi/pull/1697#issuecomment-637484534

# [Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=h1) Report
> Merging [#1697](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/a9a97d6af47841caaa745497ec425267db0873c8&el=desc) will **increase** coverage by `0.00%`.
> The diff coverage is `33.33%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1697/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=tree)

```diff
@@            Coverage Diff            @@
##           master    #1697     +/-   ##
=========================================
  Coverage   18.18%   18.19%
- Complexity    856      857      +1
=========================================
  Files         348      348
  Lines       15351    15358      +7
  Branches     1524     1525      +1
=========================================
+ Hits         2792     2794      +2
- Misses      12202    12206      +4
- Partials      357      358      +1
```

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...on/rollback/CopyOnWriteRollbackActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL3JvbGxiYWNrL0NvcHlPbldyaXRlUm9sbGJhY2tBY3Rpb25FeGVjdXRvci5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
| [...on/rollback/MergeOnReadRollbackActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL3JvbGxiYWNrL01lcmdlT25SZWFkUm9sbGJhY2tBY3Rpb25FeGVjdXRvci5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
| [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `52.30% <0.00%> (ø)` | `28.00 <1.00> (ø)` | |
| [.../hudi/client/embedded/EmbeddedTimelineService.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L2VtYmVkZGVkL0VtYmVkZGVkVGltZWxpbmVTZXJ2aWNlLmphdmE=) | `72.22% <50.00%> (-2.07%)` | `7.00 <1.00> (ø)` | |
| [...common/table/view/FileSystemViewStorageConfig.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvRmlsZVN5c3RlbVZpZXdTdG9yYWdlQ29uZmlnLmphdmE=) | `62.68% <50.00%> (-0.81%)` | `9.00 <1.00> (+1.00)` | :arrow_down: |

------

[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=footer). Last update [a9a97d6...7ec34e0](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-994) Identify functional tests that are convertible to unit tests with mocks
[ https://issues.apache.org/jira/browse/HUDI-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-994:
----------------------------
    Description: 
* Identify convertible functional tests and re-implement by using mock
* remove/merge duplicate/overlapping functional tests if possible

  was: Identify convertible functional tests and re-implement by using mock

> Identify functional tests that are convertible to unit tests with mocks
> ------------------------------------------------------------------------
>
>                 Key: HUDI-994
>                 URL: https://issues.apache.org/jira/browse/HUDI-994
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Testing
>            Reporter: Raymond Xu
>            Priority: Major
>
> * Identify convertible functional tests and re-implement by using mock
> * remove/merge duplicate/overlapping functional tests if possible

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HUDI-781) Re-design test utilities
[ https://issues.apache.org/jira/browse/HUDI-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-781:
----------------------------
    Status: Open  (was: New)

> Re-design test utilities
> ------------------------
>
>                 Key: HUDI-781
>                 URL: https://issues.apache.org/jira/browse/HUDI-781
>             Project: Apache Hudi
>          Issue Type: Test
>          Components: Testing
>            Reporter: Raymond Xu
>            Priority: Major
>
> Test utility classes are to be re-designed with considerations like
> * Use more mockings
> * Reduce spark context setup
> * Improve/clean up data generator
> An RFC would be preferred for illustrating the design work.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HUDI-995) Add hudi-testutils module
[ https://issues.apache.org/jira/browse/HUDI-995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-995:
----------------------------
    Description: 
* add a new module {{hudi-testutils}} and add it to all other modules as test dep and remove {{hudi-common}} etc from test dep list
* selectively migrate test util classes like data gen to {{hudi-testutils}}
* provide utils to be able to generalize base file/log file style testing.

  was:
* add a new module {{hudi-testutils}} and add it to all other modules as test dep and remove {{hudi-common}} etc from test dep list
* selectively migrate test util classes like data gen to {{hudi-testutils}}

> Add hudi-testutils module
> -------------------------
>
>                 Key: HUDI-995
>                 URL: https://issues.apache.org/jira/browse/HUDI-995
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Testing
>            Reporter: Raymond Xu
>            Priority: Major
>
> * add a new module {{hudi-testutils}} and add it to all other modules as test dep and remove {{hudi-common}} etc from test dep list
> * selectively migrate test util classes like data gen to {{hudi-testutils}}
> * provide utils to be able to generalize base file/log file style testing.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Created] (HUDI-996) Use shared spark session provider
Raymond Xu created HUDI-996:
-------------------------------
             Summary: Use shared spark session provider
                 Key: HUDI-996
                 URL: https://issues.apache.org/jira/browse/HUDI-996
             Project: Apache Hudi
          Issue Type: Sub-task
          Components: Testing
            Reporter: Raymond Xu

* implement a shared spark session provider to be used for test suites, to set up and tear down fewer spark sessions and other mini servers
* add functional tests with similar setup logic to test suites, to make use of the shared spark session

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
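[Editor's illustration] The shared-provider idea described in HUDI-996 boils down to memoizing one expensive resource across test suites. A minimal conceptual sketch, in Python rather than the project's Scala/Java (the function names and the dict stand-in are invented; real code would wrap `SparkSession.builder.getOrCreate()` plus any mini-server setup):

```python
import functools


@functools.lru_cache(maxsize=1)
def shared_spark_session():
    """Build the expensive resource once; every test suite that asks for a
    session afterwards receives the same cached instance, so setup and
    teardown happen once instead of per test class."""
    return {"app_name": "hudi-tests"}  # stand-in for a real SparkSession


s1 = shared_spark_session()
s2 = shared_spark_session()
assert s1 is s2  # same instance reused across suites
```

The memoized-factory shape is what makes "tear down fewer spark sessions" possible: teardown moves from per-suite fixtures to a single end-of-run hook.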
[jira] [Created] (HUDI-995) Add hudi-testutils module
Raymond Xu created HUDI-995:
-------------------------------
             Summary: Add hudi-testutils module
                 Key: HUDI-995
                 URL: https://issues.apache.org/jira/browse/HUDI-995
             Project: Apache Hudi
          Issue Type: Sub-task
          Components: Testing
            Reporter: Raymond Xu

* add a new module {{hudi-testutils}} and add it to all other modules as test dep and remove {{hudi-common}} etc from test dep list
* selectively migrate test util classes like data gen to {{hudi-testutils}}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Created] (HUDI-994) Identify functional tests that are convertible to unit tests with mocks
Raymond Xu created HUDI-994:
-------------------------------
             Summary: Identify functional tests that are convertible to unit tests with mocks
                 Key: HUDI-994
                 URL: https://issues.apache.org/jira/browse/HUDI-994
             Project: Apache Hudi
          Issue Type: Sub-task
          Components: Testing
            Reporter: Raymond Xu

Identify convertible functional tests and re-implement by using mock

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HUDI-896) Parallelize CI testing to reduce CI wait time
[ https://issues.apache.org/jira/browse/HUDI-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-896:
----------------------------
        Parent: HUDI-781
    Issue Type: Sub-task  (was: Improvement)

> Parallelize CI testing to reduce CI wait time
> ---------------------------------------------
>
>                 Key: HUDI-896
>                 URL: https://issues.apache.org/jira/browse/HUDI-896
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Testing
>            Reporter: Raymond Xu
>            Assignee: Raymond Xu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.6.0
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hudi] lamber-ken commented on pull request #1469: [HUDI-686] Implement BloomIndexV2 that does not depend on memory caching
lamber-ken commented on pull request #1469:
URL: https://github.com/apache/hudi/pull/1469#issuecomment-638574852

> @lamber-ken : LMK once the patch is ready to be reviewed again.

Thanks very much for reviewing this PR. Sorry for the delay; I'm working on something else. When I'm ready, I will ping you 👍

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] prashanthpdesai closed issue #1695: [SUPPORT] : Global Bloom Index config issue
prashanthpdesai closed issue #1695: URL: https://github.com/apache/hudi/issues/1695 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] prashanthpdesai edited a comment on issue #1695: [SUPPORT] : Global Bloom Index config issue
prashanthpdesai edited a comment on issue #1695: URL: https://github.com/apache/hudi/issues/1695#issuecomment-638572601 @bvaradar: Thank you for the clarification. I am able to read the Hudi parquet files. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] prashanthpdesai commented on issue #1695: [SUPPORT] : Global Bloom Index config issue
prashanthpdesai commented on issue #1695: URL: https://github.com/apache/hudi/issues/1695#issuecomment-638572601 @bvaradar: Thank you for the clarification. I am able to read the parquet files. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] garyli1019 commented on a change in pull request #1702: Bootstrap datasource changes
garyli1019 commented on a change in pull request #1702:
URL: https://github.com/apache/hudi/pull/1702#discussion_r434962196

## File path: hudi-spark/src/main/scala/org/apache/hudi/HudiBootstrapRelation.scala

## @@ -0,0 +1,185 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi
+
+import org.apache.hadoop.fs.Path
+import org.apache.hudi.common.model.HoodieBaseFile
+import org.apache.hudi.common.table.{HoodieTableMetaClient, TableSchemaResolver}
+import org.apache.hudi.common.table.view.HoodieTableFileSystemView
+import org.apache.hudi.exception.HoodieException
+import org.apache.spark.internal.Logging
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat
+import org.apache.spark.sql.{Row, SQLContext}
+import org.apache.spark.sql.sources.{BaseRelation, Filter, PrunedFilteredScan}
+import org.apache.spark.sql.types.StructType
+
+import scala.collection.JavaConverters._
+
+/**
+ * This is a Spark relation that can be used for querying metadata/fully bootstrapped query hudi tables, as well as
+ * non-bootstrapped tables. It implements the PrunedFilteredScan interface in order to support column pruning and
+ * filter push-down. For metadata bootstrapped files, if we query columns from both metadata and actual data then it
+ * will perform a merge of both to return the result.
+ *
+ * Caveat: Filter push-down does not work when querying both metadata and actual data columns over metadata
+ * bootstrapped files, because then the metadata file and data file can return a different number of rows, causing
+ * errors merging.
+ *
+ * @param _sqlContext Spark SQL Context
+ * @param userSchema  User specified schema in the datasource query
+ * @param globPaths   Globbed paths obtained from the user provided path for querying
+ * @param metaClient  Hudi table meta client
+ * @param optParams   DataSource options passed by the user
+ */
+class HudiBootstrapRelation(@transient val _sqlContext: SQLContext,
+                            val userSchema: StructType,
+                            val globPaths: Seq[Path],
+                            val metaClient: HoodieTableMetaClient,
+                            val optParams: Map[String, String]) extends BaseRelation
+  with PrunedFilteredScan with Logging {
+
+  val skeletonSchema: StructType = HudiSparkUtils.getHudiMetadataSchema
+  var dataSchema: StructType = _
+  var fullSchema: StructType = _
+
+  val fileIndex: HudiBootstrapFileIndex = buildFileIndex()
+
+  override def sqlContext: SQLContext = _sqlContext
+
+  override val needConversion: Boolean = false
+
+  override def schema: StructType = inferFullSchema()
+
+  override def buildScan(requiredColumns: Array[String], filters: Array[Filter]): RDD[Row] = {
+    logInfo("Starting scan..")
+
+    // Compute splits
+    val bootstrapSplits = fileIndex.files.map(hoodieBaseFile => {
+      var skeletonFile: Option[PartitionedFile] = Option.empty
+      var dataFile: PartitionedFile = null
+
+      if (hoodieBaseFile.getExternalBaseFile.isPresent) {
+        skeletonFile = Option(PartitionedFile(InternalRow.empty, hoodieBaseFile.getPath, 0, hoodieBaseFile.getFileLen))
+        dataFile = PartitionedFile(InternalRow.empty, hoodieBaseFile.getExternalBaseFile.get().getPath, 0,
+          hoodieBaseFile.getExternalBaseFile.get().getFileLen)
+      } else {
+        dataFile = PartitionedFile(InternalRow.empty, hoodieBaseFile.getPath, 0, hoodieBaseFile.getFileLen)
+      }
+      HudiBootstrapSplit(dataFile, skeletonFile)
+    })
+    val tableState = HudiBootstrapTableState(bootstrapSplits)
+
+    // Get required schemas for column pruning
+    var requiredDataSchema = StructType(Seq())
+    var requiredSkeletonSchema = StructType(Seq())
+    requiredColumns.foreach(col => {
+      var field = dataSchema.find(_.name == col)
+      if (field.isDefined) {
+        requiredDataSchema = requiredDataSchema.add(field.get)
+      } else {
+        field = skeletonSchema.find(_.name == col)
+        requiredSkeletonSchema = requiredSkeletonSchem
[GitHub] [hudi] bvaradar commented on issue #1695: [SUPPORT] : Global Bloom Index config issue
bvaradar commented on issue #1695: URL: https://github.com/apache/hudi/issues/1695#issuecomment-638568039 @prashanthpdesai : You should be using spark.read.format("hudi") instead of parquet to read Hudi datasets. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] nsivabalan commented on pull request #1469: [HUDI-686] Implement BloomIndexV2 that does not depend on memory caching
nsivabalan commented on pull request #1469: URL: https://github.com/apache/hudi/pull/1469#issuecomment-638566215 @lamber-ken : LMK once the patch is ready to be reviewed again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] garyli1019 edited a comment on pull request #1602: [HUDI-494] fix incorrect record size estimation
garyli1019 edited a comment on pull request #1602: URL: https://github.com/apache/hudi/pull/1602#issuecomment-638550734 > are you able to verify this patch fixes the issues in your prod though? @vinothchandar Yes, worked as expected. It will skip the small commit. Edit: Added a unit test to cover this case as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hudi] kwondw opened a new pull request #1703: [HUDI-993] Let delete API use "hoodie.delete.shuffle.parallelism"
kwondw opened a new pull request #1703:
URL: https://github.com/apache/hudi/pull/1703

## What is the purpose of the pull request

For the Delete API, I noticed "hoodie.delete.shuffle.parallelism" isn't used, whereas "hoodie.upsert.shuffle.parallelism" is used for [upsert](https://github.com/apache/hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/table/action/commit/WriteHelper.java#L104). This creates a performance difference, in certain cases, between deleting via the upsert API with "EmptyHoodieRecordPayload" and the Delete API. https://issues.apache.org/jira/browse/HUDI-993 has more detail.

## Brief change log

* Let the [deduplicateKeys](https://github.com/apache/hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/table/action/commit/DeleteHelper.java#L51-L57) method use "hoodie.delete.shuffle.parallelism"
* Repartition the input RDD to "hoodie.delete.shuffle.parallelism" in case "hoodie.combine.before.delete=false"

## Verify this pull request

This change added tests and can be verified as follows:

## Committer checklist

- [X] Has a corresponding JIRA in PR title & commit
- [X] Commit message is descriptive of the change
- [ ] CI is green
- [X] Necessary doc changes done or have another open PR
- [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (HUDI-993) Use hoodie.delete.shuffle.parallelism for Delete API
[ https://issues.apache.org/jira/browse/HUDI-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-993:
--------------------------------
    Labels: pull-request-available  (was: )

> Use hoodie.delete.shuffle.parallelism for Delete API
> ----------------------------------------------------
>
>                 Key: HUDI-993
>                 URL: https://issues.apache.org/jira/browse/HUDI-993
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Performance
>            Reporter: Dongwook Kwon
>            Priority: Minor
>              Labels: pull-request-available
>
> While HUDI-328 introduced the Delete API, I noticed the [deduplicateKeys|https://github.com/apache/hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/table/action/commit/DeleteHelper.java#L51-L57] method doesn't allow any parallelism for its RDD operation, while [deduplicateRecords|https://github.com/apache/hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/table/action/commit/WriteHelper.java#L104] for upsert uses parallelism on the RDD. And "hoodie.delete.shuffle.parallelism" doesn't seem to be used.
>
> I found that in certain cases, e.g. when the input RDD has low parallelism but the target table has large files, Spark job performance suffers from the low parallelism; in such cases, upsert performance with "EmptyHoodieRecordPayload" is faster than the Delete API.
> Also, this is due to the fact that "hoodie.combine.before.upsert" is true by default; when it's not enabled, the issue would be the same.
> So I wonder whether the input RDD should be repartitioned to "hoodie.delete.shuffle.parallelism" when "hoodie.combine.before.delete" is false, for better performance regardless of "hoodie.combine.before.delete".

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hudi] garyli1019 commented on pull request #1602: [HUDI-494] fix incorrect record size estimation
garyli1019 commented on pull request #1602: URL: https://github.com/apache/hudi/pull/1602#issuecomment-638550734 > are you able to verify this patch fixes the issues in your prod though? @vinothchandar Yes, worked as expected. It will skip the small commit. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (HUDI-993) Use hoodie.delete.shuffle.parallelism for Delete API
Dongwook Kwon created HUDI-993:
----------------------------------
             Summary: Use hoodie.delete.shuffle.parallelism for Delete API
                 Key: HUDI-993
                 URL: https://issues.apache.org/jira/browse/HUDI-993
             Project: Apache Hudi
          Issue Type: Improvement
          Components: Performance
            Reporter: Dongwook Kwon

While HUDI-328 introduced the Delete API, I noticed the [deduplicateKeys|https://github.com/apache/hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/table/action/commit/DeleteHelper.java#L51-L57] method doesn't allow any parallelism for its RDD operation, while [deduplicateRecords|https://github.com/apache/hudi/blob/master/hudi-client/src/main/java/org/apache/hudi/table/action/commit/WriteHelper.java#L104] for upsert uses parallelism on the RDD. And "hoodie.delete.shuffle.parallelism" doesn't seem to be used.

I found that in certain cases, e.g. when the input RDD has low parallelism but the target table has large files, Spark job performance suffers from the low parallelism; in such cases, upsert performance with "EmptyHoodieRecordPayload" is faster than the Delete API.

Also, this is due to the fact that "hoodie.combine.before.upsert" is true by default; when it's not enabled, the issue would be the same. So I wonder whether the input RDD should be repartitioned to "hoodie.delete.shuffle.parallelism" when "hoodie.combine.before.delete" is false, for better performance regardless of "hoodie.combine.before.delete".

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
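[Editor's illustration] The parallelism gap HUDI-993 describes can be pictured with a toy model of a shuffle-based dedup. This is plain Python, not Hudi's Spark code; in Spark the analogous fix is passing a numPartitions argument to the RDD operation (e.g. `distinct(parallelism)`), which is what "hoodie.delete.shuffle.parallelism" would control:

```python
def deduplicate_keys(keys, parallelism):
    """Toy model of a shuffle-based dedup: hash-partition the keys into
    `parallelism` buckets (each bucket standing in for a Spark partition),
    then dedup each bucket independently. A key always hashes to exactly
    one bucket, so the union of the buckets is the distinct key set."""
    buckets = [set() for _ in range(parallelism)]
    for key in keys:
        buckets[hash(key) % parallelism].add(key)
    return set().union(*buckets)


keys = ["k1", "k2", "k1", "k3", "k2"]
print(sorted(deduplicate_keys(keys, parallelism=4)))  # ['k1', 'k2', 'k3']
```

The point of the issue is the bucket count: with too few partitions (e.g. inherited from a small input RDD) each "bucket" does too much work, while a configurable parallelism spreads the dedup across the cluster.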
[jira] [Updated] (HUDI-992) For hive-style partitioned source data, partition columns synced with Hive will always have String type
[ https://issues.apache.org/jira/browse/HUDI-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Udit Mehrotra updated HUDI-992:
-------------------------------
        Parent: HUDI-242
    Issue Type: Sub-task  (was: Bug)

> For hive-style partitioned source data, partition columns synced with Hive will always have String type
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-992
>                 URL: https://issues.apache.org/jira/browse/HUDI-992
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Udit Mehrotra
>            Priority: Major
>
> Currently the bootstrap implementation is not able to handle partition columns correctly when the source data has *hive-style partitioning*, as is also mentioned in https://jira.apache.org/jira/browse/HUDI-915
> The schema inferred while performing bootstrap and stored in the commit metadata does not have the partition column schema (in the case of hive-partitioned data). As a result, during hive-sync, when hudi tries to determine the type of a partition column from that schema, it will not find it and will assume the default data type, *string*.
> Here is where the partition column schema is determined for hive-sync: [https://github.com/apache/hudi/blob/master/hudi-hive-sync/src/main/java/org/apache/hudi/hive/util/HiveSchemaUtil.java#L417]
> Thus, no matter what the data type of the partition column is in the source data (at least what spark infers it as from the path), it will always be synced as string.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Created] (HUDI-992) For hive-style partitioned source data, partition columns synced with Hive will always have String type
Udit Mehrotra created HUDI-992:
----------------------------------
             Summary: For hive-style partitioned source data, partition columns synced with Hive will always have String type
                 Key: HUDI-992
                 URL: https://issues.apache.org/jira/browse/HUDI-992
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Udit Mehrotra

Currently the bootstrap implementation is not able to handle partition columns correctly when the source data has *hive-style partitioning*, as is also mentioned in https://jira.apache.org/jira/browse/HUDI-915

The schema inferred while performing bootstrap and stored in the commit metadata does not have the partition column schema (in the case of hive-partitioned data). As a result, during hive-sync, when hudi tries to determine the type of a partition column from that schema, it will not find it and will assume the default data type, *string*.

Here is where the partition column schema is determined for hive-sync: [https://github.com/apache/hudi/blob/master/hudi-hive-sync/src/main/java/org/apache/hudi/hive/util/HiveSchemaUtil.java#L417]

Thus, no matter what the data type of the partition column is in the source data (at least what spark infers it as from the path), it will always be synced as string.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
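[Editor's illustration] The failure mode HUDI-992 describes is a lookup miss with a silent default. A hypothetical Python model of the lookup (the function name and schema shape are invented; the real logic lives in HiveSchemaUtil, linked above):

```python
HIVE_DEFAULT_PARTITION_TYPE = "string"


def hive_partition_type(commit_metadata_schema, partition_field):
    """Look up the partition column's type in the schema recovered from
    commit metadata. When bootstrap omitted the partition column from that
    schema (the hive-style-partitioning case), the lookup misses and the
    synced type silently degrades to the default, "string"."""
    return commit_metadata_schema.get(partition_field, HIVE_DEFAULT_PARTITION_TYPE)


schema = {"id": "bigint", "ts": "timestamp"}  # partition column "dt" absent
print(hive_partition_type(schema, "dt"))                   # falls back to "string"
print(hive_partition_type({"dt": "date", **schema}, "dt"))  # "date" when present
```

This is why the bug is invisible until queried: the sync succeeds, but every partition column comes out typed as string regardless of what Spark inferred from the path.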
[GitHub] [hudi] vinothchandar commented on pull request #1602: [HUDI-494] fix incorrect record size estimation
vinothchandar commented on pull request #1602: URL: https://github.com/apache/hudi/pull/1602#issuecomment-638534258 Reviewing again .. are you able to verify this patch fixes the issues in your prod though? Seems like a good thing to do.. In general it’s good to be verifying in parallel without blocking on reviews here 😎 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (HUDI-991) Bootstrap Implementation Bugs
Udit Mehrotra created HUDI-991: -- Summary: Bootstrap Implementation Bugs Key: HUDI-991 URL: https://issues.apache.org/jira/browse/HUDI-991 Project: Apache Hudi Issue Type: Sub-task Reporter: Udit Mehrotra This story tracks all the bugs we encounter while testing bootstrap changes
[GitHub] [hudi] umehrot2 commented on pull request #1475: [HUDI-426][WIP] Initial implementation for Bootstrapping data source
umehrot2 commented on pull request #1475: URL: https://github.com/apache/hudi/pull/1475#issuecomment-638522351 Closing this pull request, in favor of the new pull request https://github.com/apache/hudi/pull/1702 where I have consolidated all the datasource related changes in one PR for review. It includes this read datasource part as well.
[GitHub] [hudi] umehrot2 closed pull request #1475: [HUDI-426][WIP] Initial implementation for Bootstrapping data source
umehrot2 closed pull request #1475: URL: https://github.com/apache/hudi/pull/1475
[GitHub] [hudi] umehrot2 opened a new pull request #1702: Bootstrap datasource changes
umehrot2 opened a new pull request #1702: URL: https://github.com/apache/hudi/pull/1702

## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request
*(For example: This pull request adds quick-start document.)*

## Brief change log
*(for example:)*
- *Modify AnnotationLocation checkstyle rule in checkstyle.xml*

## Verify this pull request
*(Please pick either of the following options)*
This pull request is a trivial rework / code cleanup without any test coverage.
*(or)*
This pull request is already covered by existing tests, such as *(please describe tests)*.
(or)
This change added tests and can be verified as follows:
*(example:)*
- *Added integration tests for end-to-end.*
- *Added HoodieClientWriteTest to verify the change.*
- *Manually verified the change by running a job locally.*

## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
[GitHub] [hudi] vinothchandar commented on issue #1670: Error opening Hive split: Unknown converted type TIMESTAMP_MICROS
vinothchandar commented on issue #1670: URL: https://github.com/apache/hudi/issues/1670#issuecomment-638521271 https://issues.apache.org/jira/browse/HUDI-83 Should have all the context
[GitHub] [hudi] vinothchandar commented on issue #1694: Slow Write into Hudi Dataset(MOR)
vinothchandar commented on issue #1694: URL: https://github.com/apache/hudi/issues/1694#issuecomment-638519702 Beyond the initial shuffle, Hudi will auto-tune everything, so I am not surprised. On countByKey at HoodieBloomIndex, what’s the line number? count at HoodieSparkSqlWriter is the actual writing of data. We send 100K records to the same insert partition to write larger file sizes. Can you see if there’s a skew in that stage? It’s tunable
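The skew check suggested above can be sketched as follows. This is a hypothetical helper, not Hudi code, and the partition counts are invented for illustration:

```python
# Hypothetical sketch of spotting skew in a records-per-partition
# distribution, in the spirit of "can you see if there's a skew in that
# stage?" above. A ratio near 1.0 means the partitions are balanced.
from collections import Counter

def skew_ratio(counts):
    """Largest partition size divided by the mean partition size."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean

# Invented example: one partition receives 8 of 10 records.
partition_of = ["2020-01-30"] * 8 + ["2019-10-15"] * 1 + ["2019-10-14"] * 1
counts = Counter(partition_of)
ratio = skew_ratio(counts)  # 8 / (10/3) = 2.4, a clear skew
```

In Spark one would look at the same signal in the stage's per-task input sizes; a large ratio on the insert stage suggests tuning the insert parallelism or file-sizing configs.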
[GitHub] [hudi] codecov-commenter edited a comment on pull request #1701: [HUDI-990] Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
codecov-commenter edited a comment on pull request #1701: URL: https://github.com/apache/hudi/pull/1701#issuecomment-638466637

# [Codecov](https://codecov.io/gh/apache/hudi/pull/1701?src=pr&el=h1) Report
> Merging [#1701](https://codecov.io/gh/apache/hudi/pull/1701?src=pr&el=desc) into [release-0.5.3](https://codecov.io/gh/apache/hudi/commit/5fcc461647e197e805836c6aea24e9df8c09cf0f&el=desc) will **decrease** coverage by `0.07%`.
> The diff coverage is `100.00%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1701/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1701?src=pr&el=tree)

```diff
@@               Coverage Diff                @@
##           release-0.5.3    #1701     +/-   ##
===============================================
- Coverage          69.87%   69.80%    -0.08%
+ Complexity           993      204      -789
===============================================
  Files                322      322
  Lines              15514    15521        +7
  Branches            1602     1603        +1
===============================================
- Hits               10841    10834        -7
- Misses              3958     3972       +14
  Partials             715      715
```

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1701?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [.../hudi/client/embedded/EmbeddedTimelineService.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L2VtYmVkZGVkL0VtYmVkZGVkVGltZWxpbmVTZXJ2aWNlLmphdmE=) | `75.00% <100.00%> (+0.71%)` | `0.00 <0.00> (-7.00)` | :arrow_up: |
| [.../org/apache/hudi/table/HoodieCopyOnWriteTable.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvSG9vZGllQ29weU9uV3JpdGVUYWJsZS5qYXZh) | `90.71% <100.00%> (+0.02%)` | `0.00 <0.00> (-9.00)` | :arrow_up: |
| [.../org/apache/hudi/table/HoodieMergeOnReadTable.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvSG9vZGllTWVyZ2VPblJlYWRUYWJsZS5qYXZh) | `85.71% <100.00%> (+0.08%)` | `0.00 <0.00> (ø)` | |
| [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `92.18% <100.00%> (ø)` | `0.00 <0.00> (-29.00)` | |
| [...common/table/view/FileSystemViewStorageConfig.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvRmlsZVN5c3RlbVZpZXdTdG9yYWdlQ29uZmlnLmphdmE=) | `85.07% <100.00%> (+0.94%)` | `0.00 <0.00> (-8.00)` | :arrow_up: |
| [...g/apache/hudi/exception/HoodieRemoteException.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZVJlbW90ZUV4Y2VwdGlvbi5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
| [...on/table/view/RemoteHoodieTableFileSystemView.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvUmVtb3RlSG9vZGllVGFibGVGaWxlU3lzdGVtVmlldy5qYXZh) | `77.59% <0.00%> (-5.47%)` | `0.00% <0.00%> (-6.00%)` | |
| [...e/hudi/timeline/service/FileSystemViewHandler.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvRmlsZVN5c3RlbVZpZXdIYW5kbGVyLmphdmE=) | `89.20% <0.00%> (-2.35%)` | `0.00% <0.00%> (-11.00%)` | |
| [.../hudi/common/table/view/FileSystemViewManager.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvRmlsZVN5c3RlbVZpZXdNYW5hZ2VyLmphdmE=) | `84.48% <0.00%> (+3.44%)` | `0.00% <0.00%> (-13.00%)` | :arrow_up: |
| [...n/java/org/apache/hudi/common/model/HoodieKey.java](https://codecov.io/gh/apache/hudi/pull/1701/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUtleS5qYXZh) | `94.44% <0.00%> (+5.55%)` | `0.00% <0.00%> (-4.00%)` | :arrow_up: |

-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1701?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codec
[jira] [Commented] (HUDI-983) Add Metrics section to asf-site
[ https://issues.apache.org/jira/browse/HUDI-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125171#comment-17125171 ] Raymond Xu commented on HUDI-983:
[~shenhong] Sure. Thanks for taking the initiative!

> Add Metrics section to asf-site
>
> Key: HUDI-983
> URL: https://issues.apache.org/jira/browse/HUDI-983
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Docs
> Reporter: Raymond Xu
> Assignee: Hong Shen
> Priority: Minor
> Labels: documentation, newbie
> Fix For: 0.6.0
>
> Document the use of the metrics system in Hudi, including all supported metrics reporters.
> See the example: https://user-images.githubusercontent.com/20113411/83055820-f5e97100-a086-11ea-9ea3-52b342aca9d4.png
[GitHub] [hudi] prashanthpdesai edited a comment on issue #1695: [SUPPORT] : Global Bloom Index config issue
prashanthpdesai edited a comment on issue #1695: URL: https://github.com/apache/hudi/issues/1695#issuecomment-638335917

@nsivabalan: Thank you, I was able to write successfully with the global index after pointing to the newer version of the jar, but I see the exception below while reading the parquet files. Could you please check whether this is something you can help with? Not sure why it is trying to read the .commit file, which is causing the magic-byte exception.

spark.read.parquet(basepath+"/*").show(false)

```
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:123)
  at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.org$apache$spark$executor$Executor$TaskRunner$$anonfun$$res$1(Executor.scala:412)
  at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:419)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1359)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:430)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Could not read footer for file: FileStatus{path=maprfs:///datalake/globalndextest0604/.hoodie/20200603115556.commit; isDirectory=false; length=4366; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
  at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readParquetFootersInParallel$1.apply(ParquetFileFormat.scala:551)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readParquetFootersInParallel$1.apply(ParquetFileFormat.scala:538)
  at org.apache.spark.util.ThreadUtils$$anonfun$3$$anonfun$apply$1.apply(ThreadUtils.scala:287)
  at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
  at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
  at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.RuntimeException: maprfs:///datalake/globalndextest0604/.hoodie/20200603115556.commit is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [32, 48, 10, 125]
```

info: /basepath/.hoodie/

```
drwxr-sr-x. 2 xxx xgc    0 Jun 3 11:55 archived
-rwxr-xr-x. 1 xxx xgc  207 Jun 3 11:55 hoodie.properties
-rwxr-xr-x. 1 xxx xgc    0 Jun 3 11:55 20200603115556.commit.requested
-rwxr-xr-x. 1 xxx xgc  380 Jun 3 11:56 20200603115556.inflight
-rwxr-xr-x. 1 xxx xgc 4366 Jun 3 11:56 20200603115556.commit
-rwxr-xr-x. 1 xxx xgc    0 Jun 3 11:57 20200603115719.commit.requested
-rwxr-xr-x. 1 xxx xgc  380 Jun 3 11:57 20200603115719.inflight
-rwxr-xr-x. 1 xxx xgc 5906 Jun 3 11:57 20200603115719.commit
```
[GitHub] [hudi] prashanthpdesai commented on issue #1695: [SUPPORT] : Global Bloom Index config issue
prashanthpdesai commented on issue #1695: URL: https://github.com/apache/hudi/issues/1695#issuecomment-638335917

@nsivabalan: Thank you, I was able to write successfully with the global index after pointing to the newer version of the jar, but I see the exception below while reading the parquet files. Could you please check whether this is something you can help with?

spark.read.parquet(basepath+"/*").show(false)

```
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
  at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
  at org.apache.spark.util.ThreadUtils$.parmap(ThreadUtils.scala:290)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.readParquetFootersInParallel(ParquetFileFormat.scala:538)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$9.apply(ParquetFileFormat.scala:611)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$9.apply(ParquetFileFormat.scala:603)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
  at org.apache.spark.scheduler.Task.run(Task.scala:123)
  at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.org$apache$spark$executor$Executor$TaskRunner$$anonfun$$res$1(Executor.scala:412)
  at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:419)
  at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1359)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:430)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Could not read footer for file: FileStatus{path=maprfs:///datalake/globalndextest0604/.hoodie/20200603115556.commit; isDirectory=false; length=4366; replication=0; blocksize=0; modification_time=0; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false}
  at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readParquetFootersInParallel$1.apply(ParquetFileFormat.scala:551)
  at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readParquetFootersInParallel$1.apply(ParquetFileFormat.scala:538)
  at org.apache.spark.util.ThreadUtils$$anonfun$3$$anonfun$apply$1.apply(ThreadUtils.scala:287)
  at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
  at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
  at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.RuntimeException: maprfs:///datalake/globalndextest0604/.hoodie/20200603115556.commit is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [32, 48, 10, 125]
```
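The failure above can be reproduced in miniature: Hadoop-style path globbing, unlike most shell globs, lets `*` match dot-prefixed entries such as `.hoodie`, so `spark.read.parquet(basepath+"/*")` picks up a `.commit` file (JSON, not Parquet) and fails the magic-number check. A toy sketch in plain Python, not Spark; the partition dates mirror those in the thread and the helper names are invented:

```python
# Why spark.read.parquet(basepath + "/*") trips over .hoodie: '*' in
# Hadoop-style globbing also matches hidden (dot) entries, so the reader
# picks up metadata files that are not Parquet. Helper names are invented.
import fnmatch

def expand_glob(entries, pattern):
    """Mimic Hadoop globbing: '*' matches dot-prefixed names too."""
    return [e for e in entries if fnmatch.fnmatch(e, pattern)]

def partition_dirs_only(entries):
    """Drop Hudi metadata so only partition directories are read."""
    return [e for e in entries if not e.startswith(".hoodie")]

entries = [".hoodie", "2019-10-14", "2019-10-15", "2020-01-30"]
matched = expand_glob(entries, "*")       # includes ".hoodie" -> the failure
readable = partition_dirs_only(matched)   # the three partition directories
```

In practice the usual fixes are to read the table through the Hudi datasource rather than raw Parquet, or to glob only the partition directories instead of `basepath + "/*"`.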
[GitHub] [hudi] garyli1019 commented on pull request #1602: [HUDI-494] fix incorrect record size estimation
garyli1019 commented on pull request #1602: URL: https://github.com/apache/hudi/pull/1602#issuecomment-638335182 @vinothchandar @bvaradar @nsivabalan Any thoughts on this PR? This bug is happening quite often in my production. One small commit will screw up the table.
[GitHub] [hudi] codecov-commenter edited a comment on pull request #1697: [WIP][HUDI-988] Fix issues causing Unit Test Flakiness
codecov-commenter edited a comment on pull request #1697: URL: https://github.com/apache/hudi/pull/1697#issuecomment-637484534

# [Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=h1) Report
> Merging [#1697](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/a9a97d6af47841caaa745497ec425267db0873c8&el=desc) will **increase** coverage by `0.02%`.
> The diff coverage is `42.85%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1697/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=tree)

```diff
@@             Coverage Diff              @@
##           master    #1697      +/-   ##
==========================================
+ Coverage   18.18%   18.20%    +0.02%
- Complexity    856      858        +2
==========================================
  Files         348      348
  Lines       15351    15356        +5
  Branches     1524     1525        +1
==========================================
+ Hits         2792     2796        +4
  Misses      12202    12202
- Partials      357      358        +1
```

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `52.30% <0.00%> (ø)` | `28.00 <1.00> (ø)` | |
| [.../hudi/client/embedded/EmbeddedTimelineService.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L2VtYmVkZGVkL0VtYmVkZGVkVGltZWxpbmVTZXJ2aWNlLmphdmE=) | `72.22% <50.00%> (-2.07%)` | `7.00 <1.00> (ø)` | |
| [...common/table/view/FileSystemViewStorageConfig.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvRmlsZVN5c3RlbVZpZXdTdG9yYWdlQ29uZmlnLmphdmE=) | `62.68% <50.00%> (-0.81%)` | `9.00 <1.00> (+1.00)` | :arrow_down: |
| [...apache/hudi/common/fs/HoodieWrapperFileSystem.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL0hvb2RpZVdyYXBwZXJGaWxlU3lzdGVtLmphdmE=) | `22.69% <0.00%> (+0.70%)` | `29.00% <0.00%> (+1.00%)` | |

-- [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=footer). Last update [a9a97d6...cb950c1](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [hudi] nsivabalan commented on issue #1695: [SUPPORT] : Global Bloom Index config issue
nsivabalan commented on issue #1695: URL: https://github.com/apache/hudi/issues/1695#issuecomment-638300159

```
spark.read.parquet(basepath+"/*").show(false)
+-------------------+--------------------+------------------+----------------------+------------------------------------------------------------------------+------+------+----------+----+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name                                                       |fanme |lname |ts        |uuid|
+-------------------+--------------------+------------------+----------------------+------------------------------------------------------------------------+------+------+----------+----+
|20200603155652     |20200603155652_0_2  |20                |2020-01-30            |a9e4f829-1a0d-49e0-9ed5-254808a4a4bf-0_0-22-12006_20200603155652.parquet|prabil|bal   |2020-01-30|20  |
|20200603155652     |20200603155652_2_1  |10                |2019-10-15            |0e790488-ebf3-479a-9044-d819b620d085-0_2-22-12008_20200603155652.parquet|pd    |desai1|2019-10-15|10  |
|20200603155652     |20200603155652_1_3  |11                |2019-10-14            |2ec0b40d-7a44-496b-9c92-68fe920c6111-0_1-22-12007_20200603155652.parquet|pp    |sai   |2019-10-14|11  |
+-------------------+--------------------+------------------+----------------------+------------------------------------------------------------------------+------+------+----------+----+
```

After update:

```
spark.read.parquet(basepath+"/*").show(false)
+-------------------+--------------------+------------------+----------------------+------------------------------------------------------------------------+------+-----+----------+----+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name                                                       |fanme |lname|ts        |uuid|
+-------------------+--------------------+------------------+----------------------+------------------------------------------------------------------------+------+-----+----------+----+
|20200603160032     |20200603160032_1_4  |11                |2019-10-18            |1fed9735-8932-4ac3-bdb0-d5e94e23267c-0_1-56-25537_20200603160032.parquet|pp    |sai  |2019-10-18|11  |
|20200603160032     |20200603160032_1_5  |25                |2019-10-18            |1fed9735-8932-4ac3-bdb0-d5e94e23267c-0_1-56-25537_20200603160032.parquet|rg    |fg   |2019-10-18|25  |
|20200603155652     |20200603155652_0_2  |20                |2020-01-30            |a9e4f829-1a0d-49e0-9ed5-254808a4a4bf-0_0-22-12006_20200603155652.parquet|prabil|bal  |2020-01-30|20  |
|20200603160032     |20200603160032_0_6  |10                |2019-10-17            |17f34208-cb9e-4a39-b796-a7168d3d089a-0_0-56-25536_20200603160032.parquet|pd    |desai|2019-10-17|10  |
+-------------------+--------------------+------------------+----------------------+------------------------------------------------------------------------+------+-----+----------+----+
```
[GitHub] [hudi] nsivabalan commented on issue #1695: [SUPPORT] : Global Bloom Index config issue
nsivabalan commented on issue #1695: URL: https://github.com/apache/hudi/issues/1695#issuecomment-638296488

@prashanthpdesai : I tried it and it works for me. I am using the 0.5.2-incubating bundle (org.apache.hudi:hudi-spark-bundle_2.11:0.5.2-incubating); not sure if that makes any difference. Can you try the following and let me know what output you see?

```
val table = "hudi_cow1"
val basepath = "/datalake/globalndextest"

val df3 = spark.read.option("header", "true").csv("/datalake/888/test3.csv")
val dfh4 = df3.write.format("org.apache.hudi")
  .option(RECORDKEY_FIELD_OPT_KEY, "uuid")
  .option(PARTITIONPATH_FIELD_OPT_KEY, "ts")
  .option("hoodie.index.type", "GLOBAL_BLOOM")
  .option("hoodie.bloom.index.update.partition.path", "true")
  .option(TABLE_NAME, table)
dfh4.mode(Append).save(basepath)
spark.read.parquet(basepath + "/*").show(false)

val df5 = spark.read.option("header", "true").csv("/datalake/888/test4.csv")
val dfh6 = df5.write.format("org.apache.hudi")
  .option(RECORDKEY_FIELD_OPT_KEY, "uuid")
  .option(PARTITIONPATH_FIELD_OPT_KEY, "ts")
  .option("hoodie.index.type", "GLOBAL_BLOOM")
  .option("hoodie.bloom.index.update.partition.path", "true")
  .option(TABLE_NAME, table)
dfh6.mode(Append).save(basepath)
spark.read.parquet(basepath + "/*").show(false)
```
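For readers following this thread, the effect of `hoodie.bloom.index.update.partition.path=true` can be sketched with a small plain-Scala toy model, without Spark or Hudi on the classpath. The `Rec` and `GlobalIndex` types below are hypothetical, for illustration only; Hudi's real implementation lives in its global bloom index, not in code like this:

```scala
// Toy model of a global index keyed on the record key alone (hypothetical
// types, not Hudi's API). With updatePartitionPath = true, an upsert whose
// incoming partition differs from the stored one moves the record to the new
// partition; with false, the record is updated under its original partition.
case class Rec(uuid: String, partition: String, lname: String)

class GlobalIndex(updatePartitionPath: Boolean) {
  // record key -> record; "global" means one entry per key across partitions
  private val store = scala.collection.mutable.Map[String, Rec]()

  def upsert(incoming: Rec): Unit = store.get(incoming.uuid) match {
    case Some(existing) if existing.partition != incoming.partition =>
      if (updatePartitionPath)
        store(incoming.uuid) = incoming // move record to the new partition
      else
        store(incoming.uuid) = incoming.copy(partition = existing.partition) // keep old partition
    case _ =>
      store(incoming.uuid) = incoming
  }

  def partitionOf(uuid: String): Option[String] = store.get(uuid).map(_.partition)
}

val idx = new GlobalIndex(updatePartitionPath = true)
idx.upsert(Rec("10", "2019-10-15", "desai1"))
idx.upsert(Rec("10", "2019-10-17", "desai")) // same key, new partition value
println(idx.partitionOf("10"))               // record has moved, as in the output above
```

This mirrors what the `show(false)` output above demonstrates: record key `10` ends up under `2019-10-17` after the update, rather than leaving a stale copy under `2019-10-15`.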
[jira] [Assigned] (HUDI-983) Add Metrics section to asf-site
[ https://issues.apache.org/jira/browse/HUDI-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hong Shen reassigned HUDI-983:
------------------------------

    Assignee: Hong Shen

> Add Metrics section to asf-site
> -------------------------------
>
>                 Key: HUDI-983
>                 URL: https://issues.apache.org/jira/browse/HUDI-983
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Docs
>            Reporter: Raymond Xu
>            Assignee: Hong Shen
>            Priority: Minor
>              Labels: documentation, newbie
>             Fix For: 0.6.0
>
> Document the use of the metrics system in Hudi, including all supported metrics reporters.
> See the example: https://user-images.githubusercontent.com/20113411/83055820-f5e97100-a086-11ea-9ea3-52b342aca9d4.png

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HUDI-983) Add Metrics section to asf-site
[ https://issues.apache.org/jira/browse/HUDI-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125009#comment-17125009 ]

Hong Shen commented on HUDI-983:
--------------------------------

[~rxu] I am interested in it; I will open a pull request this weekend.
[GitHub] [hudi] xushiyan commented on pull request #1698: [HUDI-986] Support staging site for per pull request
xushiyan commented on pull request #1698: URL: https://github.com/apache/hudi/pull/1698#issuecomment-638229425

> @xushiyan are you able to help review this

Sure, I can help with this.
[jira] [Commented] (HUDI-957) Umbrella ticket for sequencing common tasks required to progress/unblock RFC-08, RFC-15 & RFC-19
[ https://issues.apache.org/jira/browse/HUDI-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124926#comment-17124926 ]

Hong Shen commented on HUDI-957:
--------------------------------

[~nishith29] We would also like to work on this; please @ me if needed.

> Umbrella ticket for sequencing common tasks required to progress/unblock RFC-08, RFC-15 & RFC-19
> ------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-957
>                 URL: https://issues.apache.org/jira/browse/HUDI-957
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: Common Core, Compaction, Index, Storage Management
>            Reporter: Nishith Agarwal
>            Assignee: Nishith Agarwal
>            Priority: Major
>
> There are 3 different designs proposed in RFCs 08, 15 & 19. On further analysis, there are a number of common changes that will benefit all 3, and some that are specific to each individual design. This ticket tracks most of the common changes so they can be parallelized and all members of the community can help contribute to land them soon.
[GitHub] [hudi] codecov-commenter edited a comment on pull request #1697: [WIP][HUDI-988] Fix issues causing Unit Test Flakiness
codecov-commenter edited a comment on pull request #1697: URL: https://github.com/apache/hudi/pull/1697#issuecomment-637484534

# [Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=h1) Report
> Merging [#1697](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/a9a97d6af47841caaa745497ec425267db0873c8&el=desc) will **increase** coverage by `0.00%`.
> The diff coverage is `42.85%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1697/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=tree)

```diff
@@            Coverage Diff             @@
##           master    #1697      +/-   ##
==========================================
  Coverage   18.18%   18.19%
- Complexity    856      857       +1
==========================================
  Files         348      348
  Lines       15351    15356       +5
  Branches     1524     1525       +1
==========================================
+ Hits         2792     2794       +2
- Misses      12202    12204       +2
- Partials      357      358       +1
```

| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `52.30% <0.00%> (ø)` | `28.00 <1.00> (ø)` | |
| [.../hudi/client/embedded/EmbeddedTimelineService.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L2VtYmVkZGVkL0VtYmVkZGVkVGltZWxpbmVTZXJ2aWNlLmphdmE=) | `72.22% <50.00%> (-2.07%)` | `7.00 <1.00> (ø)` | |
| [...common/table/view/FileSystemViewStorageConfig.java](https://codecov.io/gh/apache/hudi/pull/1697/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3ZpZXcvRmlsZVN5c3RlbVZpZXdTdG9yYWdlQ29uZmlnLmphdmE=) | `62.68% <50.00%> (-0.81%)` | `9.00 <1.00> (+1.00)` | :arrow_down: |

[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=footer). Last update [a9a97d6...cb950c1](https://codecov.io/gh/apache/hudi/pull/1697?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [hudi] bvaradar commented on pull request #1697: [WIP][HUDI-988] Fix issues causing Unit Test Flakiness
bvaradar commented on pull request #1697: URL: https://github.com/apache/hudi/pull/1697#issuecomment-638027639

@vinothchandar : Yes, a subset of the related changes is present in 0.5.3 as well. It would be better to cherry-pick once we resolve all the issues.
[GitHub] [hudi] bvaradar commented on pull request #1701: [HUDI-990] Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
bvaradar commented on pull request #1701: URL: https://github.com/apache/hudi/pull/1701#issuecomment-638023012

cc @vinothchandar @nsivabalan
[jira] [Updated] (HUDI-990) Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
[ https://issues.apache.org/jira/browse/HUDI-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-990:
--------------------------------
    Labels: pull-request-available  (was: )

> Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
> ---------------------------------------------------------------------------------------------
>
>                 Key: HUDI-990
>                 URL: https://issues.apache.org/jira/browse/HUDI-990
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Common Core
>            Reporter: Balaji Varadarajan
>            Assignee: Balaji Varadarajan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.6.0, 0.5.3
>
> This bug was causing timeline server API calls during the index lookup phase to fail, with the backup local view getting constructed instead. It manifested when the new "requested" state was introduced for commits.
[GitHub] [hudi] bvaradar opened a new pull request #1701: [HUDI-990] Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
bvaradar opened a new pull request #1701: URL: https://github.com/apache/hudi/pull/1701

Contains:
1. Code changes to fix HUDI-990.
2. Fallback for the remote file-system view disabled for tests: unit tests have been made to fail when they cannot get a response from remote file-system view calls.
[jira] [Commented] (HUDI-990) Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
[ https://issues.apache.org/jira/browse/HUDI-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17124707#comment-17124707 ]

Balaji Varadarajan commented on HUDI-990:
-----------------------------------------

[~shivnarayan] : FYI : This would need to go to the 0.5.3 release.
[jira] [Updated] (HUDI-990) Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
[ https://issues.apache.org/jira/browse/HUDI-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Balaji Varadarajan updated HUDI-990:
------------------------------------
    Status: Open  (was: New)
[jira] [Assigned] (HUDI-990) Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
[ https://issues.apache.org/jira/browse/HUDI-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Balaji Varadarajan reassigned HUDI-990:
---------------------------------------
    Assignee: Balaji Varadarajan
[jira] [Updated] (HUDI-990) Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
[ https://issues.apache.org/jira/browse/HUDI-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Balaji Varadarajan updated HUDI-990:
------------------------------------
    Fix Version/s: 0.5.3
                   0.6.0
[jira] [Created] (HUDI-990) Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
Balaji Varadarajan created HUDI-990:
---------------------------------------

             Summary: Timeline API : filterCompletedAndCompactionInstants needs to handle requested state correctly
                 Key: HUDI-990
                 URL: https://issues.apache.org/jira/browse/HUDI-990
             Project: Apache Hudi
          Issue Type: Bug
          Components: Common Core
            Reporter: Balaji Varadarajan

This bug was causing timeline server API calls during index lookup phase to fail and backup local view getting constructed. This manifested when new "requested" state got introduced for commits.
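For context on what "handling the requested state" means here, the intended filter behavior can be sketched with a simplified, self-contained model. The `Instant` shape, state names, and method below are illustrative assumptions, not Hudi's actual `HoodieDefaultTimeline` code: a completed-and-compaction filter should keep compaction instants in any state, but must exclude non-compaction instants that are still in the requested (or inflight) state.

```scala
// Simplified timeline model (hypothetical types; the real logic lives in
// HoodieDefaultTimeline). An instant is an (action, state) pair on the
// timeline. filterCompletedAndCompactionInstants keeps:
//   - any instant whose state is Completed, and
//   - compaction instants in ANY state (requested/inflight/completed),
// so a non-compaction instant in Requested state is correctly dropped.
sealed trait State
case object Requested extends State
case object Inflight  extends State
case object Completed extends State

case class Instant(timestamp: String, action: String, state: State)

def filterCompletedAndCompactionInstants(timeline: Seq[Instant]): Seq[Instant] =
  timeline.filter(i => i.state == Completed || i.action == "compaction")

val timeline = Seq(
  Instant("001", "commit",      Completed), // kept: completed
  Instant("002", "compaction",  Requested), // kept: compaction, any state
  Instant("003", "commit",      Requested), // dropped: requested non-compaction
  Instant("004", "deltacommit", Inflight)   // dropped: not yet completed
)
val filtered = filterCompletedAndCompactionInstants(timeline)
```

Under this sketch, a filter that forgot about the new requested state could mistakenly surface instant "003" to timeline-server clients, which matches the failure mode described in the issue.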