This is an automated email from the ASF dual-hosted git repository.
vhs pushed a change to branch variant-intro-shredded-avro-support
in repository https://gitbox.apache.org/repos/asf/hudi.git
discard f79a417daee0 feat(schema): Add read + write support for shredded for
AVRO
discard d4ba92bc2233 feat(schema): Config path implemented for spark record
type
discard 99eadafca65f feat(schema): Add support to write shredded variants
discard 8756466024f4 Address comments 2
discard 3fd0ae308e6a Address comments
discard a1f6bd7ef6e8 Make VariantLogicalType compare against singleton
discard 288989ea18e6 Support reading and writing of Variant Types - Add
adapter pattern for Spark3 and 4 - Cleanup invariant issue in SparkSqlWriter -
Add cross engine test - Add backward compatibility test for Spark3.x - Add
cross engine read for Flink
add 0591ee45ae74 feat: add flink HoodieSourceSplitComparator (#18009)
add 1ed2821d2918 feat: Add configurable cleaner policy for metadata table
(#17935)
add 4e51856bf1fc fix: Optimize timeline fetch in HoodieROTablePathFilter
(#17859)
add ada7196d459f fix: Adding tests for rolling back on commits older than
replacecommit and compaction commits (#17932)
add 9db8dcf7205f perf: Reduce memory fetched to driver for
getAllPartitions API by removing unused objects (#17947)
add 84e8324aacb9 fix: include Hoodie metadata fields when reading Parquet
files in precommit validators (#17505)
add a7562a31d65d fix: Allow configurable storage level while computing
expression index update (#17737)
add eafe069c8da9 feat(schema): Update schema repair tools to work on
HoodieSchema (#17952)
add 341cce6242ef [HUDI-2190] Handle null actionState in
LegacyArchivedMetaEntryReader (#18024)
add d87df0f3ea7d feat: publish clean and archival duration metrics in
finally block (#17945)
add 6a9e40cb50a0 fix: enable Hive support when creating JavaSparkContext
for Spark SQL queries (#17510)
add 8b5838791abb feat: enable new source integration in
`HoodieTableSource` (#18022)
add d6eb59677474 feat: Integrate the mdt compaction with existing flink
compaction pipeline (#17991)
add 7510b1b243ec perf: Bloom filter improvements for memory usage (#18015)
add 983c03d098e3 feat: Support slash separated date partitioning for Hudi
tables (#17787)
add 21eb05e75dd1 fix: Use TableSchemaResolver in setWriteSchemaForDeletes
for better schema resolution (#18030)
add 4a7b6230b643 feat(metadata): Handle metadata table service failures
gracefully and emit metrics (#17930)
add 04326111808a fix: allows eager failure from abnormals for streaming
write (#12150)
add df58b8b24014 address feedback (#18063)
add c8dddc0f3b69 fix(utilities): Use passed-in configs when propsFilePath
is null or empty in HoodieStreamer (#17467)
add 00f58313b13b fix: Add config version information to DataSourceOptions
(#17733)
add c29309892a57 fix: Ensure Lance works when populateMetaFields is false
with user defined keygen (#18042)
add 6720849d312e refactor: Add Lombok annotations to hudi-common module
(part 4) (#17830)
add 55510c436489 refactor: Add Lombok annotations to hudi-utilities (Part
2) (#17876)
add ff4da473baf8 fix: reload table config after record index bootstrap to
avoid bloom index fallback (#17508)
add 5b741b6b7da2 refactor: migrate to ScanV2Internal API and remove
ENABLE_OPTIMIZED_LOG_BLOCKS_SCAN config (#17520)
add a59cd1290786 fix: Handle Non-Null Complex Types with Nullable Elements
in ParquetSchemaConverter (#18087)
add 4c819a5dad44 perf: Support lazy clean of the RLI cache during bucket
assigning (#18018)
add 6eaf80929630 fix: correct deleted keys computation in
computeRevivedAndDeletedKeys (#18094)
add 86c5c8151cd5 fix: disable retries in s3/gcs storage lock clients for
storage based LP (#17869)
add 695b9cf52976 feat(schema): Remove direct reliance on Avro for schema
compatibility checks (#18006)
add 3aa3fd1b530a fix: exit transaction with error in storage LP when
unlock failure due to lock acquired by others (#17871)
add 63275b32fdea perf: Avoid re-fetching file status from FS for HFile
readers (#17709)
add 013a1658f4fb feat(schema): Remove usage of migrated AvroSchemaUtils
and HoodieAvroUtils methods (part 1) (#18007)
add ccce6e1ec38b feat: support flink split distribution strategy (#18082)
add d92e99bc74fe feat: Lance schema evolution (add column, type promotion)
(#17904)
add 72c9f6f9f20d feat(schema): Minor cleanup of Avro schema usage (#18043)
add 241450e7fdc2 feat: support partition pruner in Flink hudi source v2
(#18074)
add 14d8894bbc9d refactor: apply lombok for flink source v2 related
classes (#18122)
add 76f8ffbe8bf9 refactor: Add Lombok annotations to hudi-common module
(part 6) (#17880)
add 4c9dcb32bd58 [MINOR] Preload file listing for partitions in BloomIndex
to avoid repeated listings (#17462)
add 19652715142a fix: (table-services) When using multiwriter do not
delete pending rollback plan if exception is thrown while reading it (#18093)
add dbbf86d6160b Adds a guardrail to prevent the creation of the
SparkRDDWriteClient when Spark's speculative execution is enabled (#18045)
add b7f44683551c fix: interrupt storage LP when heartbeat fails (#17870)
add 51a0e81e831a fix: correct unsigned int conversion in
TestProtoConversionUtil (#18120)
add 1ae0d5e9ac97 feat: add flink stream read metrics for hudi source v2
(#18130)
add cabaf50dfb56 [MINOR] Fix HoodieLockMetrics.createTimerForMetrics to
not share metric timer (#18097)
add 894df4905b3a feat(schema): Consolidate null type handling (#18163)
add 419f232f716a [HUDI-9730] RFC-99 Hudi Type System (#13743)
add 589264c49ed0 fix: flink source v2 serializability (#18165)
add 32440130bcd4 feat: Add metadata record_index lookup command to Hudi
CLI (#17940)
add 5720910a4d61 test: add unit test for multiple partition filters on
same column (#17934)
add cf4a19411b8b feat: add Presto to Hudi Notebooks (#18078)
add 012b572c1b26 [MINOR] Publish HUDI version metrics as integers (#17466)
add 7a88b6bedf66 refactor: Add Lombok annotations to hudi-common module
(part 5) (#17878)
add b584c3cf311d test(concurrency): add tests for write conflicts with
different conflict resolution strategies (#17501)
add ef484d427884 fix: Include metadata file cache size option in the
configuration for HFile reader (#18175)
add 513f62c12a25 fix(spark): Fix TestSparkSchemaUtils failing with Spark
3.3 due to timestamp_ntz (#17917)
add bd52f592570b fix(flink): include exception stacktrace in error logs
(#18091)
add 3d5cf2696280 feat: Publish commits to process metrics for
HoodieStreamer (#17929)
add 9788b8da19d0 fix: Use local engine context for clean planning on
metadata and non-partitioned tables (#17942)
add 19095986ed9e perf(common): Make ThreadLocal variables in
HoodieAvroDataBlock static (#18023)
add 53ee07be3e03 fix(metadata-table): Fix failed deletes when updating MDT
with clean metadata (#18035)
add eeb064217d2f fix unsigned values in proto conversion to be positive
(#18186)
add 762fbea864e9 feat(blob): update approach to remove reliance on column
groups, break down plan (#18013)
add 833ef62055e5 fix: Empty write should not cause spark analysis errors
with pre-commit validators (#18128)
add 1846230d84de fix: throw correct exception when reading
hoodie.properties file without access (#18176)
add 2462ae5ce7cc refactor: Remove redundancy in index validation logic in
HoodieIndexU… (#17911)
add 1577d6eb5834 fix: SimpleAvro-, NonpartitionedAvro- and
ComplexAvroKeyGenerator are also valid for writing by Spark when meta-fields
are disabled (#18187)
add 5180b49a4e77 feat(flink): lookup join with retry and async
capabilities (#18193)
add a7d301a2c0a7 fix: revert (feat: support mini batch split reader)
(#18200)
add bce8d598c4a5 fix(flink): Use blocking instant generation when CDC is
enabled (#18206)
add dfe322094005 refactor: Remove not used classes from
`org.apache.hudi.spark.internal` (#18211)
add 9fa5e8515d5d chore: Add .claude and .codex directories to .gitignore
(#18213)
add eb1d7729a1bb fix(trino): Fix Docker initialization issue in the Trino
plugin (#18220)
add 431b4d8e0e33 docs(spark): Update description of modules related to
integration with Spark (#18219)
add 3bab09a93214 fix: Handle case when 0 byte completed commit files
present in the timeline (#18210)
add 9bb88cf935b0 feat(blob): Blob schema definition (#18108)
add 057af9e0694d chore(ci): Add Codecov coverage report in GitHub actions
(#18230)
add cf1d9a1e223b Support reading and writing of Variant Types - Add
adapter pattern for Spark3 and 4 - Cleanup invariant issue in SparkSqlWriter -
Add cross engine test - Add backward compatibility test for Spark3.x - Add
cross engine read for Flink - Make VariantLogicalType compare against singleton
add d3bb0d3750f1 feat(schema): Add support to write shredded variants
add d1e8c0d46287 feat(schema): Config path implemented for spark record
type
add e851bd1fe2a6 feat(schema): Add read + write support for shredded for
AVRO
This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:
 * -- * -- B -- O -- O -- O   (f79a417daee0)
            \
             N -- N -- N   refs/heads/variant-intro-shredded-avro-support (e851bd1fe2a6)
You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.
Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.
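
As an illustration of the diagram above, here is a minimal Python sketch that models the history as a child-to-parent map and derives the common base B, the old-only revisions (O) and the new-only revisions (N). The commit names (`a`, `b`, `O1`, `N1`, ...) and the helpers `ancestry` and `classify_force_push` are hypothetical, chosen only for this example; the sketch assumes a linear, single-parent history and is not how git or the notification tooling computes this.

```python
def ancestry(tip, parent):
    """Walk from tip back to the root, returning commits newest first.

    Assumes every commit has at most one parent (no merges).
    """
    chain = []
    node = tip
    while node is not None:
        chain.append(node)
        node = parent.get(node)
    return chain


def classify_force_push(old_tip, new_tip, parent):
    """Return (base, old_only, new_only) for a branch moved from old_tip to new_tip."""
    old_chain = ancestry(old_tip, parent)
    new_chain = ancestry(new_tip, parent)
    old_set, new_set = set(old_chain), set(new_chain)
    # The common base is the newest commit on the old branch
    # that is also reachable from the new branch.
    base = next((c for c in old_chain if c in new_set), None)
    old_only = [c for c in old_chain if c not in new_set]  # the "O" revisions
    new_only = [c for c in new_chain if c not in old_set]  # the "N" revisions
    return base, old_only, new_only


# Toy history mirroring the diagram: a -- b -- B, then O1..O3 and N1..N3.
parent = {
    "a": None, "b": "a", "B": "b",
    "O1": "B", "O2": "O1", "O3": "O2",
    "N1": "B", "N2": "N1", "N3": "N2",
}
base, old_only, new_only = classify_force_push("O3", "N3", parent)
print("base:", base)
print("old-only (O):", old_only)
print("new-only (N):", new_only)
```

Whether an old-only revision is then reported as "omit" or "discard" depends on whether any other reference still reaches it, which this sketch does not model.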
No new revisions were added by this update.
Summary of changes:
.codecov.yml | 2 +-
.github/workflows/bot.yml | 90 +-
.gitignore | 5 +
.../aws/transaction/lock/S3StorageLockClient.java | 5 +-
.../apache/hudi/cli/commands/MetadataCommand.java | 67 +-
.../cli/commands/TestHoodieLogFileCommand.java | 1 -
.../hudi/cli/commands/TestMetadataCommand.java | 160 ++++
.../hudi/client/BaseHoodieTableServiceClient.java | 32 +-
.../apache/hudi/client/BaseHoodieWriteClient.java | 41 +-
.../transaction/lock/StorageBasedLockProvider.java | 6 +-
.../lock/metrics/HoodieLockMetrics.java | 6 +-
.../lock/models/LockProviderHeartbeatManager.java | 27 +-
.../utils/LegacyArchivedMetaEntryReader.java | 8 +-
.../org/apache/hudi/config/HoodieCleanConfig.java | 15 +
.../apache/hudi/config/HoodieCompactionConfig.java | 7 -
.../org/apache/hudi/config/HoodieWriteConfig.java | 48 +-
.../org/apache/hudi/index/HoodieIndexUtils.java | 101 +--
.../apache/hudi/index/bloom/HoodieBloomIndex.java | 6 +
.../index/bloom/HoodieBloomIndexCheckFunction.java | 45 +-
.../bloom/ListBasedHoodieBloomIndexHelper.java | 8 +-
.../java/org/apache/hudi/io/BaseCreateHandle.java | 8 +-
.../hudi/io/FileGroupReaderBasedAppendHandle.java | 2 +-
.../hudi/io/FileGroupReaderBasedMergeHandle.java | 4 +
.../org/apache/hudi/io/HoodieAppendHandle.java | 10 +-
.../org/apache/hudi/io/HoodieKeyLookupHandle.java | 11 +-
.../org/apache/hudi/io/HoodieKeyLookupResult.java | 4 +-
.../org/apache/hudi/io/HoodieWriteMergeHandle.java | 6 +-
.../hudi/keygen/ComplexAvroKeyGenerator.java | 3 +-
.../java/org/apache/hudi/keygen/KeyGenUtils.java | 20 +-
.../apache/hudi/keygen/SimpleAvroKeyGenerator.java | 3 +-
.../metadata/HoodieBackedTableMetadataWriter.java | 89 +-
.../hudi/metadata/HoodieMetadataWriteUtils.java | 39 +-
.../hudi/metadata/HoodieTableMetadataWriter.java | 5 +
.../apache/hudi/metadata/RecordIndexMapper.java | 2 +-
.../SecondaryIndexRecordGenerationUtils.java | 3 +-
.../org/apache/hudi/metrics/HoodieMetrics.java | 9 +
.../java/org/apache/hudi/table/HoodieTable.java | 2 +-
.../table/action/clean/CleanActionExecutor.java | 6 +-
.../action/clean/CleanPlanActionExecutor.java | 12 +-
.../strategy/ClusteringExecutionStrategy.java | 7 +-
.../hudi/table/action/compact/CompactHelpers.java | 15 +-
.../client/transaction/TestHoodieLockMetrics.java | 32 +
.../lock/TestStorageBasedLockProvider.java | 22 +
.../models/TestLockProviderHeartbeatManager.java | 148 +++-
.../client/utils/TestFileSliceMetricUtils.java | 4 +-
.../apache/hudi/index/TestHoodieIndexUtils.java | 73 +-
.../TestHoodieBackedTableMetadataWriter.java | 132 ++-
.../metrics/prometheus/TestPrometheusReporter.java | 10 +-
.../action/clean/TestCleanPlanActionExecutor.java | 33 +
.../GenericRecordValidationTestUtils.java | 2 -
.../hudi/client/HoodieFlinkTableServiceClient.java | 3 +
.../apache/hudi/client/HoodieFlinkWriteClient.java | 8 +
.../hudi/client/model/HoodieFlinkRecord.java | 4 +-
.../io/storage/row/HoodieRowDataCreateHandle.java | 5 +
.../row/parquet/ParquetSchemaConverter.java | 5 +-
.../FlinkHoodieBackedTableMetadataWriter.java | 15 +-
.../org/apache/hudi/table/HoodieFlinkTable.java | 2 +-
.../apache/hudi/util/HoodieSchemaConverter.java | 105 +++
.../index/bloom/TestFlinkHoodieBloomIndex.java | 9 +-
.../row/parquet/TestParquetSchemaConverter.java | 47 +-
.../hudi/util/TestHoodieSchemaConverter.java | 157 ++++
.../apache/hudi/client/HoodieJavaWriteClient.java | 7 +-
.../hudi/client/TestJavaHoodieBackedMetadata.java | 7 +-
.../read/HoodieFileGroupReaderOnJavaTestBase.java | 2 +-
.../apache/hudi/client/SparkRDDWriteClient.java | 26 +
.../MultipleSparkJobExecutionStrategy.java | 3 +-
.../client/common/SparkReaderContextFactory.java | 5 +-
.../client/utils/SparkMetadataWriterUtils.java | 8 +-
.../hudi/client/utils/SparkPartitionUtils.java | 8 +-
.../hudi/client/utils/SparkValidatorUtils.java | 78 +-
.../validator/SqlQueryPreCommitValidator.java | 1 +
.../hudi/common/model/HoodieSparkRecord.java | 4 +-
.../bloom/HoodieBloomFilterProbingResult.java | 4 +-
.../index/bloom/HoodieFileProbingFunction.java | 46 +-
.../HoodieMetadataBloomFilterProbingFunction.java | 4 +-
.../index/bloom/SparkHoodieBloomIndexHelper.java | 12 +-
.../hudi/io/storage/HoodieSparkLanceReader.java | 2 +-
.../hudi/io/storage/HoodieSparkParquetReader.java | 5 +-
.../hudi/io/storage/row/HoodieRowCreateHandle.java | 4 +
.../storage/row/HoodieRowParquetWriteSupport.java | 4 +-
.../apache/hudi/keygen/BuiltinKeyGenerator.java | 6 +-
.../hudi/keygen/PartitionPathFormatterBase.java | 14 +-
.../hudi/keygen/StringPartitionPathFormatter.java | 7 +-
.../keygen/UTF8StringPartitionPathFormatter.java | 7 +-
.../hudi/table/HoodieSparkCopyOnWriteTable.java | 4 +-
.../bootstrap/BaseBootstrapMetadataHandler.java | 4 +-
.../commit/BaseSparkCommitActionExecutor.java | 4 +-
.../scala/org/apache/hudi/HoodieSparkUtils.scala | 33 +-
.../SparkFileFormatInternalRecordContext.scala | 2 -
.../apache/spark/sql/HoodieInternalRowUtils.scala | 37 -
.../sql/avro/HoodieSparkSchemaConverters.scala | 49 +-
.../datasources/SparkSchemaTransformUtils.scala | 436 +++++++++
.../parquet/HoodieParquetFileFormatHelper.scala | 178 +---
.../parquet/SparkBasicSchemaEvolution.scala | 5 +-
.../hudi/MultipleColumnarFileFormatReader.scala | 12 +-
.../org/apache/spark/sql/hudi/SparkAdapter.scala | 30 +-
.../callback/TestHoodieClientInitCallback.java | 14 +
.../org/apache/hudi/client/TestClientRollback.java | 49 +-
.../client/TestSparkRDDMetadataWriteClient.java | 1 -
.../hudi/client/TestSparkRDDWriteClient.java | 87 ++
...parkConsistentBucketClusteringPlanStrategy.java | 4 +-
.../TestSparkSizeBasedClusteringPlanStrategy.java | 2 +-
.../functional/TestHoodieBackedTableMetadata.java | 1 -
.../hudi/client/utils/TestSparkValidatorUtils.java | 98 ++
.../hudi/index/bloom/TestHoodieBloomIndex.java | 70 +-
.../hudi/keygen/TestComplexKeyGenerator.java | 40 +
.../apache/hudi/keygen/TestCustomKeyGenerator.java | 44 +
.../apache/hudi/keygen/TestSimpleKeyGenerator.java | 45 +
.../keygen/TestTimestampBasedKeyGenerator.java | 6 +-
...TestSparkBuildClusteringGroupsForPartition.java | 2 +-
.../TestSparkSchemaTransformUtils.scala | 419 +++++++++
.../org/apache/hudi/BaseHoodieTableFileIndex.java | 2 +-
.../main/java/org/apache/hudi/HoodieVersion.java | 136 +++
.../AvroSchemaComparatorForRecordProjection.java | 80 --
.../AvroSchemaComparatorForSchemaEvolution.java | 357 --------
.../java/org/apache/hudi/avro/AvroSchemaUtils.java | 559 ------------
.../java/org/apache/hudi/avro/HoodieAvroUtils.java | 281 +-----
.../index/hfile/HFileBootstrapIndexReader.java | 4 +-
.../hudi/common/config/HoodieIndexingConfig.java | 13 +
.../hudi/common/config/HoodieMetadataConfig.java | 38 +
.../hudi/common/config/HoodieReaderConfig.java | 8 -
.../apache/hudi/common/config/RecordMergeMode.java | 2 +-
.../hudi/common/data/HoodieListPairData.java | 6 +-
.../hudi/common/engine/HoodieReaderContext.java | 2 +-
.../java/org/apache/hudi/common/fs/FSUtils.java | 11 +
.../apache/hudi/common/model/BaseAvroPayload.java | 6 +-
.../org/apache/hudi/common/model/BaseFile.java | 80 +-
.../hudi/common/model/BootstrapFileMapping.java | 71 +-
.../hudi/common/model/BootstrapIndexType.java | 15 +-
.../apache/hudi/common/model/CleanFileInfo.java | 38 +-
.../hudi/common/model/ClusteringGroupInfo.java | 35 +-
.../hudi/common/model/ClusteringOperation.java | 93 +-
.../hudi/common/model/CompactionOperation.java | 84 +-
.../hudi/common/model/ConsistentHashingNode.java | 23 +-
.../org/apache/hudi/common/model/DeleteRecord.java | 55 +-
.../common/model/EmptyHoodieRecordPayload.java | 5 +-
.../org/apache/hudi/common/model/FileSlice.java | 59 +-
.../hudi/common/model/HoodieAvroIndexedRecord.java | 7 +-
.../hudi/common/model/HoodieAvroPayload.java | 20 +-
.../apache/hudi/common/model/HoodieAvroRecord.java | 13 +-
.../apache/hudi/common/model/HoodieBaseFile.java | 24 +-
.../hudi/common/model/HoodieCommitMetadata.java | 109 +--
.../model/HoodieConsistentHashingMetadata.java | 46 +-
.../hudi/common/model/HoodieDeltaWriteStat.java | 36 +-
.../hudi/common/model/HoodieEmptyRecord.java | 3 +-
.../apache/hudi/common/model/HoodieFileGroup.java | 46 +-
.../hudi/common/model/HoodieFileGroupId.java | 32 +-
.../hudi/common/model/HoodieIndexMetadata.java | 14 +-
.../org/apache/hudi/common/model/HoodieKey.java | 57 +-
.../common/model/HoodieLSMTimelineManifest.java | 34 +-
.../apache/hudi/common/model/HoodieLogFile.java | 65 +-
.../hudi/common/model/HoodieMetadataWrapper.java | 13 +-
.../apache/hudi/common/model/HoodieOperation.java | 19 +-
.../hudi/common/model/HoodiePartitionMetadata.java | 13 +-
.../org/apache/hudi/common/model/HoodieRecord.java | 31 +-
.../model/HoodieRecordCompatibilityInterface.java | 4 +-
.../hudi/common/model/HoodieRecordDelegate.java | 30 +-
.../common/model/HoodieRecordGlobalLocation.java | 50 +-
.../hudi/common/model/HoodieRecordLocation.java | 67 +-
.../common/model/HoodieReplaceCommitMetadata.java | 62 +-
.../hudi/common/model/HoodieRollingStat.java | 62 +-
.../common/model/HoodieRollingStatMetadata.java | 20 +-
.../hudi/common/model/HoodieTimelineTimeZone.java | 19 +-
.../apache/hudi/common/model/HoodieWriteStat.java | 196 +---
.../apache/hudi/common/model/MetadataValues.java | 32 +-
.../common/model/PartialUpdateAvroPayload.java | 8 +-
.../model/PartitionBucketIndexHashingConfig.java | 52 +-
.../debezium/AbstractDebeziumAvroPayload.java | 6 +-
.../model/debezium/MySqlDebeziumAvroPayload.java | 6 +-
.../debezium/PostgresDebeziumAvroPayload.java | 5 +-
.../apache/hudi/common/schema/HoodieSchema.java | 258 +++++-
.../HoodieSchemaComparatorForRecordProjection.java | 119 +++
.../HoodieSchemaComparatorForSchemaEvolution.java | 7 +-
.../common/schema/HoodieSchemaCompatibility.java | 188 +++-
.../schema/HoodieSchemaCompatibilityChecker.java} | 635 ++++---------
.../hudi/common/schema/HoodieSchemaField.java | 32 +-
.../schema/HoodieSchemaProjectionChecker.java | 234 +++++
.../hudi/common/schema/HoodieSchemaType.java | 8 +
.../common/schema/HoodieSchemaTypePromotion.java | 137 +++
.../hudi/common/schema/HoodieSchemaUtils.java | 248 ++++--
.../hudi/common/table/HoodieTableConfig.java | 5 +
.../hudi/common/table/HoodieTableMetaClient.java | 12 +
.../hudi/common/table/cdc/HoodieCDCExtractor.java | 5 +-
.../hudi/common/table/cdc/HoodieCDCFileSplit.java | 13 +-
.../hudi/common/table/cdc/HoodieCDCUtils.java | 5 +-
.../table/log/AbstractHoodieLogRecordScanner.java | 168 +---
.../table/log/BaseHoodieLogRecordReader.java | 186 +---
.../table/log/HoodieLogBlockMetadataScanner.java | 8 +-
.../table/log/HoodieMergedLogRecordReader.java | 12 +-
.../table/log/HoodieMergedLogRecordScanner.java | 15 +-
.../table/log/HoodieUnMergedLogRecordScanner.java | 14 +-
.../table/log/block/HoodieAvroDataBlock.java | 36 +-
.../table/log/block/HoodieHFileDataBlock.java | 24 +-
.../common/table/read/HoodieFileGroupReader.java | 18 +-
.../hudi/common/table/read/HoodieReadStats.java | 88 +-
.../table/read/IncrementalQueryAnalyzer.java | 10 +-
.../hudi/common/table/read/ReaderParameters.java | 16 +-
.../read/buffer/LogScanningRecordBufferLoader.java | 1 -
.../hudi/common/table/timeline/TimelineUtils.java | 6 +
.../common/table/timeline/dto/BaseFileDTO.java | 6 +-
.../hudi/common/table/timeline/dto/LogFileDTO.java | 2 +-
.../hudi/common/util/BufferedRandomAccessFile.java | 5 +-
.../org/apache/hudi/common/util/CleanerUtils.java | 9 +-
.../apache/hudi/common/util/ClusteringUtils.java | 8 +-
.../org/apache/hudi/common/util/CommitUtils.java | 10 +-
.../org/apache/hudi/common/util/ConfigUtils.java | 24 +-
.../org/apache/hudi/common/util/DateTimeUtils.java | 12 +-
.../org/apache/hudi/common/util/HFileUtils.java | 12 +-
.../common/util/HoodieRecordSizeEstimator.java | 14 +-
.../hudi/common/util/InternalSchemaCache.java | 8 +-
.../org/apache/hudi/common/util/LanceUtils.java | 39 +-
.../org/apache/hudi/common/util/MarkerUtils.java | 10 +-
.../org/apache/hudi/common/util/RateLimiter.java | 10 +-
.../hudi/common/util/RemotePartitionHelper.java | 7 +-
.../org/apache/hudi/common/util/RetryHelper.java | 14 +-
.../hudi/common/util/RocksDBSchemaHelper.java | 35 +-
.../apache/hudi/common/util/SpillableMapUtils.java | 9 +-
.../apache/hudi/common/util/TablePathUtils.java | 8 +-
.../org/apache/hudi/common/util/TypeUtils.java | 7 +-
.../common/util/collection/BitCaskDiskMap.java | 47 +-
.../hudi/common/util/collection/DiskMap.java | 10 +-
.../util/collection/ExternalSpillableMap.java | 22 +-
.../hudi/common/util/collection/FlatLists.java | 5 +-
.../hudi/common/util/collection/ImmutablePair.java | 22 +-
.../common/util/collection/ImmutableTriple.java | 29 +-
.../common/util/collection/LazyFileIterable.java | 7 +-
.../hudi/common/util/collection/RocksDBDAO.java | 52 +-
.../common/util/collection/RocksDbDiskMap.java | 8 +-
.../common/util/queue/BoundedInMemoryExecutor.java | 12 +-
.../common/util/queue/BoundedInMemoryQueue.java | 10 +-
.../common/util/queue/DisruptorMessageQueue.java | 12 +-
.../util/queue/FunctionBasedQueueProducer.java | 10 +-
.../util/queue/IteratorBasedQueueProducer.java | 10 +-
.../hudi/common/util/queue/SimpleExecutor.java | 10 +-
.../exception/MissingSchemaFieldException.java | 11 +-
.../SchemaBackwardsCompatibilityException.java | 13 +-
.../apache/hudi/io/storage/HFileReaderFactory.java | 2 +-
.../io/storage/HoodieNativeAvroHFileReader.java | 3 +-
.../org/apache/hudi/keygen/BaseKeyGenerator.java | 3 +
.../hudi/keygen/constant/KeyGeneratorOptions.java | 8 +
.../hudi/keygen/constant/KeyGeneratorType.java | 9 +
.../metadata/FileSystemBackedTableMetadata.java | 17 +-
.../hudi/metadata/HoodieBackedTableMetadata.java | 3 +-
.../hudi/metadata/HoodieMetadataMetrics.java | 3 +
.../hudi/metadata/HoodieTableMetadataUtil.java | 50 +-
.../java/org/apache/hudi/stats/ValueMetadata.java | 3 +-
.../main/java/org/apache/hudi/stats/ValueType.java | 3 +-
.../src/main/java/org/apache/hudi/util/Lazy.java | 3 +
.../apache/parquet/schema/AvroSchemaRepair.java | 238 -----
.../apache/parquet/schema/HoodieSchemaRepair.java | 245 +++++
.../java/org/apache/hudi/TestHoodieVersion.java | 67 ++
.../org/apache/hudi/avro/AvroSchemaTestUtils.java | 71 --
...TestAvroSchemaComparatorForSchemaEvolution.java | 499 -----------
.../org/apache/hudi/avro/TestAvroSchemaUtils.java | 810 -----------------
.../org/apache/hudi/avro/TestHoodieAvroUtils.java | 383 +-------
.../common/config/TestHoodieMetadataConfig.java | 31 +
.../common/model/TestHoodieCommitMetadata.java | 36 +-
.../hudi/common/schema/TestHoodieSchema.java | 263 ++++++
...stHoodieSchemaComparatorForSchemaEvolution.java | 109 +++
.../schema/TestHoodieSchemaCompatibility.java | 112 ++-
.../hudi/common/schema/TestHoodieSchemaField.java | 12 +
.../hudi/common/schema/TestHoodieSchemaType.java | 5 +
.../hudi/common/schema/TestHoodieSchemaUtils.java | 309 +------
.../table/read/TestHoodieFileGroupReaderBase.java | 229 ++++-
.../common/testutils/HoodieTestDataGenerator.java | 24 +-
.../apache/hudi/common/util/TestConfigUtils.java | 22 +
.../hudi/io/storage/TestHFileReaderFactory.java | 2 +-
.../parquet/schema/TestAvroSchemaRepair.java | 983 ---------------------
.../parquet/schema/TestHoodieSchemaRepair.java | 900 +++++++++++++++++++
.../apache/hudi/configuration/FlinkOptions.java | 53 +-
.../apache/hudi/configuration/OptionsResolver.java | 29 +-
.../hudi/metrics/FlinkStreamReadMetrics.java | 4 +-
.../java/org/apache/hudi/sink/CleanFunction.java | 61 +-
.../hudi/sink/StreamWriteOperatorCoordinator.java | 100 ++-
.../hudi/sink/clustering/ClusteringCommitSink.java | 2 +-
.../apache/hudi/sink/compact/CompactOperator.java | 133 +--
.../hudi/sink/compact/CompactionCommitEvent.java | 14 +-
.../hudi/sink/compact/CompactionCommitSink.java | 152 +---
.../hudi/sink/compact/CompactionPlanEvent.java | 17 +-
.../hudi/sink/compact/CompactionPlanOperator.java | 124 +--
.../hudi/sink/compact/FlinkCompactionConfig.java | 2 +-
.../hudi/sink/compact/handler/CleanHandler.java | 118 +++
.../sink/compact/handler/CompactCommitHandler.java | 257 ++++++
.../hudi/sink/compact/handler/CompactHandler.java | 201 +++++
.../compact/handler/CompactionPlanHandler.java | 220 +++++
.../handler/MetadataCompactCommitHandler.java | 138 +++
.../compact/handler/MetadataCompactHandler.java | 114 +++
.../handler/MetadataCompactionPlanHandler.java | 179 ++++
.../sink/partitioner/BucketAssignFunction.java | 2 +-
.../hudi/sink/partitioner/index/IndexBackend.java | 5 +-
.../sink/partitioner/index/RecordIndexCache.java | 111 ++-
.../partitioner/index/RecordLevelIndexBackend.java | 11 +-
.../org/apache/hudi/sink/v2/utils/PipelinesV2.java | 2 +-
.../org/apache/hudi/source/HoodieScanContext.java | 68 ++
.../java/org/apache/hudi/source/HoodieSource.java | 108 ++-
.../apache/hudi/source/IncrementalInputSplits.java | 14 +
.../java/org/apache/hudi/source/ScanContext.java | 227 -----
.../hudi/source/StreamReadMonitoringFunction.java | 9 +-
.../org/apache/hudi/source/StreamReadOperator.java | 2 +-
.../source/assign/DefaultHoodieSplitAssigner.java | 50 ++
.../hudi/source/assign/HoodieSplitAssigner.java | 20 +-
.../hudi/source/assign/HoodieSplitAssigners.java | 34 +-
.../source/assign/HoodieSplitBucketAssigner.java | 49 +
.../HoodieSplitNumberAssigner.java} | 29 +-
.../enumerator/AbstractHoodieSplitEnumerator.java | 12 +-
.../HoodieContinuousSplitEnumerator.java | 18 +-
.../enumerator/HoodieEnumeratorPosition.java | 43 +-
.../HoodieEnumeratorStateSerializer.java | 4 +-
.../enumerator/HoodieStaticSplitEnumerator.java | 4 +-
.../{HoodieBatchRecords.java => BatchRecords.java} | 30 +-
.../source/reader/DefaultHoodieBatchReader.java | 112 ---
.../hudi/source/reader/HoodieBatchReader.java | 44 -
.../hudi/source/reader/HoodieRecordEmitter.java | 4 +-
.../hudi/source/reader/HoodieSourceReader.java | 4 +-
.../source/reader/HoodieSourceSplitReader.java | 53 +-
.../reader/function/HoodieSplitReaderFunction.java | 36 +-
.../reader/function/SplitReaderFunction.java | 3 +-
.../source/split/DefaultHoodieSplitDiscover.java | 18 +-
.../source/split/DefaultHoodieSplitProvider.java | 72 +-
.../source/split/HoodieContinuousSplitBatch.java | 36 +-
.../hudi/source/split/HoodieSourceSplit.java | 42 +-
.../source/split/HoodieSourceSplitComparator.java | 57 ++
.../source/split/HoodieSourceSplitSerializer.java | 5 +
.../hudi/source/split/HoodieSourceSplitState.java | 18 +-
.../hudi/source/split/HoodieSplitProvider.java | 6 +-
.../hudi/source/split/SerializableComparator.java | 9 +-
.../org/apache/hudi/table/HoodieTableSink.java | 4 +-
.../org/apache/hudi/table/HoodieTableSource.java | 256 +++---
.../org/apache/hudi/table/format/FormatUtils.java | 1 -
.../table/format/mor/MergeOnReadInputFormat.java | 7 +
.../table/lookup/AsyncLookupFunctionWrapper.java | 87 ++
.../hudi/table/lookup/HoodieLookupFunction.java | 34 +-
.../table/lookup/LookupRuntimeProviderFactory.java | 19 +-
.../java/org/apache/hudi/util/CompactionUtil.java | 76 ++
.../java/org/apache/hudi/util/FileIndexReader.java | 240 +++++
.../org/apache/hudi/util/FlinkWriteClients.java | 3 +-
.../java/org/apache/hudi/util/StreamerUtil.java | 15 +
.../apache/hudi/sink/ITTestDataStreamWrite.java | 4 +-
.../org/apache/hudi/sink/TestWriteCopyOnWrite.java | 51 +-
.../org/apache/hudi/sink/TestWriteMergeOnRead.java | 75 ++
.../partitioner/index/TestRecordIndexCache.java | 116 ++-
.../index/TestRecordLevelIndexBackend.java | 80 +-
.../hudi/sink/utils/CompactFunctionWrapper.java | 8 +-
.../sink/utils/StreamWriteFunctionWrapper.java | 2 +-
.../org/apache/hudi/sink/utils/TestWriteBase.java | 2 +-
.../apache/hudi/source/TestHoodieScanContext.java | 718 +++++++++++++++
.../org/apache/hudi/source/TestHoodieSource.java | 718 +++++++--------
.../hudi/source/TestIncrementalInputSplits.java | 115 +++
.../org/apache/hudi/source/TestScanContext.java | 406 ---------
.../assign/TestDefaultHoodieSplitAssigner.java | 242 +++++
.../source/assign/TestHoodieSplitAssigners.java | 236 +++++
.../assign/TestHoodieSplitBucketAssigner.java | 293 ++++++
.../assign/TestHoodieSplitNumberAssigner.java | 191 ++++
.../TestHoodieContinuousSplitEnumerator.java | 101 +--
.../TestHoodieEnumeratorStateSerializer.java | 41 +-
.../TestHoodieStaticSplitEnumerator.java | 15 +-
.../apache/hudi/source/reader/TestBatchReader.java | 334 -------
.../hudi/source/reader/TestBatchRecords.java | 146 ++-
.../hudi/source/reader/TestDefaultBatchReader.java | 511 -----------
.../source/reader/TestHoodieRecordEmitter.java | 1 +
.../source/reader/TestHoodieSourceSplitReader.java | 353 +++-----
.../function/TestHoodieSplitReaderFunction.java | 97 +-
.../split/TestDefaultHoodieSplitDiscover.java | 28 +-
.../split/TestDefaultHoodieSplitProvider.java | 533 ++++++++++-
.../hudi/source/split/TestHoodieSourceSplit.java | 29 +-
.../split/TestHoodieSourceSplitComparator.java | 273 ++++++
.../split/TestHoodieSourceSplitSerializer.java | 43 +-
.../apache/hudi/table/ITTestHoodieDataSource.java | 65 +-
.../apache/hudi/table/ITTestSchemaEvolution.java | 24 +-
.../table/TestHoodieFileGroupReaderOnFlink.java | 2 +-
.../apache/hudi/table/TestHoodieTableSource.java | 95 +-
.../org/apache/hudi/util/TestFileIndexReader.java | 362 ++++++++
.../org/apache/hudi/utils/TestCompactionUtil.java | 128 ++-
.../test/java/org/apache/hudi/utils/TestData.java | 21 +-
.../test/java/org/apache/hudi/utils/TestUtils.java | 22 +-
.../gcp/transaction/lock/GCSStorageLockClient.java | 14 +-
.../apache/hudi/avro/HoodieAvroWriteSupport.java | 43 +-
.../hadoop/HoodieAvroFileReaderFactory.java | 6 +-
.../io/storage/hadoop/HoodieAvroParquetReader.java | 18 +-
...TestHoodieAvroWriteSupportVariantShredding.java | 508 -----------
.../common/bootstrap/index/TestBootstrapIndex.java | 2 +-
.../common/functional/TestHoodieLogFormat.java | 124 +--
.../hudi/common/table/TestHoodieTableConfig.java | 2 +-
.../hudi/common/table/TestTimelineUtils.java | 40 +
.../table/view/TestHoodieTableFileSystemView.java | 2 +-
.../hudi/common/testutils/HoodieTestTable.java | 10 +
.../common/util/collection/TestBitCaskDiskMap.java | 10 +-
.../util/collection/TestExternalSpillableMap.java | 18 +-
...oodieAvroFileWriterFactoryVariantShredding.java | 275 ------
.../hudi/metadata/TestHoodieTableMetadataUtil.java | 2 +-
.../schema/TestSchemaRepairEquivalence.java | 524 +++++------
.../hudi/hadoop/HiveHoodieReaderContext.java | 11 +-
.../HoodieFileGroupReaderBasedRecordReader.java | 3 +-
.../org/apache/hudi/hadoop/HoodieHiveRecord.java | 4 +-
.../hudi/hadoop/HoodieROTablePathFilter.java | 30 +-
.../apache/hudi/hadoop/SchemaEvolutionContext.java | 15 +-
.../hudi/hadoop/avro/HoodieAvroParquetReader.java | 5 +-
.../HoodieTimestampAwareParquetInputFormat.java | 11 +-
.../realtime/RealtimeCompactedRecordReader.java | 3 -
.../utils/HoodieArrayWritableSchemaUtils.java | 7 +-
.../apache/hudi/hadoop/TestHoodieHiveRecord.java | 12 +-
.../hudi/hadoop/TestHoodieROTablePathFilter.java | 189 ++++
.../hive/TestHoodieCombineHiveInputFormat.java | 15 +-
.../realtime/TestHoodieRealtimeRecordReader.java | 2 +-
.../hudi/hadoop/testutils/InputFormatTestUtil.java | 25 +-
.../testsuite/HoodieInlineTestSuiteWriter.java | 5 +-
.../testsuite/HoodieMultiWriterTestSuiteJob.java | 2 +-
.../hudi/integ/testsuite/HoodieTestSuiteJob.java | 8 +-
.../SparkDataSourceContinuousIngestTool.java | 3 +-
.../reader/DFSHoodieDatasetInputReader.java | 10 +-
.../testsuite/converter/TestDeleteConverter.java | 2 +-
.../testsuite/converter/TestUpdateConverter.java | 2 +-
.../reader/TestDFSHoodieDatasetInputReader.java | 6 +-
.../hudi/integ/testsuite/utils/TestUtils.java | 8 +-
.../hudi/common/util/HoodieExceptionUtil.java | 29 +-
.../connect/writers/BufferedConnectWriter.java | 4 +-
.../Dockerfile.presto | 8 +-
hudi-notebooks/build.sh | 10 +
.../conf/{trino => presto}/catalog/hive.properties | 7 +-
.../conf/{trino => presto}/catalog/hudi.properties | 11 +-
hudi-notebooks/docker-compose.yml | 11 +
hudi-spark-datasource/README.md | 83 +-
.../apache/hudi/internal/BaseDefaultSource.java | 49 -
.../hudi/internal/BaseWriterCommitMessage.java | 39 -
.../internal/DataSourceInternalWriterHelper.java | 111 ---
.../apache/hudi/spark/internal/DefaultSource.java | 66 --
.../HoodieBulkInsertDataInternalWriter.java | 79 --
.../HoodieBulkInsertDataInternalWriterFactory.java | 57 --
.../HoodieDataSourceInternalBatchWrite.java | 97 --
.../HoodieDataSourceInternalBatchWriteBuilder.java | 64 --
.../internal/HoodieDataSourceInternalTable.java | 87 --
.../spark/internal/HoodieWriterCommitMessage.java | 37 -
.../scala/org/apache/hudi/DataSourceOptions.scala | 23 +
.../org/apache/hudi/HoodieCreateRecordUtils.scala | 2 +-
.../hudi/HoodieHadoopFsRelationFactory.scala | 5 +-
.../org/apache/hudi/HoodieMergeOnReadRDDV2.scala | 11 +-
.../scala/org/apache/hudi/HoodieSchemaUtils.scala | 1 -
.../org/apache/hudi/HoodieSparkSqlWriter.scala | 5 +-
.../apache/hudi/MergeOnReadSnapshotRelation.scala | 2 +-
.../apache/hudi/SparkHoodieTableFileIndex.scala | 4 +-
.../org/apache/hudi/cdc/CDCFileGroupIterator.scala | 3 -
.../sql/catalyst/catalog/HoodieCatalogTable.scala | 16 +-
.../datasources/lance/SparkLanceReaderBase.scala | 83 +-
.../HoodieFileGroupReaderBasedFileFormat.scala | 13 +-
.../parquet/ParquetSchemaEvolutionUtils.scala | 3 +-
.../spark/sql/hudi/HoodieSqlCommonUtils.scala | 13 +
.../hudi/command/CreateHoodieTableCommand.scala | 2 +
.../spark/sql/hudi/command/SqlKeyGenerator.scala | 8 +-
.../hudi/command/payload/ExpressionPayload.scala | 9 +-
.../org/apache/hudi/TestHoodieSchemaUtils.java | 3 +-
.../procedures/PartitionBucketIndexManager.scala | 7 +-
.../TestHoodieClientOnCopyOnWriteStorage.java | 176 ++++
.../TestHoodieClientOnMergeOnReadStorage.java | 1 -
.../TestMetadataUtilRLIandSIRecordGeneration.java | 5 +-
.../hudi/functional/TestHoodieBackedMetadata.java | 72 +-
.../hudi/functional/TestHoodieFileSystemViews.java | 2 +-
.../functional/TestSparkSortAndSizeClustering.java | 4 +-
.../java/org/apache/hudi/io/TestMergeHandle.java | 2 +-
.../TestHoodieBulkInsertDataInternalWriter.java | 166 ----
.../TestHoodieDataSourceInternalBatchWrite.java | 346 --------
.../TestCopyOnWriteRollbackActionExecutor.java | 112 ++-
.../hudi/TestPartitionDirectoryConverter.scala | 2 +-
.../read/TestHoodieFileGroupReaderOnSpark.scala | 29 +-
.../apache/hudi/functional/TestCOWDataSource.scala | 2 +-
.../hudi/functional/TestLanceDataSource.scala | 118 ++-
.../spark/sql/avro/TestSchemaConverters.scala | 162 +++-
.../TestHoodiePruneFileSourcePartitions.scala | 136 ++-
.../common/TestSlashSeparatedPartitionValue.scala | 161 ++++
.../sql/hudi/dml/insert/TestInsertTable.scala | 104 ++-
.../sql/hudi/dml/schema/TestVariantDataType.scala | 300 +------
.../spark/sql/adapter/BaseSpark3Adapter.scala | 8 -
.../apache/spark/sql/adapter/Spark3_3Adapter.scala | 4 +-
.../apache/spark/sql/adapter/Spark3_4Adapter.scala | 4 +-
.../apache/spark/sql/adapter/Spark3_5Adapter.scala | 4 +-
.../spark/sql/adapter/BaseSpark4Adapter.scala | 37 +-
.../TestSpark4VariantShreddingProvider.java | 279 ------
.../apache/spark/sql/adapter/Spark4_0Adapter.scala | 4 +-
.../apache/spark/sql/avro/AvroDeserializer.scala | 23 +-
.../org/apache/hudi/hive/TestSparkSchemaUtils.java | 76 +-
.../apache/hudi/sync/common/HoodieSyncConfig.java | 4 +
.../hudi/sync/common/util/SparkSchemaUtils.java | 14 +-
hudi-utilities/pom.xml | 6 +
.../org/apache/hudi/utilities/HoodieCleaner.java | 5 +-
.../apache/hudi/utilities/HoodieClusteringJob.java | 5 +-
.../hudi/utilities/HoodieCompactionAdminTool.java | 5 +-
.../org/apache/hudi/utilities/HoodieCompactor.java | 5 +-
.../hudi/utilities/HoodieDataTableValidator.java | 12 +-
.../hudi/utilities/HoodieDropPartitionsTool.java | 12 +-
.../org/apache/hudi/utilities/HoodieIndexer.java | 5 +-
.../utilities/HoodieMetadataTableValidator.java | 11 +-
.../apache/hudi/utilities/HoodieRepairTool.java | 5 +-
.../hudi/utilities/HoodieSnapshotExporter.java | 9 +-
.../org/apache/hudi/utilities/HoodieTTLJob.java | 5 +-
.../hudi/utilities/HoodieWithTimelineServer.java | 4 +-
.../org/apache/hudi/utilities/TableSizeStats.java | 12 +-
.../org/apache/hudi/utilities/UtilHelpers.java | 30 +-
.../ingestion/HoodieIngestionMetrics.java | 2 +
.../multitable/HoodieMultiTableServicesMain.java | 5 +-
.../hudi/utilities/perf/TimelineServerPerf.java | 5 +-
.../utilities/schema/KafkaOffsetPostProcessor.java | 2 +-
.../hudi/utilities/sources/AvroKafkaSource.java | 7 +-
.../sources/GcsEventsHoodieIncrSource.java | 20 +-
.../hudi/utilities/sources/GcsEventsSource.java | 26 +-
.../hudi/utilities/sources/HiveIncrPullSource.java | 10 +-
.../hudi/utilities/sources/HoodieIncrSource.java | 29 +-
.../apache/hudi/utilities/sources/InputBatch.java | 20 +-
.../apache/hudi/utilities/sources/JdbcSource.java | 29 +-
.../hudi/utilities/sources/JsonKafkaSource.java | 7 +-
.../apache/hudi/utilities/sources/KafkaSource.java | 14 +-
.../hudi/utilities/sources/ProtoKafkaSource.java | 4 +-
.../hudi/utilities/sources/PulsarSource.java | 14 +-
.../sources/S3EventsHoodieIncrSource.java | 11 +-
.../sources/SnapshotLoadQuerySplitter.java | 28 +-
.../org/apache/hudi/utilities/sources/Source.java | 30 +-
.../hudi/utilities/sources/SqlFileBasedSource.java | 7 +-
.../apache/hudi/utilities/sources/SqlSource.java | 8 +-
.../utilities/sources/debezium/DebeziumSource.java | 17 +-
.../sources/helpers/CloudDataFetcher.java | 20 +-
.../sources/helpers/CloudObjectIncrCheckpoint.java | 18 +-
.../sources/helpers/CloudObjectMetadata.java | 19 +-
.../helpers/CloudObjectsSelectorCommon.java | 22 +-
.../sources/helpers/DatePartitionPathSelector.java | 23 +-
.../helpers/IncrSourceCloudStorageHelper.java | 8 +-
.../sources/helpers/IncrSourceHelper.java | 51 +-
.../utilities/sources/helpers/KafkaOffsetGen.java | 56 +-
.../sources/helpers/ProtoConversionUtil.java | 40 +-
.../hudi/utilities/sources/helpers/QueryInfo.java | 53 +-
.../utilities/sources/helpers/QueryRunner.java | 13 +-
.../sources/helpers/gcs/MessageBatch.java | 14 +-
.../sources/helpers/gcs/MessageValidity.java | 15 +-
.../sources/helpers/gcs/MetadataMessage.java | 15 +-
.../sources/helpers/gcs/PubsubMessagesFetcher.java | 12 +-
.../streamer/HoodieMultiTableStreamer.java | 5 +-
.../hudi/utilities/streamer/HoodieStreamer.java | 10 +-
.../utilities/streamer/HoodieStreamerMetrics.java | 10 +-
.../apache/hudi/utilities/streamer/StreamSync.java | 10 +-
.../deltastreamer/TestHoodieDeltaStreamer.java | 3 -
...TestHoodieDeltaStreamerSchemaEvolutionBase.java | 1 +
.../TestHoodieDeltaStreamerWithMultiWriter.java | 1 +
.../deltastreamer/TestSourceFormatAdapter.java | 3 +-
.../utilities/sources/TestHoodieIncrSource.java | 226 +++--
.../sources/helpers/TestIncrSourceHelper.java | 4 +-
.../sources/helpers/TestProtoConversionUtil.java | 22 +-
.../streamer/TestHoodieStreamerUtils.java | 46 +
.../testutils/ColStatsUpgradeTesting.java | 8 +-
.../utilities/testutils/UtilitiesTestBase.java | 3 +-
.../testutils/sources/AbstractBaseTestSource.java | 2 -
pom.xml | 10 +-
rfc/rfc-100/rfc-100.md | 238 ++---
rfc/rfc-99/rfc-99.md | 212 +++++
scripts/jacoco/generate_merged_coverage_report.sh | 46 +
551 files changed, 17340 insertions(+), 15696 deletions(-)
create mode 100644
hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/execution/datasources/SparkSchemaTransformUtils.scala
create mode 100644
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/utils/TestSparkValidatorUtils.java
create mode 100644
hudi-client/hudi-spark-client/src/test/scala/org/apache/spark/sql/execution/datasources/TestSparkSchemaTransformUtils.scala
create mode 100644 hudi-common/src/main/java/org/apache/hudi/HoodieVersion.java
delete mode 100644
hudi-common/src/main/java/org/apache/hudi/avro/AvroSchemaComparatorForRecordProjection.java
delete mode 100644
hudi-common/src/main/java/org/apache/hudi/avro/AvroSchemaComparatorForSchemaEvolution.java
create mode 100644
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForRecordProjection.java
rename
hudi-common/src/main/java/org/apache/hudi/{avro/AvroSchemaCompatibility.java =>
common/schema/HoodieSchemaCompatibilityChecker.java} (58%)
create mode 100644
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaProjectionChecker.java
create mode 100644
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaTypePromotion.java
delete mode 100644
hudi-common/src/main/java/org/apache/parquet/schema/AvroSchemaRepair.java
create mode 100644
hudi-common/src/main/java/org/apache/parquet/schema/HoodieSchemaRepair.java
create mode 100644
hudi-common/src/test/java/org/apache/hudi/TestHoodieVersion.java
delete mode 100644
hudi-common/src/test/java/org/apache/hudi/avro/AvroSchemaTestUtils.java
delete mode 100644
hudi-common/src/test/java/org/apache/hudi/avro/TestAvroSchemaComparatorForSchemaEvolution.java
delete mode 100644
hudi-common/src/test/java/org/apache/parquet/schema/TestAvroSchemaRepair.java
create mode 100644
hudi-common/src/test/java/org/apache/parquet/schema/TestHoodieSchemaRepair.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CleanHandler.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactCommitHandler.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactHandler.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionPlanHandler.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/MetadataCompactCommitHandler.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/MetadataCompactHandler.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/MetadataCompactionPlanHandler.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/HoodieScanContext.java
delete mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/ScanContext.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/assign/DefaultHoodieSplitAssigner.java
copy
hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/replication/HiveSyncGlobalCommit.java
=>
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/assign/HoodieSplitAssigner.java
(68%)
copy
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/CoalescingPartitioner.java
=>
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/assign/HoodieSplitAssigners.java
(56%)
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/assign/HoodieSplitBucketAssigner.java
copy
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/{enumerator/HoodieStaticSplitEnumerator.java
=> assign/HoodieSplitNumberAssigner.java} (52%)
rename
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/{HoodieBatchRecords.java
=> BatchRecords.java} (78%)
delete mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/DefaultHoodieBatchReader.java
delete mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/HoodieBatchReader.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/split/HoodieSourceSplitComparator.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/AsyncLookupFunctionWrapper.java
copy
hudi-common/src/main/java/org/apache/hudi/avro/processors/EnumTypeProcessor.java
=>
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/lookup/LookupRuntimeProviderFactory.java
(54%)
create mode 100644
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/FileIndexReader.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/TestHoodieScanContext.java
delete mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/TestScanContext.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/assign/TestDefaultHoodieSplitAssigner.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/assign/TestHoodieSplitAssigners.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/assign/TestHoodieSplitBucketAssigner.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/assign/TestHoodieSplitNumberAssigner.java
delete mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/reader/TestBatchReader.java
delete mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/reader/TestDefaultBatchReader.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/split/TestHoodieSourceSplitComparator.java
create mode 100644
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/util/TestFileIndexReader.java
delete mode 100644
hudi-hadoop-common/src/test/java/org/apache/hudi/avro/TestHoodieAvroWriteSupportVariantShredding.java
delete mode 100644
hudi-hadoop-common/src/test/java/org/apache/hudi/io/storage/hadoop/TestHoodieAvroFileWriterFactoryVariantShredding.java
copy hudi-common/src/main/java/org/apache/hudi/common/util/ThreadUtils.java =>
hudi-io/src/main/java/org/apache/hudi/common/util/HoodieExceptionUtil.java (54%)
copy docker/hoodie/hadoop/prestobase/etc/catalog/localfile.properties =>
hudi-notebooks/Dockerfile.presto (85%)
copy hudi-notebooks/conf/{trino => presto}/catalog/hive.properties (82%)
copy hudi-notebooks/conf/{trino => presto}/catalog/hudi.properties (85%)
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/internal/BaseDefaultSource.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/internal/BaseWriterCommitMessage.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/internal/DataSourceInternalWriterHelper.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/spark/internal/DefaultSource.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/spark/internal/HoodieBulkInsertDataInternalWriter.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/spark/internal/HoodieBulkInsertDataInternalWriterFactory.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/spark/internal/HoodieDataSourceInternalBatchWrite.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/spark/internal/HoodieDataSourceInternalBatchWriteBuilder.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/spark/internal/HoodieDataSourceInternalTable.java
delete mode 100644
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/spark/internal/HoodieWriterCommitMessage.java
delete mode 100644
hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/spark/internal/TestHoodieBulkInsertDataInternalWriter.java
delete mode 100644
hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/spark/internal/TestHoodieDataSourceInternalBatchWrite.java
create mode 100644
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestSlashSeparatedPartitionValue.scala
delete mode 100644
hudi-spark-datasource/hudi-spark4-common/src/test/java/org/apache/hudi/variant/TestSpark4VariantShreddingProvider.java
create mode 100644 rfc/rfc-99/rfc-99.md
create mode 100755 scripts/jacoco/generate_merged_coverage_report.sh