This is an automated email from the ASF dual-hosted git repository.
ethanfeng pushed a change to branch serverless-spark/release-0.5.4-1.1
in repository https://gitbox.apache.org/repos/asf/celeborn.git
was b35fb5072 [CELEBORN-1792] MemoryManager resume should use
pinnedDirectMemory instead of usedDirectMemory
This change permanently discards the following revisions:
discard b35fb5072 [CELEBORN-1792] MemoryManager resume should use
pinnedDirectMemory instead of usedDirectMemory
discard 4b59ef45f fix #64878410 unify-useridentifier
discard 9707c3ea3 fix #64873863 disable exceed congestion control high
watermark cause worker high workload
discard e57c9b6bb fix #0 format code
discard 79d51bcd5 to 63521722 update helm
discard b2d5ccc27 fix #64570079 scale down should align with cluster expect
scale down number
discard dfee7f00f fix compile
discard 2e1cddab1 Bump 0.5.4
discard ae7bfa17a [CELEBORN-1879] Ignore invalid chunk range generated by
splitSkewedPartitionLocations
discard 8f443e03a [CELEBORN-1885] Fix nullptr exceptions in FetchChunk after
worker restart
discard 4bc41addd [CELEBORN-1883] Replace HashSet with
ConcurrentHashMap.newKeySet for ShuffleFileGroups
discard 7db7f4ad2 [CELEBORN-1757] Add retry when sending RPC to
LifecycleManager
discard c7ee2e7cb [CELEBORN-1865] Update master endpointRef when master leader
is abnormal
discard 2bcad3fed [fix #64528553] shouldn't tracking hardsplit batch Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/20893917
discard cf3fd377d to #63521717 1. memory available check serving state 2. add
new config to pause memory storage from using by memory pressure
discard 6f9ef025d fix#63759651 disable expire app when quota excced
discard 356670467 to 63295277 fix CI error caused by fetchHandler update
discard 748c1394e to 63295277 handle open stream synchronous to asynchronous
discard ef382f88a [CELEBORN-1831] Add ratis commitIndex metrics
discard ceebc4a30 #63675918 memory manager supports dynamic configs
discard ba4893f3b [CELEBORN-1850] Setup worker endpoint after initalizing
controller
discard bdc9ef7ac [CELEBORN-1838] Interrupt spark task should not report fetch
failure
discard 94d1e0b1d [CELEBORN-1701][FOLLOWUP] Support stage rerun for shuffle
data lost
discard ffad317a0 [fix#63334141] skip soft split when push merge data Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/20380569
discard 492577a2c [fix# 63302560] fix quota format
discard aed2dd1ee [CELEBORN-1721][FOLLOWUP] Fix the problem of getting
partition location in ShuffleClientImpl during soft split
discard a32df8053 to #63045080 change proxy enable config item
discard 40e29f4fe to #62556294 update exist region and zone for cluster info
items
discard ad5d23bc7 [CELEBORN-1818] Fix incorrect timeout exception when waiting
on no pending writes
discard 3e2bde086 to #61780839 update docker file
discard 3bae14dc1 fix #62531036 Fix proxy selection uts
discard 2c6018175 to #60512610 Refine selection logic
discard 86ec09a92 [CELEBORN-INNER] support skew shuffle rerun stage
discard fef98f22e [CELEBORN-INNER] important keep compatiable with inner proto
discard 9aea98a98 [CELEBORN-INNER] fix ut compile
discard 628bc3ff6 [CELEBORN-1319] Optimize skew partition logic for Reduce
Mode to avoid sorting shuffle files
discard 99cc9d219 to #62097184 adapt EMR-ZONE environment
discard dc4bdbeca to #61830366 extend proxy apis
discard 778fd027c [CELEBORN-1763] Fix DataPusher be blocked for a long time
discard 4c76d1de6 [CELEBORN-1783] Fix Pending task in commitThreadPool wont be
canceled
discard 6f5bce946 [CELEBORN-1782] Worker in congestion control should be in
blacklist to avoid impact new shuffle
discard 9e5c07531 [CELEBORN-1500] Filter out empty InputStreams
discard 56206952a [CELEBORN-1743] Resolve the metrics data interruption and
the job failure caused by locked resources
discard 561abc161 [CELEBORN-INNER] fix ut
discard a4201a661 [CELEBORN-INNER] disable inner authentication
discard 6bb792c0c [CELEBORN-INNER] change encryption enabled to false
discard 5cbfda3ae [CELEBORN-INNER] disable replicate in ut
discard cbe1c9e22 [CELEBORN-INNER] change test log level to info.
discard 00e57560e [CELEBORN-INNER] fix SortBasedShuffleWriterSuiteJ.
discard b010e2944 [CELEBORN-INNER] fix SortBasedPusherSuiteJ.
discard 5c7abe748 [CELEBORN-INNER] fix APPFlowController when tenantId is null
discard d549cf1dc [CELEBORN-1721][CIP-12] Support HARD_SPLIT in PushMergedData
discard 9c09eb4bd [CELEBORN-1510] Partial task unable to switch to the replica
discard 39c09579a Revert "[CELEBORN-1376] Push data failed should always
release request body"
discard abc1365cc [CELEBORN-1770] FlushNotifier should setException for all
Throwables in Flusher
discard 07058ad04 to #61780839 update master client for celeborn proxy
discard 9652082a6 fix#61780839 Celeborn proxy adapt with Celeborn 0.5.0
discard 408bdf9f6 [CELEBORN-INNER] fix worker index
discard c3cdff580 [CELEBORN-INNER] fix worker index
discard c2de596d7 [CELEBORN-INNER] Support sbt compile inner flink1.17
discard 2645eb0c7 [CELEBORN-INNER] fix appFlowController get tenantId
discard e8a1e2e68 [CELEBORN-INNER] format code
discard 4e2086d0b [CELEBORN-INNER] pass worker ut
discard 75463d7d7 [CELEBORN-INNER] fix master quota/scale ut
discard f75ed161f [CELEBORN-INNER] fix port conflict ut and add new
configuration for constant port
discard f215d24db [CELEBORN-INNER] fix pb resourceConsumption ut
discard 1a16df327 [CELEBORN-INNER] fix reader &ut
discard f9b5f4507 [CELEBORN-INNER] update configuration doc
discard 005de7034 [CELEBORN-INNER] fix sbt for inner
discard 4e0d3578a [fix #61569447] Tenant quota check should according to
MetaSystem Link: https://code.alibaba-inc.com/soe/celeborn/codereview/19529343
* [fix #61569447] Tenant quotacheck should according to MetaSystem
discard 6f3eddeb2 fix 61414176 add dispatcher threads
discard 659c5bcbb [CELEBORN-1769] Fix packed partition location cause
GetReducerFileGroupResponse lose location
discard 861b6a445 [CELEBORN-1765] Fix NPE when removeFileInfo in StorageManager
discard dfa3821f3 [CELEBORN-1760] OOM causes disk buffer unable to be released
discard 3660bc4e6 build arm image in jenkins
discard 6dcc83014 [CELEBORN-INNER] Support decompress parameter for shuffle
reader Link: https://code.alibaba-inc.com/soe/celeborn/codereview/19036201
discard 624db7d85 [CELEBORN-INNER] Add amend shuffle configuration
discard aa789bdad [CELEBORN-INNER] fix compression
discard 9510ac0c2 [CELEBORN-INNER] fix flink use retry client
discard 1202ea5ce [CELEBORN-INNER] Support build client for spark4
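This code review mainly covers minor optimizations and adaptive adjustments
to several modules of the Apache Celeborn project, including changing the
optional-value matching logic in Scala code, correcting copyright and license
files, and adding module configuration for Spark 4.0. It also optimizes
dependency management and the shading of Netty-related libraries to improve
cross-version compatibility. Link: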
https://code.alibaba-inc.com/soe/celeborn/codereview/18742342 *
[CELEBORN-INNER] Support build client for spark4
discard d97e45835 [CELEBORN-INNER] fix ut
discard 3ece1f408 Allow all compression algorithm for flink 1.17
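This code review mainly adds support for buffer compression by introducing
the abstract method `getBufferCompressor` and using `CompressionCodec`. It
implements version-specific compression handling for different Flink
versions, so that Flink 1.14 and 1.15 support only the LZ4 compression
algorithm while other versions offer a more flexible choice of compression,
and it adjusts the components that depend on this compression logic. Link: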
https://code.alibaba-inc.com/soe/celeborn/codereview/18671303
discard 0f065e900 [CELEBORN-INNER][fix #60023899] Check SPARK_LOCAL_IP, if
exists use as Celeborn Local Ip Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/18572679 *
[CELEBORN-INNER] Check SPARK_LOCAL_IP, if exists use as Celeborn Local Ip
discard 45547a073 rpc port bind to 0.0.0.0
discard 6ce08a32d Add http endpoint for liveness probe
discard 022e6ccb0 [CELEBORN-INNER] Fix revise/delete app, add
revise/delete/refresh conf restapi
discard dd8cdfa60 Amend lost shuffles
discard 5214885b4 [CELEBORN-INNER] update doc
discard 24cb83222 [CEELBORN-INNER] if workerHostIDNSEnabled, use getHostName
as hostname instead of fqdn hostname getted by getCanonicalHostName Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/14356758
discard 03fdf9a1e [CELEBORN-1586] Add available workers Metrics
discard 12d156258 [CELEBORN-INNER] Fix flink compile problem.
discard e3e9d1e96 [CELEBORN-INNER] add appid in checkquota Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/17969146
discard ddd4c285c [CELEBORN-INNER][Fix 59287953] Support early quota warning
Link: https://code.alibaba-inc.com/soe/celeborn/codereview/18106758
discard 45cd6f03a Add metrics for master meta usable space and worker recover
usable space
discard 2bee12a15 Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/17993511
discard 017e6f2f5 Migrate worker meta when old dir exists
discard 87d93334e [CELEBORN-INNER] Improve resource consumption and fix
compute application resource consumption. Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/17924384
discard 5f06787e0 [CELEBORN-INNER] Only reformat code
discard 2449b09ba Support migrate recovery path when work start
discard cc74c0203 [CELEBORN-INNER] Fix Compatiable problem for internal port
discard 6acd3332e [CELEBORN-INNER] improve quota manager
discard 2c71bf565 [CELEBORN-INNER] bind to 0.0.0.0 for worker http server
discard 4d76db30f [CELEBORN-INNER] bind to 0.0.0.0 for http server
discard 20174ab7e [CELEBORN-INNER] fix resourceComsuption compute
discard dda8e82fe [CELEBORN-INNER] Fix database password encrypt problem
discard ffb8d030d [CELEBORN-INNER] Support verify worker use product tenant
discard 07ee9d782 [CELEBORN-INNER] Fix merge problem/db/idns/proto
discard e28bf84db [CELEBORN-INNER] Align configuration name
discard 30db7997a [CELEBORN-INNER] fix unregister shuffle
discard 1cdb9d898 [CELEBORN-INNER] support scale up/scale down when workers
not all ready. Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/17554893
discard c0845ad5e [CELEBORN-INNER] fix compatiable problem for
MasterNotLeaderException
discard a34f2fc7f [CELEBORN-INNER] build docker base image Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/17472757
discard a6199be77 [CELEBORN-INNER] add network metrics. Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/17431342 *
[CELEBORN-INNER] support network metrics
discard 83d198fe9 [CELEBORN-INNER] remove system.println sts path
discard 482857a41 [CELEBORN-INNER] Support auto scale. Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/16988043 *
[CELEBORN-INNER][FOLLOWUP] Support auto scale
discard f2fd79893 [CELEBORN-INNER] Support Celeborn auto scale. Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/16587393
discard dbabc3d6c [CELEBORN-INNER][fix #52949890]Support expire app data when
app quota exceed Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/14848880 *
[CELEBORN-INNER][fix #52949890]Support expire app data when app quota exceed
discard 8c3d08f4f [CELEBORN-INNER] Fix jmfailover, worker status Serialize.
discard c638c95b2 [CELEBORN-INNER] Fix jmfailover by replacing
filesystem.getStatus api
discard 9f10ccfb0 [CELEBORN-INNER] Support rass. Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/16598300
discard e302eed36 [CELEBORN-INNER] Remove add metrics when checkQuota.
discard d715fc2d1 [CELEBORN-INNER] remove customClusterId from idns if
hostnameWithClusterId is false Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/16272420
discard cb142041b [CELEBORN-INNER] modify storagetype with WorkerInfo
discard cb8a9a97e Change register Stream log level from info to debug.
discard 3955bbc92 [CELEBORN-INNER] Add custom cluster id to worker host
discard d1deed4f5 [CELEBORN-INNER] Improve legacy message decode, change
loglevel to DEBUG.
discard 4ce8fd5ad [CELEBORN-INNER] Add haclient.MasterNotLeaderException for
compatiable between 0.4 worker to 0.3 master and 0.3 client to 0.4 master
discard 558c69428 [CELEBORN-INNER] Resource.proto change
highWorkload/storageType to optional for upgrade master from 0.3.0 to 0.4.0
discard 8a9b9de06 [CELEBORN-INNER] Change registerStream loglevel from debug
to info.
discard 58dc5605d [CELEBORN-INNER] Add log when status system apply
transaction log encounters exception.
discard 6b3e77d66 [CELEBORN-INNER] Fix compatible issues between worker and
master from 0.3.0-0.4.0.
discard e7f2b86e5 [CELEBORN-INNER] Fix compatible issues with client of 0.3.0.
discard 985ab95fb [CELEBORN-INNER] jobManager fine-grained failover
discard 3e506b6d1 [CELEBORN-INNER] Fix conf & improve log Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/14584987
discard 271197a0a [CELEBORN-INNER] Support encrypted password for database
connection. Link: https://code.alibaba-inc.com/soe/celeborn/codereview/14415218
discard e104c5532 [CELEBORN-INNER] Support dynamic APPFlowController at worker
side Link: https://code.alibaba-inc.com/soe/celeborn/codereview/14510910 *
[CELEBORN-INNER] Support dynamic APPFlowController at worker side
discard 67164a1f7 [CELEBORN-INNER] Support Worker level throttle for Single
Tenant App
discard 6d23f91a1 [CELEBORN-INNER] Retry bind host to wait DNS reconcile Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/14462124
discard a78f564e8 [CELEBORN-INNER] support io encryption Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/14268840
discard 07688a3e8 [CELEBORN-INNER] Support QuotaManager Link:
https://code.alibaba-inc.com/soe/celeborn/codereview/14114182
discard 97c2368fa [CELEBORN-INNER] Support Authentication
discard f80532aa0 [CELEBORN-INNER] Log Exception when master initialize failed
Link: https://code.alibaba-inc.com/soe/celeborn/codereview/14357663
discard 681cd2877 [CELEBORN-INNER] Always reponse
HeartbeatFromApplicationResponse
discard eeed6268e [CELEBORN-INNER]Filter network excetpion log by ip prefix
discard e81ae5e21 [CELEBORN-INNER] Change to inner 1.17 version
discard 21264a525 [FLINK] remove uid and gid from dockerfile.inner
Link: https://code.alibaba-inc.com/soe/celeborn/codereview/12260282