[
https://issues.apache.org/jira/browse/GOBBLIN-1703?focusedWorklogId=807229&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-807229
]
ASF GitHub Bot logged work on GOBBLIN-1703:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 09/Sep/22 00:16
Start Date: 09/Sep/22 00:16
Worklog Time Spent: 10m
Work Description: ZihanLi58 commented on code in PR #3550:
URL: https://github.com/apache/gobblin/pull/3550#discussion_r966518471
##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/scheduler/GobblinServiceJobScheduler.java:
##########
@@ -335,7 +335,9 @@ public AddSpecResponse onAddSpec(Spec addedSpec) {
if (quotaManager.isPresent()) {
// QuotaManager has idempotent checks for a dagNode, so this check
won't double add quotas for a flow in the DagManager
try {
+ // todo : we should probably check quota for all of the start nodes
Review Comment:
I think quota is at flow level but not job level? If that's the case,
checking once should be enough?
##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/DagManager.java:
##########
@@ -904,7 +905,11 @@ private void submitJob(DagNode<JobExecutionPlan> dagNode) {
// Run this spec on selected executor
SpecProducer<Spec> producer;
try {
- quotaManager.checkQuota(dagNode);
+ if
(!Boolean.parseBoolean(dagNode.getValue().getJobSpec().getConfigAsProperties().getProperty(
+ ServiceConfigKeys.GOBBLIN_SERVICE_ADHOC_FLOW, "false"))) {
+ quotaManager.checkQuota(dagNode);
Review Comment:
Wondering do you want to put the flow id into running map of quota manager
without increasing quota even it's adhoc flow? I feel it's necessary or for
the next job in the dag, when you call checkQuota, you will increase the quota
again? Am I missing anything here?
##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/scheduler/GobblinServiceJobScheduler.java:
##########
@@ -332,10 +332,13 @@ public AddSpecResponse onAddSpec(Spec addedSpec) {
// Check quota limits against run immediately flows or adhoc flows before
saving the schedule
// In warm standby mode, this quota check will happen on restli API layer
when we accept the flow
if (!this.warmStandbyEnabled &&
(!jobConfig.containsKey(ConfigurationKeys.JOB_SCHEDULE_KEY) ||
PropertiesUtils.getPropAsBoolean(jobConfig,
ConfigurationKeys.FLOW_RUN_IMMEDIATELY, "false"))) {
+ // This block should be reachable only for the first execution for the
adhoc flows (flows that either do not have a schedule or have
runImmediately=true.
Review Comment:
Just a reminder: As on line 334 we checked whether this.warmStandbyEnabled
should set to be false, this will only be reached in the old mode, where we
didn't enable message forwarding.
You should check quota on the compiler onAddSpec method when we enable
message forwarding
##########
gobblin-service/src/main/java/org/apache/gobblin/service/modules/orchestration/MysqlUserQuotaManager.java:
##########
@@ -83,15 +83,15 @@ int getCount(String name, CountType countType) throws
IOException {
}
/**
- * Creating an instance of StateStore.
+ * Creating an instance of MysqlQuotaStore.
*/
protected MysqlQuotaStore createQuotaStore(Config config) throws IOException
{
- String stateStoreTableName = ConfigUtils.getString(config,
ConfigurationKeys.STATE_STORE_DB_TABLE_KEY,
- ConfigurationKeys.DEFAULT_STATE_STORE_DB_TABLE);
+ String quotaStoreTableName = ConfigUtils.getString(config,
ServiceConfigKeys.QUOTA_STORE_DB_TABLE_KEY,
+ ServiceConfigKeys.DEFAULT_QUOTA_STORE_DB_TABLE);
BasicDataSource basicDataSource = MysqlStateStore.newDataSource(config);
Review Comment:
In this way, you will use "state.store.db.url" as the key to mysql db, which
is too general. I will suggest to check how mysqlSpecStore handle that, we
should have a config prefix to get the config related to this mysql quota
manager.
Issue Time Tracking
-------------------
Worklog Id: (was: 807229)
Remaining Estimate: 0h
Time Spent: 10m
> avoid double quota usage increment for ad hoc flows
> ---------------------------------------------------
>
> Key: GOBBLIN-1703
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1703
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Arjun Singh Bora
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a gaas flow request comes, resource handler checks the quota right there.
> However, if the flow has runImmediately=true, the quota will be checked again
> when the first job starts. This should be avoided.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)