Aditya Pratap Singh created GOBBLIN-2092:
--------------------------------------------

             Summary: `carbon get flow-configs` search facets not consistently 
working
                 Key: GOBBLIN-2092
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2092
             Project: Apache Gobblin
          Issue Type: Bug
            Reporter: Aditya Pratap Singh


The `carbon get flow-configs` (search) command seems to apply search 
facets inconsistently.  It's possible that the facet indices are correctly set up during flow 
creation, but are not properly maintained during flow update.  See this interaction:
{code:java}
$ carbon get flow-configs -f prod-lva1 -s war-oh-iceberg | jq -c . | wc -l
Searching [fabric: prod-lva1] for flow matching - (flow_group: None; flow_name: 
None; template_uri: None; proxy_user: None; source_identifier: war-oh-iceberg; 
destination_identifier: None; cron_schedule: None; run_immediately: None; 
owning_group: None; start: None; count: None)
      63

$ carbon get flow-configs -f prod-lva1 -s war-tl-iceberg | jq -c . | wc -l
Searching [fabric: prod-lva1] for flow matching - (flow_group: None; flow_name: 
None; template_uri: None; proxy_user: None; source_identifier: war-tl-iceberg; 
destination_identifier: None; cron_schedule: None; run_immediately: None; 
owning_group: None; start: None; count: None)
No flows found
       0
{code}
So (at least some) results for a sourceIdentifier of `war-oh-iceberg` do show, 
but none do for `war-tl-iceberg`.  That's incorrect, because when I instead 
search by user, there are at least two `war-tl-iceberg` flows in `prod-lva1`:
{code:java}
$ carbon get flow-configs -f prod-lva1 -u lyndarel | jq -c '{flowGroup: 
.id.flowGroup, flowName: .id.flowName, user: .properties."user.to.proxy", 
between: (.properties."gobblin.flow.sourceIdentifier" + " => " + 
.properties."gobblin.flow.destinationIdentifier")}' | grep tl-iceberg
Searching [fabric: prod-lva1] for flow matching - (flow_group: None; flow_name: 
None; template_uri: None; proxy_user: lyndarel; source_identifier: None; 
destination_identifier: None; cron_schedule: None; run_immediately: None; 
owning_group: None; start: None; count: None)
{"flowGroup":"iceberg_based_openhouse_replication_u_lyndarel","flowName":"copy_to_holdem_replication_course_features","user":"lyndarel","between":"war-tl-iceberg => holdem-tl-iceberg"}
{"flowGroup":"iceberg_based_openhouse_replication_u_lyndarel","flowName":"copy_to_holdem_replication_member_skill_gap","user":"lyndarel","between":"war-tl-iceberg => holdem-tl-iceberg"}
{code}
When the user and sourceId constraints are combined, those two no longer show up:
{code:java}
$ carbon get flow-configs -f prod-lva1 -u lyndarel -s war-tl-iceberg | jq -c '{flowGroup: 
.id.flowGroup, flowName: .id.flowName, user: .properties."user.to.proxy", 
between: (.properties."gobblin.flow.sourceIdentifier" + " => " + 
.properties."gobblin.flow.destinationIdentifier")}'
Searching [fabric: prod-lva1] for flow matching - (flow_group: None; flow_name: 
None; template_uri: None; proxy_user: lyndarel; source_identifier: 
war-tl-iceberg; destination_identifier: None; cron_schedule: None; 
run_immediately: None; owning_group: None; start: None; count: None)
No flows found {code}
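A rough way to list exactly which flows the facet search is dropping is to compare the two outputs client-side. The sketch below is only an illustration: it assumes each command's output was saved via `jq -c .` (one JSON object per line) and relies only on the `.id.flowGroup`, `.id.flowName`, and `.properties."gobblin.flow.sourceIdentifier"` fields used in the jq expressions above. The helper names are hypothetical.

```python
import json

SOURCE_KEY = "gobblin.flow.sourceIdentifier"


def load_json_lines(text):
    """Parse `jq -c .` output, skipping status lines such as 'No flows found'."""
    return [json.loads(line) for line in text.splitlines()
            if line.strip().startswith("{")]


def flow_id(cfg):
    """Identify a flow config by its (flowGroup, flowName) pair."""
    return (cfg["id"]["flowGroup"], cfg["id"]["flowName"])


def missing_from_facet(all_flows, faceted_flows, source_id):
    """Flows whose stored sourceIdentifier equals source_id but which the
    faceted (-s) search did not return."""
    returned = {flow_id(c) for c in faceted_flows}
    return sorted(
        flow_id(c) for c in all_flows
        if c.get("properties", {}).get(SOURCE_KEY) == source_id
        and flow_id(c) not in returned
    )
```

Feeding it the `-u lyndarel` output as `all_flows` and the `-u lyndarel -s war-tl-iceberg` output as `faceted_flows` should name the two flows above, if the stored properties and the facet index really disagree.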
The reason I suspect flow update as a possible root cause is that I had modified these 
two flows to use that sourceId, when they were originally created with another 
one, e.g. with something like:
{code:java}
$ carbon update flow -f prod-lva1 \
    -fg iceberg_based_openhouse_replication_u_lyndarel \
    -fn copy_to_holdem_replication_member_skill_gap \
    properties.gobblin.flow.sourceIdentifier=war-tl-iceberg,properties.gobblin.flow.destinationIdentifier=holdem-tl-iceberg
{code}
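To make the suspected failure mode concrete, here is a toy model of it. This is not Gobblin's actual implementation: `FlowCatalog`, its methods, and the `reindex_on_update` switch are all hypothetical, and `some-other-source` stands in for the original sourceId, which is not recorded here. It only shows how a facet index that is written on create but not on update would produce the symptoms above.

```python
from collections import defaultdict


class FlowCatalog:
    """Toy flow store with a secondary index on sourceIdentifier (hypothetical)."""

    def __init__(self, reindex_on_update):
        self.flows = {}                     # flow name -> current sourceIdentifier
        self.by_source = defaultdict(set)   # facet index: sourceIdentifier -> flow names
        self.reindex_on_update = reindex_on_update

    def create(self, name, source_id):
        self.flows[name] = source_id
        self.by_source[source_id].add(name)

    def update(self, name, source_id):
        old = self.flows[name]
        self.flows[name] = source_id
        if self.reindex_on_update:
            # Correct behavior: move the flow to its new index bucket.
            self.by_source[old].discard(name)
            self.by_source[source_id].add(name)
        # Suspected bug: otherwise the index keeps pointing at the old value.

    def search(self, source_id):
        """Faceted search: consults only the index, never the stored properties."""
        return sorted(self.by_source.get(source_id, set()))


buggy = FlowCatalog(reindex_on_update=False)
buggy.create("copy_to_holdem_replication_member_skill_gap", "some-other-source")
buggy.update("copy_to_holdem_replication_member_skill_gap", "war-tl-iceberg")
stale_new = buggy.search("war-tl-iceberg")      # empty: the flow is invisible under its new sourceId
stale_old = buggy.search("some-other-source")   # still lists the flow under its old sourceId
```

If this is what is happening, searching with `-s` set to the sourceId the flows were originally created with should still return them, which would be a quick way to confirm or rule out the hypothesis.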



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
