[jira] [Updated] (KAFKA-9385) Connect cluster: connector task repeat like a splitbrain cluster problem

kaikai.hou (Jira) Wed, 08 Jan 2020 02:45:40 -0800


     [ 
https://issues.apache.org/jira/browse/KAFKA-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


kaikai.hou updated KAFKA-9385:
------------------------------
    Description: 
I am using Debezium and have found a duplicate-task problem. 
[Jump|https://issues.redhat.com/browse/DBZ-1573?jql=key%20in%20watchedIssues()]

 

1. I push the Debezium image to our private image repository.

2. Deploy the Connect cluster with the following *Deployment Config*:
{code:yaml}
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftWebConsole
  creationTimestamp: '2019-10-14T07:45:41Z'
  generation: 29
  labels:
    app: debezium-test-cloud
  name: debezium-test-cloud
  namespace: test
  resourceVersion: '168496156'
  selfLink: >-
    /apis/apps.openshift.io/v1/namespaces/test/deploymentconfigs/debezium-test-cloud
  uid: 9f4f8f4d-ee56-11e9-a5a1-00163e0e008f
spec:
  replicas: 2
  selector:
    app: debezium-test-cloud
    deploymentconfig: debezium-test-cloud
  strategy:
    activeDeadlineSeconds: 21600
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
  template:
    metadata:
      annotations:
        openshift.io/generated-by: OpenShiftWebConsole
      creationTimestamp: null
      labels:
        app: debezium-test-cloud
        deploymentconfig: debezium-test-cloud
    spec:
      containers:
        - env:
            - name: BOOTSTRAP_SERVERS
              value: '192.168.100.228:9092'
            - name: GROUP_ID
              value: test-cloud
            - name: CONFIG_STORAGE_TOPIC
              value: base.test-cloud.config
            - name: OFFSET_STORAGE_TOPIC
              value: base.test-cloud.offset
            - name: STATUS_STORAGE_TOPIC
              value: base.test-cloud.status
            - name: CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE
              value: 'true'
            - name: CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE
              value: 'true'
            - name: CONNECT_PRODUCER_MAX_REQUEST_SIZE
              value: '20971520'
            - name: CONNECT_DATABASE_HISTORY_KAFKA_RECOVERY_POLL_INTERVAL_MS
              value: '1000'
            - name: HEAP_OPTS
              value: '-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0'
          image: 'registry.cn-hangzhou.aliyuncs.com/eshine/debeziumconnect:1.0.0.Beta2'
          imagePullPolicy: IfNotPresent
          name: debezium-test-cloud
          ports:
            - containerPort: 8083
              protocol: TCP
            - containerPort: 8778
              protocol: TCP
            - containerPort: 9092
              protocol: TCP
            - containerPort: 9779
              protocol: TCP
          resources:
            limits:
              cpu: 400m
              memory: 1Gi
            requests:
              cpu: 200m
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /kafka/config
              name: debezium-test-cloud-1
            - mountPath: /kafka/data
              name: debezium-test-cloud-2
            - mountPath: /kafka/logs
              name: debezium-test-cloud-3
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - emptyDir: {}
          name: debezium-test-cloud-1
        - emptyDir: {}
          name: debezium-test-cloud-2
        - emptyDir: {}
          name: debezium-test-cloud-3
  test: false
  triggers:
    - type: ConfigChange
status:
  availableReplicas: 2
  conditions:
    - lastTransitionTime: '2019-11-25T06:44:30Z'
      lastUpdateTime: '2019-11-25T06:44:44Z'
      message: replication controller "debezium-test-cloud-15" successfully rolled out
      reason: NewReplicationControllerAvailable
      status: 'True'
      type: Progressing
    - lastTransitionTime: '2019-12-31T10:06:23Z'
      lastUpdateTime: '2019-12-31T10:06:23Z'
      message: Deployment config has minimum availability.
      status: 'True'
      type: Available
  details:
    causes:
      - type: Manual
    message: manual change
  latestVersion: 15
  observedGeneration: 29
  readyReplicas: 2
  replicas: 2
  unavailableReplicas: 0
  updatedReplicas: 2
{code}
3. The Connect cluster runs in OpenShift as one service with two pods.

4. Observed behavior:

     a) task_connector_1_0 and task_connector_3_0 were running in PodA; 
task_connector_2_0 was running in PodB.

     b) Then the PodA console logged the error shown in the attachment 
"12_31_d8c7j_1.jpg":

        !12_31_d8c7j_1.jpg!

     c) Then a rebalance started.

     d) However, in PodB all three tasks (task_connector_1_0, task_connector_2_0, 
task_connector_3_0) were running, while PodA was still running 
task_connector_1_0 and task_connector_3_0.

     e) As a result, task_connector_1_0 and task_connector_3_0 were running in 
both pods at the same time, so the tasks were duplicated.
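
To make the end state in step 4 concrete, here is a minimal Python sketch that flags tasks present on more than one pod. The function name find_duplicated_tasks and the dictionary layout are hypothetical and used only for illustration. The per-pod task lists are copied from step d); this sketch does not query Kafka Connect, so in practice those lists would have to be collected from each worker's logs.

```python
from collections import defaultdict


def find_duplicated_tasks(running_by_pod):
    """Return {task: sorted list of pods} for every task seen on more than one pod.

    running_by_pod is a hypothetical mapping of pod name -> list of task names.
    """
    pods_by_task = defaultdict(set)
    for pod, tasks in running_by_pod.items():
        for task in tasks:
            pods_by_task[task].add(pod)
    return {
        task: sorted(pods)
        for task, pods in pods_by_task.items()
        if len(pods) > 1
    }


# Task placement as reported in step d) of the description.
observed = {
    "PodA": ["task_connector_1_0", "task_connector_3_0"],
    "PodB": ["task_connector_1_0", "task_connector_2_0", "task_connector_3_0"],
}

duplicates = find_duplicated_tasks(observed)
for task, pods in sorted(duplicates.items()):
    print(f"{task} is running on {', '.join(pods)}")
```

With the placement from step d), the sketch reports task_connector_1_0 and task_connector_3_0 as running on both pods, which matches the duplication described in step e).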

 

    


> Connect cluster: connector task repeat like a splitbrain cluster problem 
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-9385
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9385
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>            Reporter: kaikai.hou
>            Priority: Major
>         Attachments: 12_31_d8c7j_1.jpg
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
