[Impala-ASF-CR] WIP IMPALA-12156: Support High Availability for Statestore

2023-08-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20372 )

Change subject: WIP IMPALA-12156: Support High Availability for Statestore
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13798/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/20372
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
Gerrit-Change-Number: 20372
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 22 Aug 2023 05:09:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-12156: Support High Availability for Statestore

2023-08-21 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/20372 )

Change subject: WIP IMPALA-12156: Support High Availability for Statestore
..

WIP IMPALA-12156: Support High Availability for Statestore

To support statestore HA, we allow two statestored instances in an
Active-Passive HA pair to be added to an Impala cluster. We add
preemptive behavior to statestored. When HA is enabled, the preemptive
behavior allows the statestored with the higher priority to become
active, and the paired statestored becomes standby. The active
statestored acts as the owner of the Impala cluster and provides the
statestore service for the cluster members.
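
The priority rule described above can be sketched in a few lines of
Python. This is an illustration only, not code from the patch: the
Instance class, the integer priority field, and the "larger number
wins" convention are all assumptions, and the commit message does not
say how ties are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical stand-in for one statestored in the HA pair."""
    name: str
    priority: int  # assumption: a larger number means higher priority


def negotiate_roles(a, b):
    """Return a role for each instance; the higher priority becomes active."""
    if a.priority == b.priority:
        # Tie-breaking is not described in the commit message, so this
        # sketch refuses to guess.
        raise ValueError("priorities must differ in this sketch")
    active, standby = (a, b) if a.priority > b.priority else (b, a)
    return {active.name: "active", standby.name: "standby"}


roles = negotiate_roles(Instance("statestored-1", 10),
                        Instance("statestored-2", 5))
print(roles)
```

Swapping the two arguments gives the same assignment, which is the
property one would want from a negotiation run independently on both
peers.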

To enable statestore HA for a cluster, both statestoreds in the HA pair
and all subscribers must be started with the startup flag
"enable_statestored_ha".

- Define a new service for statestore HA.
- Negotiate the HA role with the peer statestore instance on startup.
- Create an HA monitor thread:
  the active statestored sends heartbeats to the standby statestored;
  the standby statestored monitors its peer's connection state.
- The standby statestored sends heartbeats to subscribers, requesting
  the connection state between the active statestored and each
  subscriber. The standby statestored saves these connection states
  for use as a failure detector.
- When the standby statestored loses its connection to the active
  statestored, it checks the saved connection states and takes over
  the active role if a majority of subscribers have lost their
  connections to the active statestored.
- The new active statestored sends an RPC notification to all
  subscribers announcing the new active statestored and the active
  catalogd elected by the new active statestored.
- The new active statestored starts sending heartbeats to its peer
  once it receives a handshake from that peer.
- Each subscriber registers with both statestoreds.
- Each subscriber reports its connection state for the inactive
  statestored.
- Each subscriber switches to the new active statestored and refuses
  topic updates from the standby statestored.
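
The takeover rule in the list above can be sketched as follows. This
is an illustration only, not code from the patch: the function name,
its parameters, the strict "more than half" reading of "majority", and
the choice to stay standby when no subscriber has reported are all
assumptions.

```python
def should_take_over(peer_reachable, lost_active_by_subscriber):
    """Decide whether the standby statestored should become active.

    peer_reachable: True while the standby can still reach the active peer.
    lost_active_by_subscriber: maps a subscriber id to True when that
        subscriber reported that it lost its connection to the active
        statestored.
    """
    if peer_reachable:
        # The active statestored is still visible, so nothing to do.
        return False
    if not lost_active_by_subscriber:
        # Assumption: with no reports there is no evidence for a takeover.
        return False
    lost = sum(1 for is_lost in lost_active_by_subscriber.values() if is_lost)
    # Assumption: "majority" means strictly more than half of the reports.
    return lost * 2 > len(lost_active_by_subscriber)


reports = {"impalad-1": True, "impalad-2": True, "catalogd": False}
print(should_take_over(False, reports))
```

Requiring both conditions (peer unreachable and a majority of
subscribers cut off) is what keeps a network problem between the two
statestoreds alone from triggering a takeover.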

Testing:
  TODO

Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
---
M be/generated-sources/gen-cpp/CMakeLists.txt
M be/src/catalog/catalog-server.cc
M be/src/common/global-flags.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/scheduling/admissiond-env.cc
M be/src/statestore/statestore-service-client-wrapper.h
M be/src/statestore/statestore-subscriber-catalog.cc
M be/src/statestore/statestore-subscriber-catalog.h
M be/src/statestore/statestore-subscriber-client-wrapper.h
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore-test.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M bin/start-impala-cluster.py
M common/thrift/StatestoreService.thrift
M common/thrift/metrics.json
M tests/common/impala_cluster.py
M tests/common/impala_service.py
21 files changed, 1,487 insertions(+), 87 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/20372/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
Gerrit-Change-Number: 20372
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] WIP IMPALA-12156: Support High Availability for Statestore

2023-08-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20372 )

Change subject: WIP IMPALA-12156: Support High Availability for Statestore
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/20372/2/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/20372/2/bin/start-impala-cluster.py@355
PS2, Line 355: def statestored_service_name(i):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/20372/2/bin/start-impala-cluster.py@791
PS2, Line 791: D
flake8: F602 dictionary key variable DEFAULT_STATESTORE_HA_SERVICE_PORT 
repeated with different values


http://gerrit.cloudera.org:8080/#/c/20372/2/bin/start-impala-cluster.py@792
PS2, Line 792: D
flake8: F602 dictionary key variable DEFAULT_STATESTORE_HA_SERVICE_PORT 
repeated with different values
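
For context, F602 fires when the same variable is used as a key more
than once in a single dict literal with different values. The snippet
below is a minimal sketch of that shape and of one possible fix. It
does not reproduce the code in bin/start-impala-cluster.py: the port
value, the dict contents, and the helper names are hypothetical.

```python
# Hypothetical value, chosen only for illustration.
DEFAULT_STATESTORE_HA_SERVICE_PORT = 12345


def ports_with_repeated_key():
    # flake8 reports F602 on both entries: same key variable, different
    # values. Python keeps only the last value, so one entry is lost.
    return {
        DEFAULT_STATESTORE_HA_SERVICE_PORT: "first statestored",
        DEFAULT_STATESTORE_HA_SERVICE_PORT: "second statestored",
    }


def ports_with_distinct_keys(num_instances=2):
    # One possible fix (an assumption): derive a distinct key per instance.
    return {
        DEFAULT_STATESTORE_HA_SERVICE_PORT + i: "statestored %d" % i
        for i in range(num_instances)
    }


repeated = ports_with_repeated_key()
distinct = ports_with_distinct_keys()
print(len(repeated), len(distinct))
```

The warning usually points at a copy-paste slip where the second key
was meant to be a different constant.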



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
Gerrit-Change-Number: 20372
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 22 Aug 2023 04:45:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-12156: Support High Availability for Statestore

2023-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20372 )

Change subject: WIP IMPALA-12156: Support High Availability for Statestore
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13765/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
Gerrit-Change-Number: 20372
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 16 Aug 2023 20:24:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-12156: Support High Availability for Statestore

2023-08-16 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20372 )


Change subject: WIP IMPALA-12156: Support High Availability for Statestore
..

WIP IMPALA-12156: Support High Availability for Statestore

This patch adds support for Statestore High Availability:
- Define a new service for statestore HA.
- Negotiate the HA role with the peer statestore instance on startup.
- Create an HA monitor thread:
  the active statestored sends heartbeats to the standby statestored;
  the standby statestored monitors its peer's connection state.
- The standby statestored sends heartbeats to subscribers, requesting
  the connection state between the active statestored and each
  subscriber. The standby statestored saves these connection states
  for use as a failure detector.
- When the standby statestored loses its connection to the active
  statestored, it checks the saved connection states and takes over
  the active role if a majority of subscribers have lost their
  connections to the active statestored.
- The new active statestored sends an RPC notification to all
  subscribers announcing the new active statestored and the active
  catalogd elected by the new active statestored.
- The new active statestored starts sending heartbeats to its peer
  once it receives a handshake from that peer.
- Each subscriber registers with both statestoreds.
- Each subscriber reports its connection state for the inactive
  statestored.
- Each subscriber switches to the new active statestored and refuses
  topic updates from the standby statestored.
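
The subscriber-side rules at the end of the list above can be sketched
as follows. This is an illustration only, not code from the patch: the
Subscriber class, its method names, and the string ids are
hypothetical.

```python
class Subscriber:
    """Hypothetical subscriber registered with both statestoreds."""

    def __init__(self, active_id, standby_id):
        self.active_id = active_id
        self.standby_id = standby_id

    def on_new_active(self, new_active_id):
        """Handle the notification sent by a newly active statestored."""
        if new_active_id == self.standby_id:
            # The former standby took over, so swap the two roles.
            self.active_id, self.standby_id = self.standby_id, self.active_id

    def accept_topic_update(self, sender_id):
        """Accept topic updates only from the current active statestored."""
        return sender_id == self.active_id


sub = Subscriber(active_id="statestored-1", standby_id="statestored-2")
before = (sub.accept_topic_update("statestored-1"),
          sub.accept_topic_update("statestored-2"))
sub.on_new_active("statestored-2")
after = (sub.accept_topic_update("statestored-1"),
         sub.accept_topic_update("statestored-2"))
print(before, after)
```

After the switch, updates from the old active statestored are refused,
which matches the last bullet once the old active has dropped to the
standby role.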

Testing:
  TODO

Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
---
M be/generated-sources/gen-cpp/CMakeLists.txt
M be/src/catalog/catalog-server.cc
M be/src/common/global-flags.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/scheduling/admissiond-env.cc
M be/src/statestore/statestore-service-client-wrapper.h
M be/src/statestore/statestore-subscriber-catalog.cc
M be/src/statestore/statestore-subscriber-catalog.h
M be/src/statestore/statestore-subscriber-client-wrapper.h
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore-test.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M bin/start-impala-cluster.py
M common/thrift/StatestoreService.thrift
M tests/common/impala_cluster.py
M tests/common/impala_service.py
20 files changed, 1,178 insertions(+), 85 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/20372/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
Gerrit-Change-Number: 20372
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 


[Impala-ASF-CR] WIP IMPALA-12156: Support High Availability for Statestore

2023-08-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20372 )

Change subject: WIP IMPALA-12156: Support High Availability for Statestore
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20372/1/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/20372/1/be/src/common/global-flags.cc@410
PS1, Line 410: DEFINE_int64(update_statestored_rpc_resend_interval_ms, 100, "(Advanced) Interval (in ms) "
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/20372/1/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/20372/1/be/src/statestore/statestore.cc@1705
PS1, Line 1705:   RETURN_IF_ERROR(Thread::Create("statestore-update-satestored", "update-statestored-thread",
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/20372/1/be/src/statestore/statestore.cc@1742
PS1, Line 1742: DebugAction(FLAGS_debug_actions, "STATESTORE_HA_HANDSHAKE_FIRST_ATTEMPT") :
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/20372/1/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/20372/1/bin/start-impala-cluster.py@634
PS1, Line 634:
flake8: E241 multiple spaces after ','



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd2c814bbad5c04c1d50c2edaa5b910c82a6fd87
Gerrit-Change-Number: 20372
Gerrit-PatchSet: 1
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 16 Aug 2023 19:59:46 +
Gerrit-HasComments: Yes