[
https://issues.apache.org/jira/browse/PHOENIX-5521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Viraj Jasani updated PHOENIX-5521:
----------------------------------
Description:
Add an HBase coprocessor Endpoint hook that accepts a request from a remote
cluster. The request contains both the WALEdit's data and the WALKey's
annotated metadata, which tells the remote cluster which tenant_id, logical
table name, and timestamp the data is associated with.
Ideally the API's message format should be configurable / pluggable, and could
be either a protobuf or an Avro schema similar to the WALEdit-like one
described by PHOENIX-5443. Endpoints in HBase are structured to work with
protobufs, so some conversion may be necessary in an Avro-compatible version.
Future work may also extend this to any conforming schema given by a schema
service such as the one in PHOENIX-5443, which would be useful in allowing
PHOENIX-5442's CDC service to be used as a backup / migration tool.
The endpoint hook would take the metadata and data and regenerate a complete
set of Phoenix mutations, both data and indexes, just as the Phoenix client did
for the original SQL statement that generated the source-side edits. These
mutations would be written to the remote cluster by the normal Phoenix write
path.
HBASE-27529 provides a regionserver coproc hook to attach WAL extended
attributes to mutations at the replication sink. We can use this hook to
provide an end-to-end flow for Phoenix metadata attributes (tenant id, schema
name, logical table name, table type, etc.). The source cluster can attach the
metadata attributes to source mutations. With "phoenix.append.metadata.to.wal"
enabled, the attributes are appended to the WAL as extended attributes. With a
new regionserver coproc in Phoenix, built on HBASE-27529, the sink cluster can
attach the WAL extended attributes to Mutations. This way, IndexRegionObserver
and other coproc endpoints can read the Phoenix metadata attributes in both
the source and sink clusters.
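The sink-side step described above (copying WAL extended attributes onto the replayed Mutation) can be sketched roughly as follows. This is a toy model only: `ToyMutation`, `attachExtendedAttributes`, and the attribute names "tenantId" and "logicalTableName" are hypothetical stand-ins, not the HBase or Phoenix classes or the attribute keys Phoenix actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of the sink-side flow. All types here are stand-ins written for
 * illustration; none of them are HBase or Phoenix classes.
 */
public class SinkAttributeSketch {

    /** Stand-in for a mutation that carries named byte[] attributes. */
    static final class ToyMutation {
        private final Map<String, byte[]> attributes = new HashMap<>();

        void setAttribute(String name, byte[] value) {
            attributes.put(name, value);
        }

        byte[] getAttribute(String name) {
            return attributes.get(name);
        }
    }

    /** Copies every WAL extended attribute onto the mutation being replayed. */
    static void attachExtendedAttributes(Map<String, byte[]> walExtendedAttributes,
                                         ToyMutation mutation) {
        for (Map.Entry<String, byte[]> entry : walExtendedAttributes.entrySet()) {
            mutation.setAttribute(entry.getKey(), entry.getValue());
        }
    }

    public static void main(String[] args) {
        // Hypothetical attribute names and values, as they might arrive with a WAL entry.
        Map<String, byte[]> walAttributes = new HashMap<>();
        walAttributes.put("tenantId", "tenant1".getBytes(StandardCharsets.UTF_8));
        walAttributes.put("logicalTableName", "ORDERS".getBytes(StandardCharsets.UTF_8));

        ToyMutation mutation = new ToyMutation();
        attachExtendedAttributes(walAttributes, mutation);

        // A downstream coprocessor could now read the metadata from the mutation.
        System.out.println(new String(mutation.getAttribute("tenantId"), StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is only that the metadata travels with the edit, so sink-side coprocessors do not need to look it up again.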
Changes required to enable the replication sink coproc and allow it to attach
Phoenix metadata as Mutation attributes at the sink cluster:
# Add "org.apache.phoenix.coprocessor.ReplicationSinkEndpoint" to the
hbase.coprocessor.regionserver.classes config
# Set phoenix.append.metadata.to.wal = true
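As a rough illustration, the two settings above might look like this in hbase-site.xml. Placing both properties in hbase-site.xml is an assumption here, as is listing the endpoint as the only regionserver coprocessor; if other regionserver coprocessors are already configured, the class would be appended to the existing comma-separated value.

```xml
<!-- Sketch only: file placement and surrounding values are assumptions. -->
<property>
  <name>hbase.coprocessor.regionserver.classes</name>
  <value>org.apache.phoenix.coprocessor.ReplicationSinkEndpoint</value>
</property>
<property>
  <name>phoenix.append.metadata.to.wal</name>
  <value>true</value>
</property>
```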
was:
An HBase coprocessor Endpoint hook that takes in a request from a remote
cluster (containing both the WALEdit's data and the WALKey's annotated metadata
telling the remote cluster what tenant_id, logical tablename, and timestamp the
data is associated with).
Ideally the API's message format should be configurable / pluggable, and could
be either a protobuf or an Avro schema similar to the WALEdit-like one
described by PHOENIX-5443. Endpoints in HBase are structured to work with
protobufs, so some conversion may be necessary in an Avro-compatible version.
Future work may also extend this to any conforming schema given by a schema
service such as the one in PHOENIX-5443, which would be useful in allowing
PHOENIX-5442's CDC service to be used as a backup / migration tool.
The endpoint hook would take the metadata + data and regenerate a complete set
of Phoenix mutations, both data and indexes, just as the phoenix client did for
the original SQL statement that generated the source-side edits. These
mutations would be written to the remote cluster by the normal Phoenix write
path.
HBASE-27529 provides regionserver coproc hook to attach WAL extended attributes
to mutations at replication sink. We can utilize this hook and provide
end-to-end flow for Phoenix metadata attributes (tenant id, schema name,
logical table name, table type etc). The source cluster can attach the metadata
attributes to source mutations. By using "phoenix.append.metadata.to.wal", the
attributes can be appended to WAL in the form of extended attributes. By using
a new regionserver coproc in Phoenix, we can utilize HBASE-27529 and allow the
sink cluster to attach the WAL extended attributes to Mutations. This way,
IndexRegionObserver and other coproc endpoints would be able to get Phoenix
metadata attributes in both source and sink clusters.
The changes required to enable replication sink coproc, and allow it to attach
phoenix metadata as Mutation attributes at the Sink cluster:
# Add "org.apache.phoenix.coprocessor.ReplicationSinkEndpoint" to
hbase.coprocessor.regionserver.classes config
# phoenix.append.metadata.to.wal = true
# Use "CHANGE_DETECTION_ENABLED = true" for the given table
> Phoenix-level HBase Replication sink (Endpoint coproc)
> ------------------------------------------------------
>
> Key: PHOENIX-5521
> URL: https://issues.apache.org/jira/browse/PHOENIX-5521
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: Geoffrey Jacoby
> Assignee: Viraj Jasani
> Priority: Major
> Fix For: 5.2.0, 5.1.4