codope commented on code in PR #7929:
URL: https://github.com/apache/hudi/pull/7929#discussion_r1105412713
##########
website/docs/hoodie_deltastreamer.md:
##########
@@ -340,6 +388,26 @@ to trigger/processing of new or changed data as soon as it is available on S3.
 Insert code sample from this blog: https://hudi.apache.org/blog/2021/08/23/s3-events-source/#configuration-and-setup
+### GCS Events
+Google Cloud Storage (GCS) service provides an event notification mechanism which will post notifications when certain
+events happen in your GCS bucket. You can read more at [Pubsub Notifications](https://cloud.google.com/storage/docs/pubsub-notifications/).
+GCS will put these events in a Cloud Pubsub topic. Apache Hudi provides a GcsEventsSource that can read from Cloud Pubsub
+to trigger/processing of new or changed data as soon as it is available on GCS.
+
+#### Setup
+A detailed guide on [How to use the system](https://docs.google.com/document/d/1VfvtdvhXw6oEHPgZ_4Be2rkPxIzE0kBCNUiVDsXnSAA/edit#heading=h.tpmqk5oj0crt) is available.
+A high level overview of the same is provided below.
+1. Configure Cloud Storage Pubsub Notifications for the bucket. Follow Google’s documentation here: [https://cloud.google.com/storage/docs/reporting-changes](reporting changes)
+2. Create a Pubsub subscription corresponding to the topic
+3. Note the GCS Project Id, the GCS Subscription Id and use them for the following Hoodie configurations:
+   1. hoodie.deltastreamer.source.gcs.project.id=GCP_PROJECT_ID
+   2. hoodie.deltastreamer.source.gcs.subscription.id=SUSBCRIPTION_ID
+   3. Start the GcsEventsSource using the `HoodieDeltaStreamer` utility with --source-class parameter as
+   `org.apache.hudi.utilities.sources.GcsEventsSource` and hoodie.deltastreamer.source.cloud.meta.ack=true, and path related
+   configs as described in the detailed guide mentiond above.
+4. Start the GcsEventsSource using the `HoodieDeltaStreamer` utility with --source-class parameter as

Review Comment:
   Fixed. Thanks for pointing out.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org