This is an automated email from the ASF dual-hosted git repository.

bhavanisudha pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new d283192def4 added link and command (#10293)
d283192def4 is described below

commit d283192def43a7bc9009db877933def237fec1c2
Author: Sagar Lakshmipathy <18vidhyasa...@gmail.com>
AuthorDate: Tue Dec 12 05:33:14 2023 -0800

    added link and command (#10293)
---
 website/docs/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
 .../version-0.12.0/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
 .../version-0.12.1/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
 .../version-0.12.2/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
 .../version-0.12.3/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
 .../version-0.13.0/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
 .../version-0.13.1/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
 .../version-0.14.0/syncing_aws_glue_data_catalog.md | 15 +++++++++++++++
 8 files changed, 120 insertions(+)

diff --git a/website/docs/syncing_aws_glue_data_catalog.md b/website/docs/syncing_aws_glue_data_catalog.md
index 3ab47deeab7..e54c6d52887 100644
--- a/website/docs/syncing_aws_glue_data_catalog.md
+++ b/website/docs/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
 ```shell
 --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
 ```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.12.0/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.12.0/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.12.0/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.12.0/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
 ```shell
 --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
 ```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.12.1/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.12.1/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.12.1/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.12.1/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
 ```shell
 --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
 ```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.12.2/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.12.2/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.12.2/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.12.2/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
 ```shell
 --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
 ```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.12.3/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.12.3/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.12.3/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.12.3/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
 ```shell
 --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
 ```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.13.0/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.13.0/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.13.0/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.13.0/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
 ```shell
 --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
 ```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.13.1/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.13.1/syncing_aws_glue_data_catalog.md
index 0d9075993ec..1228c0b21c4 100644
--- a/website/versioned_docs/version-0.13.1/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.13.1/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
 ```shell
 --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
 ```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file
diff --git a/website/versioned_docs/version-0.14.0/syncing_aws_glue_data_catalog.md b/website/versioned_docs/version-0.14.0/syncing_aws_glue_data_catalog.md
index 3ab47deeab7..e54c6d52887 100644
--- a/website/versioned_docs/version-0.14.0/syncing_aws_glue_data_catalog.md
+++ b/website/versioned_docs/version-0.14.0/syncing_aws_glue_data_catalog.md
@@ -16,3 +16,18 @@ be passed along.
 ```shell
 --sync-tool-classes org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool
 ```
+
+#### Running AWS Glue Catalog Sync for Spark DataSource
+
+To write a Hudi table to Amazon S3 and catalog it in AWS Glue Data Catalog, you can use the options mentioned in the
+[AWS documentation](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-hudi.html#aws-glue-programming-etl-format-hudi-write)
+
+#### Running AWS Glue Catalog Sync from EMR
+
+If you're running HiveSyncTool on an EMR cluster backed by Glue Data Catalog as external metastore, you can simply run the sync from command line like below:
+
+```shell
+cd /usr/lib/hudi/bin
+
+./run_sync_tool.sh --base-path s3://<bucket_name>/<prefix>/<table_name> --database <database_name> --table <table_name> --partitioned-by <column_name>
+```
\ No newline at end of file