[jira] [Updated] (HUDI-8068) Hook up source partitions to s3 incr source

2024-08-09 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-8068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-8068: -- Priority: Trivial (was: Major) > Hook up source partitions to s3 incr source >

[jira] [Created] (HUDI-8068) Hook up source partitions to s3 incr source

2024-08-09 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-8068: - Summary: Hook up source partitions to s3 incr source Key: HUDI-8068 URL: https://issues.apache.org/jira/browse/HUDI-8068 Project: Apache Hudi Issue Type: I

[jira] [Created] (HUDI-7940) Pass metrics to ErrorTableWriter to be able to emit metrics for Error Table

2024-06-29 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7940: - Summary: Pass metrics to ErrorTableWriter to be able to emit metrics for Error Table Key: HUDI-7940 URL: https://issues.apache.org/jira/browse/HUDI-7940 Project: Ap

[jira] [Assigned] (HUDI-7855) Add ability to dynamically configure write parallelism for BULK_INSERT for HoodieStreamer

2024-06-10 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7855: - Assignee: Rajesh Mahindra > Add ability to dynamically configure write parallelism for BU

[jira] [Created] (HUDI-7855) Add ability to dynamically configure write parallelism for BULK_INSERT for HoodieStreamer

2024-06-10 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7855: - Summary: Add ability to dynamically configure write parallelism for BULK_INSERT for HoodieStreamer Key: HUDI-7855 URL: https://issues.apache.org/jira/browse/HUDI-7855

[jira] [Created] (HUDI-7816) Pass the source profile to the snapshot query splitter

2024-05-30 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7816: - Summary: Pass the source profile to the snapshot query splitter Key: HUDI-7816 URL: https://issues.apache.org/jira/browse/HUDI-7816 Project: Apache Hudi Is

[jira] [Created] (HUDI-7606) Ensure that rdds persisted by table services are released in SparkRDDWriteClient

2024-04-11 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7606: - Summary: Ensure that rdds persisted by table services are released in SparkRDDWriteClient Key: HUDI-7606 URL: https://issues.apache.org/jira/browse/HUDI-7606 Projec

[jira] [Assigned] (HUDI-7606) Ensure that rdds persisted by table services are released in SparkRDDWriteClient

2024-04-11 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7606: - Assignee: Rajesh Mahindra > Ensure that rdds persisted by table services are released in

[jira] [Created] (HUDI-7517) Add ability to reset the checkpoint for kafka source

2024-03-19 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7517: - Summary: Add ability to reset the checkpoint for kafka source Key: HUDI-7517 URL: https://issues.apache.org/jira/browse/HUDI-7517 Project: Apache Hudi Issu

[jira] [Closed] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-19 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra closed HUDI-7418. - Resolution: Fixed > Implement file extension filter for s3 incr source > -

[jira] [Updated] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7418: -- Description: We have support for filter the input files based on an extension (custom) for GCS I

[jira] [Updated] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7418: -- Description: We have support for filter the input files based on an extension (custom) that can

[jira] [Updated] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7418: -- Priority: Trivial (was: Major) > Implement file extension filter for s3 incr source > -

[jira] [Created] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7418: - Summary: Implement file extension filter for s3 incr source Key: HUDI-7418 URL: https://issues.apache.org/jira/browse/HUDI-7418 Project: Apache Hudi Issue

[jira] [Updated] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7418: -- Sprint: Sprint 2023-03-28 > Implement file extension filter for s3 incr source > ---

[jira] [Assigned] (HUDI-7418) Implement file extension filter for s3 incr source

2024-02-18 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7418: - Assignee: Rajesh Mahindra > Implement file extension filter for s3 incr source >

[jira] [Updated] (HUDI-7381) Compaction not filling in stats for create and upsert time

2024-02-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7381: -- Priority: Minor (was: Major) > Compaction not filling in stats for create and upsert time > ---

[jira] [Created] (HUDI-7381) Compaction not filling in stats for create and upsert time

2024-02-04 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7381: - Summary: Compaction not filling in stats for create and upsert time Key: HUDI-7381 URL: https://issues.apache.org/jira/browse/HUDI-7381 Project: Apache Hudi

[jira] [Assigned] (HUDI-7381) Compaction not filling in stats for create and upsert time

2024-02-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7381: - Assignee: Rajesh Mahindra > Compaction not filling in stats for create and upsert time >

[jira] [Created] (HUDI-7161) Add commit action type and extra metadata to write callback on commit message

2023-11-29 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7161: - Summary: Add commit action type and extra metadata to write callback on commit message Key: HUDI-7161 URL: https://issues.apache.org/jira/browse/HUDI-7161 Project:

[jira] [Assigned] (HUDI-7161) Add commit action type and extra metadata to write callback on commit message

2023-11-29 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7161: - Assignee: Rajesh Mahindra > Add commit action type and extra metadata to write callback

[jira] [Assigned] (HUDI-7138) Fix instantiation issues with ErrorTableWriter and Schema Registry Provider

2023-11-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7138: - Assignee: Rajesh Mahindra > Fix instantiation issues with ErrorTableWriter and Schema Reg

[jira] [Created] (HUDI-7138) Fix instantiation issues with ErrorTableWriter and Schema Registry Provider

2023-11-25 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7138: - Summary: Fix instantiation issues with ErrorTableWriter and Schema Registry Provider Key: HUDI-7138 URL: https://issues.apache.org/jira/browse/HUDI-7138 Project: Ap

[jira] [Created] (HUDI-7108) Ensure schema is refreshed for every batch when using KafkaAvroSchemaDeserializer

2023-11-16 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7108: - Summary: Ensure schema is refreshed for every batch when using KafkaAvroSchemaDeserializer Key: HUDI-7108 URL: https://issues.apache.org/jira/browse/HUDI-7108 Proje

[jira] [Updated] (HUDI-7106) Fix SQS deletes logic for S3 events source.

2023-11-16 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7106: -- Priority: Critical (was: Major) > Fix SQS deletes logic for S3 events source. > ---

[jira] [Assigned] (HUDI-7106) Fix SQS deletes logic for S3 events source.

2023-11-16 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7106: - Assignee: Rajesh Mahindra > Fix SQS deletes logic for S3 events source. > ---

[jira] [Updated] (HUDI-7106) Fix SQS deletes logic for S3 events source.

2023-11-16 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7106: -- Affects Version/s: 0.14.1 > Fix SQS deletes logic for S3 events source. > --

[jira] [Created] (HUDI-7106) Fix SQS deletes logic for S3 events source.

2023-11-16 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7106: - Summary: Fix SQS deletes logic for S3 events source. Key: HUDI-7106 URL: https://issues.apache.org/jira/browse/HUDI-7106 Project: Apache Hudi Issue Type: B

[jira] [Updated] (HUDI-7052) Fix partition key validation for key generators.

2023-11-08 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-7052: -- Summary: Fix partition key validation for key generators. (was: Fix partition key validation fo

[jira] [Created] (HUDI-7052) Fix partition key validation for custom payloads

2023-11-08 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-7052: - Summary: Fix partition key validation for custom payloads Key: HUDI-7052 URL: https://issues.apache.org/jira/browse/HUDI-7052 Project: Apache Hudi Issue Ty

[jira] [Assigned] (HUDI-7052) Fix partition key validation for custom payloads

2023-11-08 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-7052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-7052: - Assignee: Rajesh Mahindra > Fix partition key validation for custom payloads > --

[jira] [Created] (HUDI-6406) Pass in Spark Engine Context Wrapper for DeltaSync instead of spark engine context

2023-06-17 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-6406: - Summary: Pass in Spark Engine Context Wrapper for DeltaSync instead of spark engine context Key: HUDI-6406 URL: https://issues.apache.org/jira/browse/HUDI-6406 Proj

[jira] [Created] (HUDI-5255) Estimate the actual record size for the first ingestion batch instead of using default

2022-11-21 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-5255: - Summary: Estimate the actual record size for the first ingestion batch instead of using default Key: HUDI-5255 URL: https://issues.apache.org/jira/browse/HUDI-5255

[jira] [Created] (HUDI-4963) Extend InProcessLockProvider to support multiple table ingestion

2022-09-30 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-4963: - Summary: Extend InProcessLockProvider to support multiple table ingestion Key: HUDI-4963 URL: https://issues.apache.org/jira/browse/HUDI-4963 Project: Apache Hudi

[jira] [Created] (HUDI-4960) Upgrade Jetty version for Timeline server

2022-09-30 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-4960: - Summary: Upgrade Jetty version for Timeline server Key: HUDI-4960 URL: https://issues.apache.org/jira/browse/HUDI-4960 Project: Apache Hudi Issue Type: Imp

[jira] [Commented] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp paritioning field

2022-09-19 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606735#comment-17606735 ] Rajesh Mahindra commented on HUDI-4430: --- [~alexey.kudinkin] Will help with this. >

[jira] [Assigned] (HUDI-4432) Checkpoint management for muti-writer scenario

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4432: - Assignee: Harshal Patil > Checkpoint management for muti-writer scenario > --

[jira] [Assigned] (HUDI-4452) Include hudi-aws to hudi-spark-bundle to fix cloudwatch reporter issue

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4452: - Assignee: Rahil Chertara > Include hudi-aws to hudi-spark-bundle to fix cloudwatch report

[jira] [Commented] (HUDI-4459) Corrupt parquet file created when syncing huge table with 4000+ fields,using hudi cow table with bulk_insert type

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571140#comment-17571140 ] Rajesh Mahindra commented on HUDI-4459: --- [~danny0405] can you help assign this ticke

[jira] [Assigned] (HUDI-4459) Corrupt parquet file created when syncing huge table with 4000+ fields,using hudi cow table with bulk_insert type

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4459: - Assignee: Danny Chen > Corrupt parquet file created when syncing huge table with 4000+ fi

[jira] [Assigned] (HUDI-4471) Relocate AWSDmsAvroPayload class to hudi-common

2022-07-25 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4471: - Assignee: Rahil Chertara > Relocate AWSDmsAvroPayload class to hudi-common >

[jira] [Assigned] (HUDI-4412) Multiple writers NPE when Insert_overwrite

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4412: - Assignee: liujinhui > Multiple writers NPE when Insert_overwrite > --

[jira] [Commented] (HUDI-4415) Support spark writer running on thrift server

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569838#comment-17569838 ] Rajesh Mahindra commented on HUDI-4415: --- [~minihippo] are you planning to work on it

[jira] [Updated] (HUDI-4418) Implement ProtoKafkaSource

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4418: -- Fix Version/s: 0.13.0 > Implement ProtoKafkaSource > -- > >

[jira] [Commented] (HUDI-4422) read parquet failed due to length is 0 or corrupt parquet file

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569836#comment-17569836 ] Rajesh Mahindra commented on HUDI-4422: --- [~JinxinTang] Feel free to raise the PR aft

[jira] [Assigned] (HUDI-4429) Make Spark 3.1.3 the default profile

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4429: - Assignee: Rahil Chertara > Make Spark 3.1.3 the default profile > --

[jira] [Comment Edited] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp paritioning field

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569833#comment-17569833 ] Rajesh Mahindra edited comment on HUDI-4430 at 7/22/22 6:30 AM:

[jira] [Commented] (HUDI-4430) Incorrect type casting while reading HUDI table created with CustomKeyGenerator and unixtimestamp paritioning field

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569833#comment-17569833 ] Rajesh Mahindra commented on HUDI-4430: --- Looks like your input column is of type str

[jira] [Assigned] (HUDI-4434) Disable EMRFS and EMR spark related properties

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4434: - Assignee: Rahil Chertara > Disable EMRFS and EMR spark related properties >

[jira] [Assigned] (HUDI-4440) Treat boostrapped table as non-partitioned in HudiFileIndex if partition column is missing from schema

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4440: - Assignee: Rahil Chertara > Treat boostrapped table as non-partitioned in HudiFileIndex if

[jira] [Assigned] (HUDI-4439) Fix Amazon CloudWatch reporter for metadata enabled tables

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4439: - Assignee: Rahil Chertara > Fix Amazon CloudWatch reporter for metadata enabled tables > -

[jira] [Updated] (HUDI-4442) Converting from json to avro does not sanitize field names

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4442: -- Fix Version/s: 0.12.0 > Converting from json to avro does not sanitize field names > ---

[jira] [Updated] (HUDI-4443) Add DeltaStreamer support for AWS managed Kafka (MSK)

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4443: -- Labels: blocker (was: ) > Add DeltaStreamer support for AWS managed Kafka (MSK) >

[jira] [Updated] (HUDI-4443) Add DeltaStreamer support for AWS managed Kafka (MSK)

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4443: -- Fix Version/s: 0.13.0 > Add DeltaStreamer support for AWS managed Kafka (MSK) > ---

[jira] [Updated] (HUDI-4445) Fix few things related to S3 Incremental Source

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-4445: -- Description: # Decode file resource url before operating on it. # Fix serializability of hadoop

[jira] [Assigned] (HUDI-4448) Remove the latest commit refresh for timeline server

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4448: - Assignee: Danny Chen > Remove the latest commit refresh for timeline server > ---

[jira] [Assigned] (HUDI-4450) Revert the checkpoint abort notification

2022-07-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-4450: - Assignee: Danny Chen > Revert the checkpoint abort notification > ---

[jira] [Commented] (HUDI-3941) Add extrametadata and commitActiontype to write callback

2022-04-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525836#comment-17525836 ] Rajesh Mahindra commented on HUDI-3941: --- [~shivnarayan] i have a local PR for this,

[jira] [Assigned] (HUDI-3941) Add extrametadata and commitActiontype to write callback

2022-04-21 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-3941: - Assignee: sivabalan narayanan > Add extrametadata and commitActiontype to write callback

[jira] [Created] (HUDI-3941) Add extrametadata and commitActiontype to write callback

2022-04-21 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-3941: - Summary: Add extrametadata and commitActiontype to write callback Key: HUDI-3941 URL: https://issues.apache.org/jira/browse/HUDI-3941 Project: Apache Hudi

[jira] [Updated] (HUDI-2854) Harden Toast support for Postgres

2022-03-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2854: -- Fix Version/s: 0.12.0 > Harden Toast support for Postgres > - >

[jira] [Assigned] (HUDI-2854) Harden Toast support for Postgres

2022-03-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra reassigned HUDI-2854: - Assignee: Rajesh Mahindra > Harden Toast support for Postgres > -

[jira] [Updated] (HUDI-2854) Harden Toast support for Postgres

2022-03-04 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2854: -- Priority: Blocker (was: Major) > Harden Toast support for Postgres > --

[jira] [Created] (HUDI-3562) Implement MoR table support for Debezium Source

2022-03-04 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-3562: - Summary: Implement MoR table support for Debezium Source Key: HUDI-3562 URL: https://issues.apache.org/jira/browse/HUDI-3562 Project: Apache Hudi Issue Typ

[jira] [Created] (HUDI-3557) Add support for Glue schema registry to deltastreamer and kafka sink connector

2022-03-03 Thread Rajesh Mahindra (Jira)
Rajesh Mahindra created HUDI-3557: - Summary: Add support for Glue schema registry to deltastreamer and kafka sink connector Key: HUDI-3557 URL: https://issues.apache.org/jira/browse/HUDI-3557 Project:

[jira] [Updated] (HUDI-3396) Make sure Spark reads only Projected Columns for both MOR/COW

2022-02-23 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3396: -- Status: Patch Available (was: In Progress) > Make sure Spark reads only Projected Columns for b

[jira] [Updated] (HUDI-3404) Disable metadata table by config with conditions

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3404: -- Sprint: Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-7, Hudi

[jira] [Updated] (HUDI-3449) Async compaction cannot proceed due to archived deltacommit in metadata table

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3449: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Async compaction can

[jira] [Updated] (HUDI-3354) Rebase `HoodieRealtimeRecordReader` to return `HoodieRecord`

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3354: -- Sprint: Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hud

[jira] [Updated] (HUDI-3207) Hudi Trino connector PR review

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3207: -- Sprint: Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Spr

[jira] [Updated] (HUDI-3466) Validate metadata table with col stats with long-running jobs

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3466: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Validate metadata ta

[jira] [Updated] (HUDI-3457) Refactor Spark Relations to avoid code duplication

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3457: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Refactor Spark Relat

[jira] [Updated] (HUDI-1127) Handling late arriving Deletes

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-1127: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Spri

[jira] [Updated] (HUDI-3368) Support metadata bloom index for secondary keys

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3368: -- Sprint: Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-7, Hudi

[jira] [Updated] (HUDI-3284) Restore hudi-presto-bundle changes and upgrade presto version in docker setup

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3284: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Spri

[jira] [Updated] (HUDI-3381) Rebase `HoodieMergeHandle` to operate on `HoodieRecord`

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3381: -- Sprint: Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hud

[jira] [Updated] (HUDI-2955) Upgrade Hadoop to 3.3.x

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2955: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Upgrade Hadoop to 3.

[jira] [Updated] (HUDI-2439) Refactor table.action.commit package (CommitActionExecutors) in hudi-client module

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2439: -- Sprint: Hudi-Sprint-Jan-18, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Jan-18, Hu

[jira] [Updated] (HUDI-3356) Conversion of write stats to metadata index records should use HoodieData throughout

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3356: -- Sprint: Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hud

[jira] [Updated] (HUDI-3382) Support removal of bloom and column stats indexes

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3382: -- Sprint: Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-7, Hudi

[jira] [Updated] (HUDI-3088) Make Spark 3 the default profile for build and test

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3088: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Spri

[jira] [Updated] (HUDI-2732) Spark Datasource V2 integration RFC

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2732: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Spri

[jira] [Updated] (HUDI-3175) Support INDEX action for async metadata indexing

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3175: -- Sprint: Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Spr

[jira] [Updated] (HUDI-3142) Metadata new Indices initialization during table creation

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3142: -- Sprint: Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hud

[jira] [Updated] (HUDI-2757) Support AWS Glue API for metastore sync

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2757: -- Sprint: Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-7, Hudi

[jira] [Updated] (HUDI-3341) Investigate that metadata table cannot be read for hadoop-aws 2.7.x

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3341: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Spri

[jira] [Updated] (HUDI-3246) Blog on Kafka Connect Sink for Hudi

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3246: -- Sprint: Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Spr

[jira] [Updated] (HUDI-3074) Docs for Z-order

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3074: -- Sprint: Hudi-Sprint-Jan-3, Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Spri

[jira] [Updated] (HUDI-2965) Fix layout optimization to appropriately handle nested columns references

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-2965: -- Sprint: Hudi-Sprint-Jan-3, Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Spri

[jira] [Updated] (HUDI-3208) Come up with rollout plan for enabling metadata table by default in 0.11

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3208: -- Sprint: Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Spr

[jira] [Updated] (HUDI-1623) Support start_commit_time & end_commit_times for serializable incremental pull

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-1623: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Support start_commit

[jira] [Updated] (HUDI-3161) Add Call Produce Command for spark sql

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3161: -- Sprint: Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-7, Hudi

[jira] [Updated] (HUDI-3396) Make sure Spark reads only Projected Columns for both MOR/COW

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3396: -- Sprint: Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-7, Hudi

[jira] [Updated] (HUDI-3225) RFC for Async Metadata Index

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3225: -- Sprint: Hudi-Sprint-Jan-10, Hudi-Sprint-Jan-18, Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Spr

[jira] [Updated] (HUDI-3221) Support querying a table as of a savepoint

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3221: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Spri

[jira] [Updated] (HUDI-3258) Support multiple metadata index partitions - bloom and column stats

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3258: -- Sprint: Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-7, Hudi

[jira] [Updated] (HUDI-3465) Add col stats validation in HoodieMetadataTableValidator

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3465: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Add col stats valida

[jira] [Updated] (HUDI-3203) Meta bloom index should use the bloom filter type property to construct back the bloom filter instant

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3203: -- Sprint: Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hud

[jira] [Updated] (HUDI-3349) Revisit HoodieRecord API to be able to replace HoodieRecordPayload

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3349: -- Sprint: Hudi-Sprint-Jan-24, Hudi-Sprint-Jan-31, Hudi-Sprint-Feb-7, Hudi-Sprint-Feb-14, Hudi-Spri

[jira] [Updated] (HUDI-3443) Async clustering cannot proceed due to archived deltacommit in metadata table

2022-02-22 Thread Rajesh Mahindra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Mahindra updated HUDI-3443: -- Sprint: Hudi-Sprint-Feb-14, Hudi-Sprint-Feb-22 (was: Hudi-Sprint-Feb-14) > Async clustering can
