[GitHub] [hudi] RajasekarSribalan commented on issue #1794: [SUPPORT] Hudi delete operation but HiveSync failed

2020-07-06 Thread GitBox
RajasekarSribalan commented on issue #1794: URL: https://github.com/apache/hudi/issues/1794#issuecomment-654588991 @bhasudha just an update, our jobs are not failing but we get this error for hard delete operation and below is the command that we use on dataframe for delete operation.

[GitHub] [hudi] RajasekarSribalan edited a comment on issue #1794: [SUPPORT] Hudi delete operation but HiveSync failed

2020-07-06 Thread GitBox
RajasekarSribalan edited a comment on issue #1794: URL: https://github.com/apache/hudi/issues/1794#issuecomment-654588991 @bhasudha just an update, our jobs are not failing but we get this error for hard delete operation and below is the command that we use on dataframe for delete

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #331

2020-07-06 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.35 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml

[jira] [Commented] (HUDI-691) hoodie.*.consume.* should be set whitelist in hive-site.xml

2020-07-06 Thread GarudaGuo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152462#comment-17152462 ] GarudaGuo commented on HUDI-691: [~bhavanisudha] which doc should I append the issue. thx. >

[GitHub] [hudi] zherenyu831 edited a comment on issue #1798: Question reading partition path with less level is more faster than what document mentioned

2020-07-06 Thread GitBox
zherenyu831 edited a comment on issue #1798: URL: https://github.com/apache/hudi/issues/1798#issuecomment-654575639 @bhasudha It is a very simple query for testing ``` //val df = spark.read.format("org.apache.hudi").load("s3://test/data/*/*/*/*") val df =

[GitHub] [hudi] zherenyu831 edited a comment on issue #1798: Question reading partition path with less level is more faster than what document mentioned

2020-07-06 Thread GitBox
zherenyu831 edited a comment on issue #1798: URL: https://github.com/apache/hudi/issues/1798#issuecomment-654575639 @bhasudha It is very simple query ``` //val df = spark.read.format("org.apache.hudi").load("s3://test/data/*/*/*/*") val df =

[GitHub] [hudi] zherenyu831 edited a comment on issue #1798: Question reading partition path with less level is more faster than what document mentioned

2020-07-06 Thread GitBox
zherenyu831 edited a comment on issue #1798: URL: https://github.com/apache/hudi/issues/1798#issuecomment-654575639 @bhasudha It is a very simple query ``` //val df = spark.read.format("org.apache.hudi").load("s3://test/data/*/*/*/*") val df =

[GitHub] [hudi] zherenyu831 commented on issue #1798: Question reading partition path with less level is more faster than what document mentioned

2020-07-06 Thread GitBox
zherenyu831 commented on issue #1798: URL: https://github.com/apache/hudi/issues/1798#issuecomment-654575639 @bhasudha It is very simple query ``` val df = spark.read.format("org.apache.hudi").load("s3://test/data/*/*/*") val updatedDf = df.filter("_hoodie_commit_time between

[GitHub] [hudi] garyli1019 commented on a change in pull request #1722: [HUDI-69] Support Spark Datasource for MOR table

2020-07-06 Thread GitBox
garyli1019 commented on a change in pull request #1722: URL: https://github.com/apache/hudi/pull/1722#discussion_r450584388 ## File path: hudi-spark/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieMergedParquetRowIterator.scala ## @@ -0,0 +1,178 @@ +/*

[GitHub] [hudi] garyli1019 commented on a change in pull request #1722: [HUDI-69] Support Spark Datasource for MOR table

2020-07-06 Thread GitBox
garyli1019 commented on a change in pull request #1722: URL: https://github.com/apache/hudi/pull/1722#discussion_r450584271 ## File path: hudi-spark/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieParquetRealtimeFileFormat.scala ## @@ -0,0 +1,188 @@

[GitHub] [hudi] zuyanton commented on issue #1790: [SUPPORT] Querying MoR tables with DecimalType columns via Spark SQL fails

2020-07-06 Thread GitBox
zuyanton commented on issue #1790: URL: https://github.com/apache/hudi/issues/1790#issuecomment-654566959 @bhasudha Thank you for your reply. If I read the code correctly, I believe that handling decimal is missing here

[GitHub] [hudi] prashanthpdesai commented on issue #1775: INCREMETNAL QUERY-Null value Exception

2020-07-06 Thread GitBox
prashanthpdesai commented on issue #1775: URL: https://github.com/apache/hudi/issues/1775#issuecomment-654559192 @bhasudha : Hi , tried with same packages which you have mentioned above , we see diff kind of error . Please find the trace below . **spark-shell --queue queue_q1

[jira] [Commented] (HUDI-979) AWSDMSPayload delete handling with MOR

2020-07-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152414#comment-17152414 ] sivabalan narayanan commented on HUDI-979: [~309637554] : sure. go ahead :+1: > AWSDMSPayload

[jira] [Assigned] (HUDI-979) AWSDMSPayload delete handling with MOR

2020-07-06 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-979: Assignee: liwei (was: sivabalan narayanan) > AWSDMSPayload delete handling with

[jira] [Commented] (HUDI-979) AWSDMSPayload delete handling with MOR

2020-07-06 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152411#comment-17152411 ] liwei commented on HUDI-979: [~xleesf] [~shivnarayan]  i can do this issue. we have fix in our inner branch >

[GitHub] [hudi] tooptoop4 commented on issue #1802: [SUPPORT] Delete gives Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file

2020-07-06 Thread GitBox
tooptoop4 commented on issue #1802: URL: https://github.com/apache/hudi/issues/1802#issuecomment-654539478 int64 from upsert but binary from delete. once binary input data had int64 it worked This is an automated message

[GitHub] [hudi] tooptoop4 closed issue #1802: [SUPPORT] Delete gives Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file

2020-07-06 Thread GitBox
tooptoop4 closed issue #1802: URL: https://github.com/apache/hudi/issues/1802 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [hudi] tooptoop4 closed issue #1799: [SUPPORT] NPE at org.apache.hudi.table.HoodieCommitArchiveLog.lambda$getInstantsToArchive

2020-07-06 Thread GitBox
tooptoop4 closed issue #1799: URL: https://github.com/apache/hudi/issues/1799

[GitHub] [hudi] tooptoop4 commented on issue #1801: [SUPPORT] org.apache.avro.AvroTypeException: Found com.uber.hoodie.avro.model.HoodieCleanMetadata, expecting org.apache.hudi.avro.model.HoodieCleane

2020-07-06 Thread GitBox
tooptoop4 commented on issue #1801: URL: https://github.com/apache/hudi/issues/1801#issuecomment-654534604 just warning, closing

[GitHub] [hudi] tooptoop4 closed issue #1801: [SUPPORT] org.apache.avro.AvroTypeException: Found com.uber.hoodie.avro.model.HoodieCleanMetadata, expecting org.apache.hudi.avro.model.HoodieCleanerPlan,

2020-07-06 Thread GitBox
tooptoop4 closed issue #1801: URL: https://github.com/apache/hudi/issues/1801

[jira] [Assigned] (HUDI-472) Make sortBy() inside bulkInsertInternal() configurable for bulk_insert

2020-07-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-472: Assignee: Ethan Guo (was: sivabalan narayanan) > Make sortBy() inside bulkInsertInternal()

[jira] [Assigned] (HUDI-1014) Design and Implement upgrade-downgrade infrastrucutre

2020-07-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar reassigned HUDI-1014: Assignee: sivabalan narayanan (was: Balaji Varadarajan) > Design and Implement

[jira] [Updated] (HUDI-802) AWSDmsTransformer does not handle insert -> delete of a row in a single batch correctly

2020-07-06 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-802: Status: Patch Available (was: In Progress) > AWSDmsTransformer does not handle insert -> delete of

[GitHub] [hudi] bhasudha commented on issue #1803: [SUPPORT] hoodie.datasource.write.precombine.field is ignored

2020-07-06 Thread GitBox
bhasudha commented on issue #1803: URL: https://github.com/apache/hudi/issues/1803#issuecomment-654521251 @joaqs190 quick questions: 1. could you describe what the precombine field is here? 2. Hudi uses two ways of writing - Spark datasource writer and Deltastreamer. For

[jira] [Updated] (HUDI-1054) Address performance issues with finalizing writes on S3

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1054: - Status: Patch Available (was: In Progress) > Address performance issues with finalizing

[jira] [Updated] (HUDI-1021) [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1021: - Status: Open (was: New) > [Bug] Unable to update bootstrapped table using rows from the

[jira] [Updated] (HUDI-1054) Address performance issues with finalizing writes on S3

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1054: - Status: Open (was: New) > Address performance issues with finalizing writes on S3 >

[jira] [Updated] (HUDI-999) Parallelize listing of Source dataset partitions

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-999: Status: In Progress (was: Open) > Parallelize listing of Source dataset partitions >

[jira] [Updated] (HUDI-1021) [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1021: - Fix Version/s: 0.6.0 > [Bug] Unable to update bootstrapped table using rows from the

[jira] [Updated] (HUDI-1021) [Bug] Unable to update bootstrapped table using rows from the written bootstrapped table

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1021: - Priority: Blocker (was: Major) > [Bug] Unable to update bootstrapped table using rows

[jira] [Updated] (HUDI-1054) Address performance issues with finalizing writes on S3

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1054: - Status: In Progress (was: Open) > Address performance issues with finalizing writes on

[jira] [Resolved] (HUDI-991) Bootstrap Implementation Bugs

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-991. Resolution: Duplicate > Bootstrap Implementation Bugs

[jira] [Assigned] (HUDI-992) For hive-style partitioned source data, partition columns synced with Hive will always have String type

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-992: Assignee: Udit Mehrotra > For hive-style partitioned source data, partition columns

[jira] [Commented] (HUDI-991) Bootstrap Implementation Bugs

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152373#comment-17152373 ] Balaji Varadarajan commented on HUDI-991: This is an umbrella ticket which is not needed. Closing

[jira] [Updated] (HUDI-1001) Add implementation to translate source partition paths when doing metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-1001: - Fix Version/s: (was: 0.6.0) 0.6.1 > Add implementation to

[jira] [Updated] (HUDI-971) Fix HFileBootstrapIndexReader.getIndexedPartitions() returns unclean partition name

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-971: Status: Open (was: New) > Fix HFileBootstrapIndexReader.getIndexedPartitions() returns

[jira] [Updated] (HUDI-619) Investigate and implement mechanism to have hive/presto/sparksql queries avoid stitching and return null values for hoodie columns

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-619: Fix Version/s: (was: 0.6.0) 0.6.1 > Investigate and implement

[jira] [Updated] (HUDI-808) Support for cleaning source data

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-808: Priority: Blocker (was: Major) > Support for cleaning source data >

[jira] [Updated] (HUDI-991) Bootstrap Implementation Bugs

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-991: Status: Open (was: New) > Bootstrap Implementation Bugs

[jira] [Assigned] (HUDI-971) Fix HFileBootstrapIndexReader.getIndexedPartitions() returns unclean partition name

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-971: Assignee: Balaji Varadarajan > Fix HFileBootstrapIndexReader.getIndexedPartitions()

[jira] [Updated] (HUDI-954) Test COW : Presto Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-954: Fix Version/s: (was: 0.6.0) 0.6.1 > Test COW : Presto Read Optimized

[jira] [Updated] (HUDI-808) Support for cleaning source data

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-808: Status: Open (was: New) > Support for cleaning source data >

[jira] [Assigned] (HUDI-808) Support for cleaning source data

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-808: Assignee: Wenning Ding (was: Udit Mehrotra) > Support for cleaning source data

[jira] [Updated] (HUDI-956) Test COW : Presto Realtime Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-956: Fix Version/s: (was: 0.6.0) 0.6.1 > Test COW : Presto Realtime Query

[jira] [Updated] (HUDI-955) Test MOR : Presto Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-955: Fix Version/s: (was: 0.6.0) 0.6.1 > Test MOR : Presto Read Optimized

[jira] [Updated] (HUDI-808) Support for cleaning source data

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-808: Fix Version/s: 0.6.0 > Support for cleaning source data

[jira] [Updated] (HUDI-621) Presto Integration for supporting Bootstrapped table

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-621: Fix Version/s: 0.6.1 > Presto Integration for supporting Bootstrapped table >

[jira] [Updated] (HUDI-828) Open Questions before merging Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-828: Fix Version/s: 0.6.0 > Open Questions before merging Bootstrap >

[jira] [Resolved] (HUDI-828) Open Questions before merging Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-828. - Resolution: Fixed > Open Questions before merging Bootstrap >

[jira] [Resolved] (HUDI-828) Open Questions before merging Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-828. - Resolution: Duplicate > Open Questions before merging Bootstrap >

[jira] [Reopened] (HUDI-828) Open Questions before merging Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reopened HUDI-828: > Open Questions before merging Bootstrap

[jira] [Updated] (HUDI-828) Open Questions before merging Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-828: Status: In Progress (was: Open) > Open Questions before merging Bootstrap >

[jira] [Updated] (HUDI-428) Web documentation for explaining how to bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-428: Priority: Blocker (was: Major) > Web documentation for explaining how to bootstrap >

[jira] [Updated] (HUDI-953) Test COW : Spark Data Source Read Optimized Queries

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-953: Status: In Progress (was: Open) > Test COW : Spark Data Source Read Optimized Queries >

[jira] [Commented] (HUDI-828) Open Questions before merging Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152371#comment-17152371 ] Balaji Varadarajan commented on HUDI-828: - This is part of the PR now and has been discussed. >

[jira] [Updated] (HUDI-953) Test COW : Spark Data Source Read Optimized Queries

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-953: Fix Version/s: 0.60 > Test COW : Spark Data Source Read Optimized Queries >

[jira] [Updated] (HUDI-953) Test COW : Spark Data Source Read Optimized Queries

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-953: Priority: Blocker (was: Major) > Test COW : Spark Data Source Read Optimized Queries >

[jira] [Resolved] (HUDI-953) Test COW : Spark Data Source Read Optimized Queries

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-953. - Resolution: Fixed > Test COW : Spark Data Source Read Optimized Queries >

[jira] [Updated] (HUDI-950) Test COW : Spark SQL Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-950: Priority: Blocker (was: Major) > Test COW : Spark SQL Read Optimized Query with metadata

[jira] [Updated] (HUDI-949) Test MOR : Hive Realtime Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-949: Priority: Blocker (was: Major) > Test MOR : Hive Realtime Query with metadata bootstrap >

[jira] [Updated] (HUDI-950) Test COW : Spark SQL Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-950: Fix Version/s: 0.6.0 > Test COW : Spark SQL Read Optimized Query with metadata bootstrap >

[jira] [Updated] (HUDI-952) Test MOR : Spark SQL Realtime Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-952: Priority: Blocker (was: Major) > Test MOR : Spark SQL Realtime Query with metadata

[jira] [Updated] (HUDI-949) Test MOR : Hive Realtime Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-949: Fix Version/s: 0.6.0 > Test MOR : Hive Realtime Query with metadata bootstrap >

[jira] [Updated] (HUDI-951) Test MOR : Spark SQL Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-951: Priority: Blocker (was: Major) > Test MOR : Spark SQL Read Optimized Query with metadata

[jira] [Updated] (HUDI-951) Test MOR : Spark SQL Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-951: Fix Version/s: 0.6.0 > Test MOR : Spark SQL Read Optimized Query with metadata bootstrap >

[jira] [Updated] (HUDI-915) Partition Columns missing in files upserted after Metadata Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-915: Priority: Blocker (was: Major) > Partition Columns missing in files upserted after Metadata

[jira] [Updated] (HUDI-915) Partition Columns missing in files upserted after Metadata Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-915: Fix Version/s: 0.6.0 > Partition Columns missing in files upserted after Metadata Bootstrap

[jira] [Updated] (HUDI-952) Test MOR : Spark SQL Realtime Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-952: Fix Version/s: 0.6.0 > Test MOR : Spark SQL Realtime Query with metadata bootstrap >

[jira] [Updated] (HUDI-948) Test MOR : Hive Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-948: Priority: Blocker (was: Major) > Test MOR : Hive Read Optimized Query with metadata

[jira] [Updated] (HUDI-947) Test COW : Hive Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-947: Fix Version/s: 0.6.0 > Test COW : Hive Read Optimized Query with metadata bootstrap >

[jira] [Updated] (HUDI-948) Test MOR : Hive Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-948: Fix Version/s: 0.6.0 > Test MOR : Hive Read Optimized Query with metadata bootstrap >

[jira] [Updated] (HUDI-947) Test COW : Hive Read Optimized Query with metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-947: Priority: Blocker (was: Major) > Test COW : Hive Read Optimized Query with metadata

[jira] [Updated] (HUDI-900) Metadata Bootstrap Key Generator needs to handle complex keys correctly

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-900: Priority: Blocker (was: Major) > Metadata Bootstrap Key Generator needs to handle complex

[jira] [Updated] (HUDI-946) Metadata Bootstrap Query Testing Master TIcket

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-946: Priority: Blocker (was: Major) > Metadata Bootstrap Query Testing Master TIcket >

[jira] [Updated] (HUDI-899) Add a knob to change partition-path style while performing metadata bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-899: Fix Version/s: (was: 0.6.0) 0.6.1 > Add a knob to change

[jira] [Resolved] (HUDI-429) Long Running Testing to certify Bootstrapping

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan resolved HUDI-429. - Resolution: Fixed > Long Running Testing to certify Bootstrapping >

[jira] [Updated] (HUDI-429) Long Running Testing to certify Bootstrapping

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-429: Priority: Blocker (was: Major) > Long Running Testing to certify Bootstrapping >

[jira] [Updated] (HUDI-427) Implement CLI support for performing bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-427: Priority: Blocker (was: Major) > Implement CLI support for performing bootstrap >

[jira] [Updated] (HUDI-620) Hive Sync Integration of bootstrapped table

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-620: Priority: Blocker (was: Major) > Hive Sync Integration of bootstrapped table >

[jira] [Updated] (HUDI-418) Bootstrap Index - Implementation

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-418: Priority: Blocker (was: Major) > Bootstrap Index - Implementation >

[jira] [Updated] (HUDI-420) Automated end to end Integration Test

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-420: Priority: Blocker (was: Major) > Automated end to end Integration Test >

[jira] [Updated] (HUDI-422) Cleanup bootstrap code and create write APIs for supporting bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-422: Priority: Blocker (was: Major) > Cleanup bootstrap code and create write APIs for

[jira] [Updated] (HUDI-424) Implement Hive Query Side Integration for querying tables containing bootstrap file slices

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-424: Priority: Blocker (was: Major) > Implement Hive Query Side Integration for querying tables

[jira] [Updated] (HUDI-426) Implement Spark DataSource Support for querying bootstrapped tables

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-426: Priority: Blocker (was: Major) > Implement Spark DataSource Support for querying

[jira] [Updated] (HUDI-423) Implement upsert functionality for handling updates to these bootstrap file slices

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-423: Priority: Blocker (was: Major) > Implement upsert functionality for handling updates to

[jira] [Updated] (HUDI-425) Implement support for bootstrapping in HoodieDeltaStreamer

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-425: Priority: Blocker (was: Major) > Implement support for bootstrapping in HoodieDeltaStreamer

[jira] [Updated] (HUDI-421) Cleanup bootstrap code and create PR for FileStystemView changes

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-421: Priority: Blocker (was: Major) > Cleanup bootstrap code and create PR for FileStystemView

[jira] [Updated] (HUDI-417) Refactor HoodieWriteClient so that commit logic can be shareable by both bootstrap and normal write operations

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-417: Priority: Blocker (was: Major) > Refactor HoodieWriteClient so that commit logic can be

[jira] [Updated] (HUDI-419) Basic Implementation for verifying if bootstrapping works end to end

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-419: Priority: Blocker (was: Major) > Basic Implementation for verifying if bootstrapping works

[jira] [Created] (HUDI-1077) Integration tests to validate clustering

2020-07-06 Thread satish (Jira)
satish created HUDI-1077: Summary: Integration tests to validate clustering Key: HUDI-1077 URL: https://issues.apache.org/jira/browse/HUDI-1077 Project: Apache Hudi Issue Type: Sub-task

[jira] [Created] (HUDI-1076) CLI tools to support clustering

2020-07-06 Thread satish (Jira)
satish created HUDI-1076: Summary: CLI tools to support clustering Key: HUDI-1076 URL: https://issues.apache.org/jira/browse/HUDI-1076 Project: Apache Hudi Issue Type: Sub-task Reporter:

[jira] [Created] (HUDI-1075) Implement a simple merge clustering strategy

2020-07-06 Thread satish (Jira)
satish created HUDI-1075: Summary: Implement a simple merge clustering strategy Key: HUDI-1075 URL: https://issues.apache.org/jira/browse/HUDI-1075 Project: Apache Hudi Issue Type: Sub-task

[jira] [Created] (HUDI-1074) implement merge-sort based clustering strategy

2020-07-06 Thread satish (Jira)
satish created HUDI-1074: Summary: implement merge-sort based clustering strategy Key: HUDI-1074 URL: https://issues.apache.org/jira/browse/HUDI-1074 Project: Apache Hudi Issue Type: Sub-task

[jira] [Created] (HUDI-1073) Implement skeleton to support multiple clustering strategies

2020-07-06 Thread satish (Jira)
satish created HUDI-1073: Summary: Implement skeleton to support multiple clustering strategies Key: HUDI-1073 URL: https://issues.apache.org/jira/browse/HUDI-1073 Project: Apache Hudi Issue Type:

[jira] [Created] (HUDI-1072) Reader changes to support clustering and insert overwrite

2020-07-06 Thread satish (Jira)
satish created HUDI-1072: Summary: Reader changes to support clustering and insert overwrite Key: HUDI-1072 URL: https://issues.apache.org/jira/browse/HUDI-1072 Project: Apache Hudi Issue Type:

[jira] [Assigned] (HUDI-1072) Reader changes to support clustering and insert overwrite

2020-07-06 Thread satish (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] satish reassigned HUDI-1072: Assignee: satish > Reader changes to support clustering and insert overwrite >

[jira] [Assigned] (HUDI-915) Partition Columns missing in files upserted after Metadata Bootstrap

2020-07-06 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-915: Assignee: Udit Mehrotra (was: Balaji Varadarajan) > Partition Columns missing in

[jira] [Updated] (HUDI-960) HFile Support for HUDI

2020-07-06 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-960: Labels: pull-request-available (was: ) > HFile Support for HUDI >

[GitHub] [hudi] prashantwason opened a new pull request #1804: [HUDI-960] Implementation of the HFile base and log file format.

2020-07-06 Thread GitBox
prashantwason opened a new pull request #1804: URL: https://github.com/apache/hudi/pull/1804 1. Includes HFileWriter and HFileReader 2. Includes HFileInputFormat for both snapshot and realtime input format for Hive 3. Unit test for new code 4. IT for using HFile format and
