[jira] [Closed] (HUDI-2451) HoodieTableMetaClient The file separator from Window to HDFS is faulty

2021-09-29 Thread yao.zhou (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yao.zhou closed HUDI-2451. -- > HoodieTableMetaClient The file separator from Window to HDFS is faulty >

[jira] [Assigned] (HUDI-2451) HoodieTableMetaClient The file separator from Window to HDFS is faulty

2021-09-29 Thread yao.zhou (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yao.zhou reassigned HUDI-2451: -- Assignee: yao.zhou > HoodieTableMetaClient The file separator from Window to HDFS is faulty >

[GitHub] [hudi] hudi-bot edited a comment on pull request #3674: [HUDI-2440] Add dependency change diff script for dependency governace

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3674: URL: https://github.com/apache/hudi/pull/3674#issuecomment-920690239 ## CI report: * 7c634fd5d815d4643732d0c26144171c9dd8e64c Azure:

[GitHub] [hudi] tangyoupeng commented on a change in pull request #3736: Add jfs support doc for hudi

2021-09-29 Thread GitBox
tangyoupeng commented on a change in pull request #3736: URL: https://github.com/apache/hudi/pull/3736#discussion_r719061631 ## File path: website/docs/jfs_hoodie.md ## @@ -0,0 +1,90 @@ +--- +title: JuiceFS keywords: [ hudi, hive, jfs, spark, flink] +summary: On this page, we

[GitHub] [hudi] leesf commented on a change in pull request #3736: Add jfs support doc for hudi

2021-09-29 Thread GitBox
leesf commented on a change in pull request #3736: URL: https://github.com/apache/hudi/pull/3736#discussion_r719056860 ## File path: website/docs/jfs_hoodie.md ## @@ -0,0 +1,90 @@ +--- +title: JuiceFS keywords: [ hudi, hive, jfs, spark, flink] +summary: On this page, we go

[hudi] branch master updated (2f07e12 -> def08d7)

2021-09-29 Thread leesf
This is an automated email from the ASF dual-hosted git repository. leesf pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 2f07e12 [MINOR] Fix typo Hooodie corrected to Hoodie & reuqired corrected to required (#3730) add def08d7

[GitHub] [hudi] leesf merged pull request #3729: Support JuiceFileSystem

2021-09-29 Thread GitBox
leesf merged pull request #3729: URL: https://github.com/apache/hudi/pull/3729 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] mauropelucchi commented on issue #2564: Hoodie clean is not deleting old files

2021-09-29 Thread GitBox
mauropelucchi commented on issue #2564: URL: https://github.com/apache/hudi/issues/2564#issuecomment-930774206 > @mauropelucchi : curious as to your table type choice. I see you are setting max delta commits to compact to 1. So, you might as well choose COW for easier operability.

[jira] [Assigned] (HUDI-2482) Support drop partitions SQL

2021-09-29 Thread Yann Byron (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yann Byron reassigned HUDI-2482: Assignee: Yann Byron > Support drop partitions SQL > --- > >

[jira] [Created] (HUDI-2503) HoodieFlinkWriteClient supports to allow parallel writing to tables using Locking service

2021-09-29 Thread Nicholas Jiang (Jira)
Nicholas Jiang created HUDI-2503: Summary: HoodieFlinkWriteClient supports to allow parallel writing to tables using Locking service Key: HUDI-2503 URL: https://issues.apache.org/jira/browse/HUDI-2503

[GitHub] [hudi] tangyoupeng commented on pull request #3736: Add jfs support doc for hudi

2021-09-29 Thread GitBox
tangyoupeng commented on pull request #3736: URL: https://github.com/apache/hudi/pull/3736#issuecomment-930715835 #3729

[GitHub] [hudi] tangyoupeng commented on a change in pull request #3729: Support JuiceFileSystem

2021-09-29 Thread GitBox
tangyoupeng commented on a change in pull request #3729: URL: https://github.com/apache/hudi/pull/3729#discussion_r719021678 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/StorageSchemes.java ## @@ -62,6 +62,8 @@ OBS("obs", false), // Kingsoft

[GitHub] [hudi] tangyoupeng opened a new pull request #3736: Add jfs support doc for hudi

2021-09-29 Thread GitBox
tangyoupeng opened a new pull request #3736: URL: https://github.com/apache/hudi/pull/3736 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contribute/how-to-contribute before opening a pull request.* ## What is the

[GitHub] [hudi] peanut-chenzhong commented on issue #3735: [SUPPORT] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-09-29 Thread GitBox
peanut-chenzhong commented on issue #3735: URL: https://github.com/apache/hudi/issues/3735#issuecomment-930703026 @n3nash could you kindly help check whether this is an issue? If yes, I can raise a PR to solve it soon.

[GitHub] [hudi] peanut-chenzhong opened a new issue #3735: [SUPPORT] OverwriteNonDefaultsWithLatestAvroPayload doesn`t work when upsert data with some null value column

2021-09-29 Thread GitBox
peanut-chenzhong opened a new issue #3735: URL: https://github.com/apache/hudi/issues/3735 To my understanding, if we use OverwriteNonDefaultsWithLatestAvroPayload, Hudi will update column by column. If the upsert data has some columns that are null, Hudi will ignore these columns and
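The column-by-column behaviour the reporter expects can be sketched as follows. This is only an illustration of that expectation, assuming plain dictionaries as records; it is not Hudi's implementation, and `merge_non_null` and the sample records are hypothetical names introduced here.

```python
from typing import Any, Dict


def merge_non_null(existing: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Merge an incoming record into the stored one, column by column.

    Hypothetical helper: a column in `incoming` whose value is None is
    ignored, so the stored value for that column is kept.
    """
    merged = dict(existing)  # start from the stored record, leave inputs untouched
    for column, value in incoming.items():
        if value is not None:
            merged[column] = value  # only non-null incoming values overwrite
    return merged


# Hypothetical sample data: the upsert carries a null "name" column.
stored = {"id": 1, "name": "alice", "age": 30}
upsert = {"id": 1, "name": None, "age": 31}
result = merge_non_null(stored, upsert)
```

Under this sketch the stored `name` survives the upsert while `age` is replaced, which is the outcome the issue says was expected.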

[GitHub] [hudi] hudi-bot edited a comment on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3668: URL: https://github.com/apache/hudi/pull/3668#issuecomment-919855741 ## CI report: * a423a63a9a530365b531a4199a5010dd52708d86 Azure:

[hudi] branch master updated (dd1bd62 -> 2f07e12)

2021-09-29 Thread vinoyang
This is an automated email from the ASF dual-hosted git repository. vinoyang pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from dd1bd62 [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource (#3413) add 2f07e12

[GitHub] [hudi] yanghua merged pull request #3730: [MINOR] Fix typo Hooodie corrected to Hoodie & reuqired corrected to required

2021-09-29 Thread GitBox
yanghua merged pull request #3730: URL: https://github.com/apache/hudi/pull/3730

[GitHub] [hudi] hudi-bot edited a comment on pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3734: URL: https://github.com/apache/hudi/pull/3734#issuecomment-930618920 ## CI report: * 3cec644131a4fda77510a97d548d3633b4731e78 Azure:

[GitHub] [hudi] hudi-bot edited a comment on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3668: URL: https://github.com/apache/hudi/pull/3668#issuecomment-919855741 ## CI report: * a423a63a9a530365b531a4199a5010dd52708d86 Azure:

[GitHub] [hudi] xiarixiaoyao commented on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution

2021-09-29 Thread GitBox
xiarixiaoyao commented on pull request #3668: URL: https://github.com/apache/hudi/pull/3668#issuecomment-930677502 @hudi-bot run azure

[GitHub] [hudi] hudi-bot edited a comment on pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3734: URL: https://github.com/apache/hudi/pull/3734#issuecomment-930618920 ## CI report: * 3cec644131a4fda77510a97d548d3633b4731e78 Azure:

[GitHub] [hudi] yihua commented on pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
yihua commented on pull request #3734: URL: https://github.com/apache/hudi/pull/3734#issuecomment-930657396 @hudi-bot run azure

[GitHub] [hudi] hudi-bot edited a comment on pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3734: URL: https://github.com/apache/hudi/pull/3734#issuecomment-930618920 ## CI report: * 3cec644131a4fda77510a97d548d3633b4731e78 Azure:

[GitHub] [hudi] nsivabalan commented on a change in pull request #3727: [HUDI-2497] Refactor clean, restore, and compaction actions in hudi-client module

2021-09-29 Thread GitBox
nsivabalan commented on a change in pull request #3727: URL: https://github.com/apache/hudi/pull/3727#discussion_r718849052 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/CompactHelpers.java ## @@ -0,0 +1,178 @@ +/* + *

[GitHub] [hudi] yanghua commented on pull request #3671: [HUDI-2418] add HiveSchemaProvider

2021-09-29 Thread GitBox
yanghua commented on pull request #3671: URL: https://github.com/apache/hudi/pull/3671#issuecomment-929787041

[GitHub] [hudi] hudi-bot edited a comment on pull request #3729: Support JuiceFileSystem

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3729: URL: https://github.com/apache/hudi/pull/3729#issuecomment-929788240

[GitHub] [hudi] hudi-bot commented on pull request #3729: Support JuiceFileSystem

2021-09-29 Thread GitBox
hudi-bot commented on pull request #3729: URL: https://github.com/apache/hudi/pull/3729#issuecomment-929788240 ## CI report: * c66e724bfa44a5147c8965bb585d07ac3b36 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] xushiyan merged pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

2021-09-29 Thread GitBox
xushiyan merged pull request #3413: URL: https://github.com/apache/hudi/pull/3413

[GitHub] [hudi] codope commented on issue #3242: Hive sync error by using run_sync_tool.sh JDODataStoreException: Error executing SQL query "select "DB_ID" from "DBS""

2021-09-29 Thread GitBox
codope commented on issue #3242: URL: https://github.com/apache/hudi/issues/3242#issuecomment-930129481 > @moranyuwen can you solve this problem? I have this problem, but I cannot solve it. I don't know how to set a MySQL database instead of Derby in the Hudi config in Spark code. @niloo-sh

[GitHub] [hudi] xiarixiaoyao commented on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution

2021-09-29 Thread GitBox
xiarixiaoyao commented on pull request #3668: URL: https://github.com/apache/hudi/pull/3668#issuecomment-930001505 @bvaradar @codope @leesf could you please help me review this PR again? Thanks. Code changes: 1) support mor(incremental/realtime/optimize) read/write 2) support cow

[GitHub] [hudi] stym06 commented on issue #2688: [SUPPORT] Sync to Hive using Metastore

2021-09-29 Thread GitBox
stym06 commented on issue #2688: URL: https://github.com/apache/hudi/issues/2688#issuecomment-929828308 Hi, I got it to work with Hive 3.1.2 by adding some jars to the classpath after finding out which classes were not found (mainly calcite, datanucleus)

[GitHub] [hudi] xushiyan closed issue #3641: [SUPPORT] Retrieving latest completed commit timestamp via HoodieTableMetaClient in PySpark

2021-09-29 Thread GitBox
xushiyan closed issue #3641: URL: https://github.com/apache/hudi/issues/3641

[GitHub] [hudi] davehagman commented on issue #3733: [SUPPORT] Periodic and sustained latency spikes during index lookup

2021-09-29 Thread GitBox
davehagman commented on issue #3733: URL: https://github.com/apache/hudi/issues/3733#issuecomment-930488974 I would also like to show some graphs comparing the baseline and what they look like during these latency spikes. This is a graph of the number of files Created vs. Updated

[GitHub] [hudi] xushiyan commented on issue #3641: [SUPPORT] Retrieving latest completed commit timestamp via HoodieTableMetaClient in PySpark

2021-09-29 Thread GitBox
xushiyan commented on issue #3641: URL: https://github.com/apache/hudi/issues/3641#issuecomment-929916500 OK @bryanburke, I think your approach is valid. The metaclient APIs should be quite stable, and even if they change, there should be a deprecation period to allow for the transition. You may

[GitHub] [hudi] nsivabalan commented on issue #2564: Hoodie clean is not deleting old files

2021-09-29 Thread GitBox
nsivabalan commented on issue #2564: URL: https://github.com/apache/hudi/issues/2564#issuecomment-930088089

[GitHub] [hudi] xushiyan commented on a change in pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

2021-09-29 Thread GitBox
xushiyan commented on a change in pull request #3413: URL: https://github.com/apache/hudi/pull/3413#discussion_r718245885 ## File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieTestDataGenerator.java ## @@ -129,10 +129,12 @@ public static final

[GitHub] [hudi] xiarixiaoyao commented on pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive

2021-09-29 Thread GitBox
xiarixiaoyao commented on pull request #3203: URL: https://github.com/apache/hudi/pull/3203#issuecomment-929873306 @danny0405 @leesf thanks for your review, I will update the code.

[GitHub] [hudi] nsivabalan edited a comment on issue #3499: [SUPPORT] Inline Clustering fails with Hudi

2021-09-29 Thread GitBox
nsivabalan edited a comment on issue #3499: URL: https://github.com/apache/hudi/issues/3499#issuecomment-930269121 @codejoyan : what's the key generator you are using? Can you confirm you are setting those params (key gen, record key, partition path) while setting these clustering configs

[GitHub] [hudi] hudi-bot commented on pull request #3732: [HUDI-2499] Make jdbc-url, user and pass as non-required for other sync-modes

2021-09-29 Thread GitBox
hudi-bot commented on pull request #3732: URL: https://github.com/apache/hudi/pull/3732#issuecomment-930199853 ## CI report: * 546b457005237b5f7c2c51b9e1aef4b121110dc1 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] hudi-bot edited a comment on pull request #3668: [RFC-33] [HUDI-2429][WIP] Full schema evolution

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3668: URL: https://github.com/apache/hudi/pull/3668#issuecomment-919855741

[GitHub] [hudi] hudi-bot commented on pull request #3730: [MINOR] Fix typo,'Hooodie' corrected to 'Hoodie' & 'reuqired' corrected to 'required'

2021-09-29 Thread GitBox
hudi-bot commented on pull request #3730: URL: https://github.com/apache/hudi/pull/3730#issuecomment-929947925 ## CI report: * f7f969d842ed6181f8c4cf1094ed9fff383d7d60 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] codope edited a comment on issue #3713: [SUPPORT] Cannot read from Hudi table created by same Spark job

2021-09-29 Thread GitBox
codope edited a comment on issue #3713: URL: https://github.com/apache/hudi/issues/3713#issuecomment-930112068 > The end-goal is to have a docker image which can be used to run tests locally on a dev machine. For this I would suggest to give our readymade [docker

[GitHub] [hudi] zhangyue19921010 commented on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

2021-09-29 Thread GitBox
zhangyue19921010 commented on pull request #3413: URL: https://github.com/apache/hudi/pull/3413#issuecomment-929987870 Hi @xushiyan, thanks a lot for your attention and review. My bad for misunderstanding :) The code has been changed and I am waiting for CI/CD to go green.

[GitHub] [hudi] tangyoupeng commented on a change in pull request #3729: Support JuiceFileSystem

2021-09-29 Thread GitBox
tangyoupeng commented on a change in pull request #3729: URL: https://github.com/apache/hudi/pull/3729#discussion_r718202955 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/StorageSchemes.java ## @@ -62,6 +62,8 @@ OBS("obs", false), // Kingsoft

[GitHub] [hudi] codope commented on issue #3607: [SUPPORT]Presto query hudi data with metadata table enable un-successfully.

2021-09-29 Thread GitBox
codope commented on issue #3607: URL: https://github.com/apache/hudi/issues/3607#issuecomment-930110816 @zhangyue19921010 You're right. I think you're talking about #3623 Just left a small comment there. If that works let's merge it. I can also help in testing the dependency changes.

[GitHub] [hudi] codope commented on a change in pull request #3623: [WIP][HUDI-2409] Using HBase shaded jars in Hudi presto bundle

2021-09-29 Thread GitBox
codope commented on a change in pull request #3623: URL: https://github.com/apache/hudi/pull/3623#discussion_r718434282 ## File path: packaging/hudi-presto-bundle/pom.xml ## @@ -158,11 +158,66 @@ org.apache.hudi hudi-common ${project.version} + +

[GitHub] [hudi] nsivabalan commented on issue #3297: HoodieMetadataException throwed when execute merge and hoodie.metadata.enable='true'

2021-09-29 Thread GitBox
nsivabalan commented on issue #3297: URL: https://github.com/apache/hudi/issues/3297#issuecomment-930255663 @zxding : can you respond to Sagar's question above? I am going to try reproducing this issue. I would appreciate it if you could give more specifics.

[GitHub] [hudi] yihua commented on pull request #3727: [HUDI-2497] Refactor clean, restore, and compaction actions in hudi-client module

2021-09-29 Thread GitBox
yihua commented on pull request #3727: URL: https://github.com/apache/hudi/pull/3727#issuecomment-930614238

[GitHub] [hudi] zhangyue19921010 commented on a change in pull request #3719: [HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTableFileSystemView, aiming to reduce unnecessary list/get requ

2021-09-29 Thread GitBox
zhangyue19921010 commented on a change in pull request #3719: URL: https://github.com/apache/hudi/pull/3719#discussion_r718205032 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieROTablePathFilter.java ## @@ -175,8 +181,12 @@ public boolean accept(Path

[GitHub] [hudi] hudi-bot edited a comment on pull request #3732: [HUDI-2499] Make jdbc-url, user and pass as non-required for other sync-modes

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3732: URL: https://github.com/apache/hudi/pull/3732#issuecomment-930199853

[GitHub] [hudi] xushiyan edited a comment on issue #3670: [SUPPORT] SQL stmt managed table, not update/delete with datasource API

2021-09-29 Thread GitBox
xushiyan edited a comment on issue #3670: URL: https://github.com/apache/hudi/issues/3670#issuecomment-930335250 I can replicate the issue in a local env: even with `ComplexKeyGenerator` set, datasource delete is not working, while delete via Spark SQL worked. Filing a JIRA to have detailed

[GitHub] [hudi] hudi-bot commented on pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
hudi-bot commented on pull request #3734: URL: https://github.com/apache/hudi/pull/3734#issuecomment-930618920 ## CI report: * 3cec644131a4fda77510a97d548d3633b4731e78 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] danny0405 commented on pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive

2021-09-29 Thread GitBox
danny0405 commented on pull request #3203: URL: https://github.com/apache/hudi/pull/3203#issuecomment-929775609 Changes the title and commit message to "[HUDI-2086] Redo the logic of mor incremental view for hive"

[GitHub] [hudi] codope commented on issue #2544: [SUPPORT]failed to read timestamp column in version 0.7.0 even when HIVE_SUPPORT_TIMESTAMP is enabled

2021-09-29 Thread GitBox
codope commented on issue #2544: URL: https://github.com/apache/hudi/issues/2544#issuecomment-930078959 This issue has been fixed and is no longer reproducible. Here's the gist with the latest master code: https://gist.github.com/codope/fea4455d84d37496e8f518afdc803795

[GitHub] [hudi] leesf commented on a change in pull request #3719: [HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTableFileSystemView, aiming to reduce unnecessary list/get requests

2021-09-29 Thread GitBox
leesf commented on a change in pull request #3719: URL: https://github.com/apache/hudi/pull/3719#discussion_r718137137 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieROTablePathFilter.java ## @@ -175,8 +181,12 @@ public boolean accept(Path path) {

[GitHub] [hudi] hudi-bot edited a comment on pull request #3730: [MINOR] Fix typo,'Hooodie' corrected to 'Hoodie' & 'reuqired' corrected to 'required'

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3730: URL: https://github.com/apache/hudi/pull/3730#issuecomment-929947925

[GitHub] [hudi] xiarixiaoyao commented on pull request #3330: [HUDI-2101][RFC-28]support z-order for hudi

2021-09-29 Thread GitBox
xiarixiaoyao commented on pull request #3330: URL: https://github.com/apache/hudi/pull/3330#issuecomment-930004681 Kindly pinging @vinothchandar. I have already rebased the code. Could you help me review it again? Thanks.

[GitHub] [hudi] nsivabalan commented on issue #2992: [SUPPORT] Insert_Override Api not working as expected in Hudi 0.7.0

2021-09-29 Thread GitBox
nsivabalan commented on issue #2992: URL: https://github.com/apache/hudi/issues/2992#issuecomment-930092640

[GitHub] [hudi] ibuda commented on issue #3728: [SUPPORT] Hudi Flink S3 Java Example

2021-09-29 Thread GitBox
ibuda commented on issue #3728: URL: https://github.com/apache/hudi/issues/3728#issuecomment-929883637 Thank you @danny0405 for the link. I am trying to set up an AWS Kinesis Application project, and as a start, I used the code provided by you in the link. Although I used the

[GitHub] [hudi] fengjian428 commented on pull request #3671: [HUDI-2418] add HiveSchemaProvider

2021-09-29 Thread GitBox
fengjian428 commented on pull request #3671: URL: https://github.com/apache/hudi/pull/3671#issuecomment-929816532 > @fengjian428 Check CI again? What do you mean? The checks below all passed.

[GitHub] [hudi] nsivabalan commented on issue #3499: [SUPPORT] Inline Clustering fails with Hudi

2021-09-29 Thread GitBox
nsivabalan commented on issue #3499: URL: https://github.com/apache/hudi/issues/3499#issuecomment-930269121 @codejoyan : what's the key generator you are using? Can you confirm you are setting those params (key gen, record key, partition path) while setting these clustering configs as

[GitHub] [hudi] xushiyan commented on issue #3670: [SUPPORT] SQL stmt managed table, not update/delete with datasource API

2021-09-29 Thread GitBox
xushiyan commented on issue #3670: URL: https://github.com/apache/hudi/issues/3670#issuecomment-930335250 I can replicate the issue in a local env: even with `ComplexKeyGenerator` set, datasource delete is not working, while delete via Spark SQL worked. Filing a JIRA to have detailed scripts

[GitHub] [hudi] parisni commented on issue #3731: [SUPPORT] Concurrent write (OCC) on distinct partitions random errors

2021-09-29 Thread GitBox
parisni commented on issue #3731: URL: https://github.com/apache/hudi/issues/3731#issuecomment-930043315 I tried digging into the source code; there is nothing about a partition-level lock. Apparently the lock mechanism works at the database+table level only.

[GitHub] [hudi] danny0405 commented on a change in pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive

2021-09-29 Thread GitBox
danny0405 commented on a change in pull request #3203: URL: https://github.com/apache/hudi/pull/3203#discussion_r718116742 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieMergedLogReader.java ## @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache

[GitHub] [hudi] leesf commented on a change in pull request #3729: Support JuiceFileSystem

2021-09-29 Thread GitBox
leesf commented on a change in pull request #3729: URL: https://github.com/apache/hudi/pull/3729#discussion_r718152109 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/StorageSchemes.java ## @@ -62,6 +62,8 @@ OBS("obs", false), // Kingsoft Standard

[GitHub] [hudi] hudi-bot edited a comment on pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3413: URL: https://github.com/apache/hudi/pull/3413#issuecomment-893311636

[GitHub] [hudi] sebastiantruj commented on issue #2688: [SUPPORT] Sync to Hive using Metastore

2021-09-29 Thread GitBox
sebastiantruj commented on issue #2688: URL: https://github.com/apache/hudi/issues/2688#issuecomment-929891629 Hi @stym06, could you post a spark-submit/spark-shell example of your workaround?

[GitHub] [hudi] xiarixiaoyao commented on a change in pull request #3203: [HUDI-2086] Redo the logical of mor_incremental_view for hive

2021-09-29 Thread GitBox
xiarixiaoyao commented on a change in pull request #3203: URL: https://github.com/apache/hudi/pull/3203#discussion_r718189151 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieParquetRealtimeInputFormat.java ## @@ -66,6 +90,138 @@ return

[GitHub] [hudi] yihua closed pull request #3727: [HUDI-2497] Refactor clean, restore, and compaction actions in hudi-client module

2021-09-29 Thread GitBox
yihua closed pull request #3727: URL: https://github.com/apache/hudi/pull/3727

[GitHub] [hudi] danny0405 commented on a change in pull request #3637: [HUDI-2301] Support Flink async compaction scheduling

2021-09-29 Thread GitBox
danny0405 commented on a change in pull request #3637: URL: https://github.com/apache/hudi/pull/3637#discussion_r718091146 ## File path: hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java ## @@ -430,14 +430,14 @@ private FlinkOptions() { public static

[GitHub] [hudi] zhangyue19921010 commented on issue #3607: [SUPPORT]Presto query hudi data with metadata table enable un-successfully.

2021-09-29 Thread GitBox
zhangyue19921010 commented on issue #3607: URL: https://github.com/apache/hudi/issues/3607#issuecomment-930028288 Are you able to query normal Hudi data tables using Presto? => Yes, normal hudi table works. Are you getting the NoClassDefFoundError only when you query Hudi metadata

[GitHub] [hudi] hudi-bot edited a comment on pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3734: URL: https://github.com/apache/hudi/pull/3734#issuecomment-930618920 ## CI report: * 3cec644131a4fda77510a97d548d3633b4731e78 Azure:

[GitHub] [hudi] codope commented on issue #3713: [SUPPORT] Cannot read from Hudi table created by same Spark job

2021-09-29 Thread GitBox
codope commented on issue #3713: URL: https://github.com/apache/hudi/issues/3713#issuecomment-930112068 > The end-goal is to have a docker image which can be used to run tests locally on a dev machine. For this I would suggest to give our readymade [docker

[GitHub] [hudi] codope commented on issue #2439: [SUPPORT] Unable to sync with external hive metastore via metastore uris in the thrift protocol

2021-09-29 Thread GitBox
codope commented on issue #2439: URL: https://github.com/apache/hudi/issues/2439#issuecomment-930059601 @rakeshramakrishnan For hive sync to work inline through Hudi, the hive-site.xml at /conf should also be placed under /conf and it should have the correct metastore uri. Can you check

[GitHub] [hudi] danny0405 commented on a change in pull request #3719: [HUDI-2489]Tuning HoodieROTablePathFilter by caching hoodieTableFileSystemView, aiming to reduce unnecessary list/get requests

2021-09-29 Thread GitBox
danny0405 commented on a change in pull request #3719: URL: https://github.com/apache/hudi/pull/3719#discussion_r718104393 ## File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieROTablePathFilter.java ## @@ -175,8 +181,12 @@ public boolean accept(Path path) {

[GitHub] [hudi] danny0405 commented on issue #3728: [SUPPORT] Hudi Flink S3 Java Example

2021-09-29 Thread GitBox
danny0405 commented on issue #3728: URL: https://github.com/apache/hudi/issues/3728#issuecomment-929754237 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe,

[jira] [Created] (HUDI-2502) Refactor index in hudi-client module

2021-09-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-2502: --- Summary: Refactor index in hudi-client module Key: HUDI-2502 URL: https://issues.apache.org/jira/browse/HUDI-2502 Project: Apache Hudi Issue Type: Task

[jira] [Created] (HUDI-2501) Refactor compaction actions in hudi-client module

2021-09-29 Thread Ethan Guo (Jira)
Ethan Guo created HUDI-2501: --- Summary: Refactor compaction actions in hudi-client module Key: HUDI-2501 URL: https://issues.apache.org/jira/browse/HUDI-2501 Project: Apache Hudi Issue Type: Task

[jira] [Updated] (HUDI-2497) Refactor clean and restore actions in hudi-client module

2021-09-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo updated HUDI-2497: Summary: Refactor clean and restore actions in hudi-client module (was: Refactor clean, restore, and

[jira] [Closed] (HUDI-2433) Refactor table.action.rollback package in hudi-client module

2021-09-29 Thread Ethan Guo (Jira)
[ https://issues.apache.org/jira/browse/HUDI-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ethan Guo closed HUDI-2433. --- Resolution: Done > Refactor table.action.rollback package in hudi-client module >

[GitHub] [hudi] hudi-bot edited a comment on pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
hudi-bot edited a comment on pull request #3734: URL: https://github.com/apache/hudi/pull/3734#issuecomment-930618920 ## CI report: * 3cec644131a4fda77510a97d548d3633b4731e78 Azure:

[GitHub] [hudi] hudi-bot commented on pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
hudi-bot commented on pull request #3734: URL: https://github.com/apache/hudi/pull/3734#issuecomment-930618920 ## CI report: * 3cec644131a4fda77510a97d548d3633b4731e78 UNKNOWN Bot commands @hudi-bot supports the following commands: - `@hudi-bot run travis`

[GitHub] [hudi] yihua opened a new pull request #3734: [HUDI-2497] Refactor clean and restore actions in hudi-client module

2021-09-29 Thread GitBox
yihua opened a new pull request #3734: URL: https://github.com/apache/hudi/pull/3734 ## What is the purpose of the pull request This PR refactors the clean and restore actions in hudi-client module to extract common logic into hudi-client-common module and reduce LoC. ##

[GitHub] [hudi] yihua commented on pull request #3727: [HUDI-2497] Refactor clean, restore, and compaction actions in hudi-client module

2021-09-29 Thread GitBox
yihua commented on pull request #3727: URL: https://github.com/apache/hudi/pull/3727#issuecomment-930614516 @nsivabalan I'll address your comments in the PR of refactoring compaction. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [hudi] yihua closed pull request #3727: [HUDI-2497] Refactor clean, restore, and compaction actions in hudi-client module

2021-09-29 Thread GitBox
yihua closed pull request #3727: URL: https://github.com/apache/hudi/pull/3727 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] yihua commented on pull request #3727: [HUDI-2497] Refactor clean, restore, and compaction actions in hudi-client module

2021-09-29 Thread GitBox
yihua commented on pull request #3727: URL: https://github.com/apache/hudi/pull/3727#issuecomment-930614238 I'm going to split this PR into two given the complexity of the compaction. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [hudi] nsivabalan commented on a change in pull request #3727: [HUDI-2497] Refactor clean, restore, and compaction actions in hudi-client module

2021-09-29 Thread GitBox
nsivabalan commented on a change in pull request #3727: URL: https://github.com/apache/hudi/pull/3727#discussion_r718849052 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/CompactHelpers.java ## @@ -0,0 +1,178 @@ +/* + *

[GitHub] [hudi] davehagman commented on issue #3733: [SUPPORT] Periodic and sustained latency spikes during index lookup

2021-09-29 Thread GitBox
davehagman commented on issue #3733: URL: https://github.com/apache/hudi/issues/3733#issuecomment-930488974 I would also like to show some graphs comparing the baseline and what they look like during these latency spikes. This is a graph of the number of files Created vs. Updated

[GitHub] [hudi] davehagman opened a new issue #3733: [SUPPORT] Periodic and sustained latency spikes during index lookup

2021-09-29 Thread GitBox
davehagman opened a new issue #3733: URL: https://github.com/apache/hudi/issues/3733 **Describe the problem you faced** We're running Hudi 0.9 in production and we are seeing intermittent issues where the latency introduced by index operations spiked to 8-10x longer than usual

[GitHub] [hudi] xushiyan edited a comment on issue #3670: [SUPPORT] SQL stmt managed table, not update/delete with datasource API

2021-09-29 Thread GitBox
xushiyan edited a comment on issue #3670: URL: https://github.com/apache/hudi/issues/3670#issuecomment-930335250 I can replicate the issue in a local env: even with `ComplexKeyGenerator` set, datasource delete is not working, while delete via Spark SQL worked. Filing a JIRA to have detailed

[GitHub] [hudi] xushiyan commented on issue #3670: [SUPPORT] SQL stmt managed table, not update/delete with datasource API

2021-09-29 Thread GitBox
xushiyan commented on issue #3670: URL: https://github.com/apache/hudi/issues/3670#issuecomment-930335250 I can replicate the issue in a local env: even with `ComplexKeyGenerator` set, datasource delete is not working, while delete via Spark SQL worked. Filing a JIRA to have detailed scripts

[jira] [Created] (HUDI-2500) Spark datasource delete not working on Spark SQL created table

2021-09-29 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-2500: Summary: Spark datasource delete not working on Spark SQL created table Key: HUDI-2500 URL: https://issues.apache.org/jira/browse/HUDI-2500 Project: Apache Hudi

[hudi] branch master updated: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource (#3413)

2021-09-29 Thread xushiyan
This is an automated email from the ASF dual-hosted git repository. xushiyan pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/master by this push: new dd1bd62 [HUDI-2277] HoodieDeltaStreamer

[GitHub] [hudi] xushiyan merged pull request #3413: [HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource

2021-09-29 Thread GitBox
xushiyan merged pull request #3413: URL: https://github.com/apache/hudi/pull/3413 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [hudi] yanghua commented on pull request #3671: [HUDI-2418] add HiveSchemaProvider

2021-09-29 Thread GitBox
yanghua commented on pull request #3671: URL: https://github.com/apache/hudi/pull/3671#issuecomment-930287792 > > @fengjian428 Check CI again? > > what you mean? the checks below all passed I mean the Azure CI, please check

[GitHub] [hudi] nsivabalan edited a comment on issue #3499: [SUPPORT] Inline Clustering fails with Hudi

2021-09-29 Thread GitBox
nsivabalan edited a comment on issue #3499: URL: https://github.com/apache/hudi/issues/3499#issuecomment-930269121 @codejoyan: what's the key generator you are using? Can you confirm you are setting those params (key gen, record key, partition path) while setting these clustering configs

[GitHub] [hudi] nsivabalan commented on issue #3499: [SUPPORT] Inline Clustering fails with Hudi

2021-09-29 Thread GitBox
nsivabalan commented on issue #3499: URL: https://github.com/apache/hudi/issues/3499#issuecomment-930269121 @codejoyan: what's the key generator you are using? Can you confirm you are setting those params (key gen, record key, partition path) while setting these clustering configs as
