[GitHub] [hudi] codecov-io edited a comment on pull request #2296: [HUDI-1425] Performance loss with the additional hoodieRecords.isEmpt…

2021-02-01 Thread GitBox
codecov-io edited a comment on pull request #2296: URL: https://github.com/apache/hudi/pull/2296#issuecomment-738779135 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2296?src=pr&el=h1) Report > Merging [#2296](https://codecov.io/gh/apache/hudi/pull/2296?src=pr&el=desc) (673e8e7) in

[GitHub] [hudi] codecov-io edited a comment on pull request #2296: [HUDI-1425] Performance loss with the additional hoodieRecords.isEmpt…

2021-02-01 Thread GitBox
codecov-io edited a comment on pull request #2296: URL: https://github.com/apache/hudi/pull/2296#issuecomment-738779135 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

[GitHub] [hudi] yanghua commented on a change in pull request #2506: [HUDI-1557] Make Flink write pipeline write task scalable

2021-02-01 Thread GitBox
yanghua commented on a change in pull request #2506: URL: https://github.com/apache/hudi/pull/2506#discussion_r568370892 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java ## @@ -249,7 +250,17 @@ public String getLastCo

[GitHub] [hudi] jtmzheng commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

2021-02-01 Thread GitBox
jtmzheng commented on issue #2470: URL: https://github.com/apache/hudi/issues/2470#issuecomment-771421761 @n3nash I have not had a chance to look at 0.7.0 migration yet, what EMR versions is 0.7.0 compatible with? If my dataset is on 0.6.0 already do I just need to update the Hudi ja

[GitHub] [hudi] n3nash commented on issue #2513: [SUPPORT]Hive-Cli set hive.input.format=org.apache.hudi.hadoop.HoodieParquetInputFormat and query error

2021-02-01 Thread GitBox
n3nash commented on issue #2513: URL: https://github.com/apache/hudi/issues/2513#issuecomment-771416134 @GintokiYs You should not set the hive input format that way. You can set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat. As long as your table is registered as a

[GitHub] [hudi] n3nash commented on issue #2509: [SUPPORT]Hudi saves TimestampType as bigInt

2021-02-01 Thread GitBox
n3nash commented on issue #2509: URL: https://github.com/apache/hudi/issues/2509#issuecomment-771414366 @satishkotha Could you take a look at this one ?

[GitHub] [hudi] n3nash commented on issue #2508: [SUPPORT] Error upserting bucketType UPDATE for partition

2021-02-01 Thread GitBox
n3nash commented on issue #2508: URL: https://github.com/apache/hudi/issues/2508#issuecomment-771413944 @nsivabalan Do you think this may have something to do with the Encoders needed in the row write path ?

[GitHub] [hudi] n3nash commented on issue #2507: [SUPPORT] Error when Hudi metadata enabled for non partitioned tables

2021-02-01 Thread GitBox
n3nash commented on issue #2507: URL: https://github.com/apache/hudi/issues/2507#issuecomment-771412820 @prashantwason Can you take a look at this ?

[GitHub] [hudi] n3nash commented on issue #2490: spark read hudi data from hive

2021-02-01 Thread GitBox
n3nash commented on issue #2490: URL: https://github.com/apache/hudi/issues/2490#issuecomment-771411738 @Ishg Any update ?

[GitHub] [hudi] n3nash commented on issue #2489: [SUPPORT]

2021-02-01 Thread GitBox
n3nash commented on issue #2489: URL: https://github.com/apache/hudi/issues/2489#issuecomment-771411421 @Ishg Do you have any update ?

[GitHub] [hudi] n3nash commented on issue #2482: [SUPPORT]

2021-02-01 Thread GitBox
n3nash commented on issue #2482: URL: https://github.com/apache/hudi/issues/2482#issuecomment-771410764 @duanyongvictory Were you able to use the latest release 0.7.0 and see if it resolves your issue ?

[GitHub] [hudi] n3nash commented on issue #2470: [SUPPORT] Heavy skew in ListingBasedRollbackHelper

2021-02-01 Thread GitBox
n3nash commented on issue #2470: URL: https://github.com/apache/hudi/issues/2470#issuecomment-771409047 @jtmzheng Does using 0.7.0 and `hoodie.metadata.enable=true` solve the issue ?

[GitHub] [hudi] n3nash commented on issue #2463: [SUPPORT] Tuning Hudi Upsert Job

2021-02-01 Thread GitBox
n3nash commented on issue #2463: URL: https://github.com/apache/hudi/issues/2463#issuecomment-771408348 @rubenssoto Did increasing the num executors help ?

[GitHub] [hudi] n3nash commented on issue #2461: All records are present in athena query result on glue crawled Hudi tables

2021-02-01 Thread GitBox
n3nash commented on issue #2461: URL: https://github.com/apache/hudi/issues/2461#issuecomment-771407719 @vrtrepp @noobarcitect Are you able to use the hive-sync tool to resolve your issue ?

[GitHub] [hudi] n3nash commented on issue #2448: [SUPPORT] deltacommit for client 172.16.116.102 already exists

2021-02-01 Thread GitBox
n3nash commented on issue #2448: URL: https://github.com/apache/hudi/issues/2448#issuecomment-771406877 @peng-xin Are you able to proceed with `hoodie.compact.inline -> true` and `hoodie.auto.commit -> false` ?

[GitHub] [hudi] n3nash commented on issue #2439: [SUPPORT] Unable to sync with external hive metastore via metastore uris in the thrift protocol

2021-02-01 Thread GitBox
n3nash commented on issue #2439: URL: https://github.com/apache/hudi/issues/2439#issuecomment-771402960 @rakeshramakrishnan Could you try the above patch from @Trevor-zhang and see if that fixes your issue ?

[GitHub] [hudi] n3nash commented on issue #2409: [SUPPORT] Spark structured Streaming writes to Hudi and synchronizes Hive to create only read-optimized tables without creating real-time tables

2021-02-01 Thread GitBox
n3nash commented on issue #2409: URL: https://github.com/apache/hudi/issues/2409#issuecomment-771397330 @wosow Were you able to resolve your issue ?

[GitHub] [hudi] n3nash commented on issue #2437: deltastreamer fails due to "Error upserting bucketType UPDATE for partition" and ArrayIndexOutOfBoundsException

2021-02-01 Thread GitBox
n3nash commented on issue #2437: URL: https://github.com/apache/hudi/issues/2437#issuecomment-771401741 @jiangok2006 Were you able to run with the setting hoodie.avro.schema.validate=true ? My feeling is this is related schema and decoding of records using the provided schema ---

[GitHub] [hudi] n3nash commented on issue #2406: [SUPPORT] Deltastreamer - Property hoodie.datasource.write.partitionpath.field not found

2021-02-01 Thread GitBox
n3nash commented on issue #2406: URL: https://github.com/apache/hudi/issues/2406#issuecomment-771394795 @SureshK-T2S Is there anything else related to this issue that needs to be discussed further ?

[GitHub] [hudi] codecov-io edited a comment on pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
codecov-io edited a comment on pull request #2496: URL: https://github.com/apache/hudi/pull/2496#issuecomment-768170324 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2496?src=pr&el=h1) Report > Merging [#2496](https://codecov.io/gh/apache/hudi/pull/2496?src=pr&el=desc) (1d4120a) in

[GitHub] [hudi] n3nash commented on issue #2515: [SUPPORT] ERROR HoodieTimelineArchiveLog: Failed to archive commits

2021-02-01 Thread GitBox
n3nash commented on issue #2515: URL: https://github.com/apache/hudi/issues/2515#issuecomment-771296756 [add_default.txt](https://github.com/apache/hudi/files/5908168/add_default.txt) Can you apply this diff, build your code again and try ? -

[GitHub] [hudi] n3nash commented on issue #2515: [SUPPORT] ERROR HoodieTimelineArchiveLog: Failed to archive commits

2021-02-01 Thread GitBox
n3nash commented on issue #2515: URL: https://github.com/apache/hudi/issues/2515#issuecomment-771293068 @rubenssoto It looks something like what is described here -> https://stackoverflow.com/questions/45662469/storing-null-values-in-avro-files. I'm trying to track down any change in the s

[jira] [Commented] (HUDI-1566) Typo in account request caused wrong name in Apache id

2021-02-01 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276787#comment-17276787 ] wangxianghu commented on HUDI-1566: --- [~clr] Yes ,I found it later can you help ping som

[jira] [Commented] (HUDI-1566) Typo in account request caused wrong name in Apache id

2021-02-01 Thread Craig L Russell (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276785#comment-17276785 ] Craig L Russell commented on HUDI-1566: --- The linked INFRA issue is where you can fin

[jira] [Commented] (HUDI-1566) Typo in account request caused wrong name in Apache id

2021-02-01 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276777#comment-17276777 ] wangxianghu commented on HUDI-1566: --- Hi [~clr],is there any progress on correcting the i

[GitHub] [hudi] codecov-io commented on pull request #2516: [MINOR] Fixing the default value for source ordering field in payload config

2021-02-01 Thread GitBox
codecov-io commented on pull request #2516: URL: https://github.com/apache/hudi/pull/2516#issuecomment-771264229 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2516?src=pr&el=h1) Report > Merging [#2516](https://codecov.io/gh/apache/hudi/pull/2516?src=pr&el=desc) (7c1f105) into [ma

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568234109 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java ## @@ -192,27 +261,74 @@ public FSDataOutputStream crea

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568229644 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java ## @@ -192,27 +261,74 @@ public FSDataOutputStream crea

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568228504 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java ## @@ -118,12 +156,31 @@ private static Registry getMet

[jira] [Closed] (HUDI-1555) clustering bugs from large scale testing

2021-02-01 Thread satish (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] satish closed HUDI-1555. > clustering bugs from large scale testing > > > Key: HUDI-1555

[jira] [Resolved] (HUDI-1555) clustering bugs from large scale testing

2021-02-01 Thread satish (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] satish resolved HUDI-1555. -- Resolution: Fixed > clustering bugs from large scale testing > > >

[GitHub] [hudi] vinothchandar commented on pull request #2359: [HUDI-1486] Remove inflight rollback in hoodie writer

2021-02-01 Thread GitBox
vinothchandar commented on pull request #2359: URL: https://github.com/apache/hudi/pull/2359#issuecomment-771176720 @n3nash rebase, repush and get CI to be happy?

[GitHub] [hudi] vinothchandar commented on pull request #2374: [HUDI-845] Added locking capability to allow multiple writers

2021-02-01 Thread GitBox
vinothchandar commented on pull request #2374: URL: https://github.com/apache/hudi/pull/2374#issuecomment-771176420 @n3nash can you please rebase and repush

[GitHub] [hudi] vinothchandar commented on a change in pull request #2359: [HUDI-1486] Remove inflight rollback in hoodie writer

2021-02-01 Thread GitBox
vinothchandar commented on a change in pull request #2359: URL: https://github.com/apache/hudi/pull/2359#discussion_r568157021 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/heartbeat/HoodieHeartbeatClient.java ## @@ -0,0 +1,265 @@ +/* + * Li

[GitHub] [hudi] vinothchandar commented on pull request #2374: [HUDI-845] Added locking capability to allow multiple writers

2021-02-01 Thread GitBox
vinothchandar commented on pull request #2374: URL: https://github.com/apache/hudi/pull/2374#issuecomment-771172541 @n3nash Beginning the final review of this. Should we still land #2359 first?

[GitHub] [hudi] nsivabalan opened a new pull request #2516: [MINOR] Fixing the default value for source ordering field for payload config

2021-02-01 Thread GitBox
nsivabalan opened a new pull request #2516: URL: https://github.com/apache/hudi/pull/2516 ## What is the purpose of the pull request Fixing the default value for source ordering field for payload config ## Brief change log *(for example:)* - Fixing the default value

[GitHub] [hudi] rubenssoto commented on issue #2515: [SUPPORT] ERROR HoodieTimelineArchiveLog: Failed to archive commits

2021-02-01 Thread GitBox
rubenssoto commented on issue #2515: URL: https://github.com/apache/hudi/issues/2515#issuecomment-771122751 Any idea how to know which commit will be archived?

[GitHub] [hudi] rubenssoto commented on issue #2515: [SUPPORT] ERROR HoodieTimelineArchiveLog: Failed to archive commits

2021-02-01 Thread GitBox
rubenssoto commented on issue #2515: URL: https://github.com/apache/hudi/issues/2515#issuecomment-771095127 [hudi_files.txt](https://github.com/apache/hudi/files/5906203/hudi_files.txt) list of my commits.

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568073678 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java ## @@ -419,13 +417,8 @@ public static boolean isLogFile(Path logPath) {

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568069335 ## File path: hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieEngineContext.java ## @@ -54,6 +54,10 @@ public TaskContextSupplier getTas

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568069059 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/TimedSizeAwareOutputStream.java ## @@ -20,19 +20,21 @@ import org.apache.hudi.exc

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568068188 ## File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieLogFileReader.java ## @@ -79,15 +79,12 @@ public HoodieLogFileReader(Fi

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568067730 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/TimedFSInputStream.java ## @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software

[GitHub] [hudi] prashantwason commented on pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on pull request #2496: URL: https://github.com/apache/hudi/pull/2496#issuecomment-771082643 > Do we need to enable metrics FS(time aware, size aware) as well by default whenever buffering is enabled There are two parts to metrics in HUDI: 1. Metrics in-memo

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568062733 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -1193,6 +1218,21 @@ public Builder with

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568061380 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java ## @@ -931,6 +941,21 @@ public int getMetadat

[GitHub] [hudi] prashantwason commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
prashantwason commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r568060924 ## File path: hudi-common/src/main/java/org/apache/hudi/common/fs/HoodieWrapperFileSystem.java ## @@ -79,22 +84,51 @@ public static void setMetricsReg

[GitHub] [hudi] vinothchandar commented on pull request #2458: [MINOR] Rename FileSystemViewHandler to Router and corrected the class comment

2021-02-01 Thread GitBox
vinothchandar commented on pull request #2458: URL: https://github.com/apache/hudi/pull/2458#issuecomment-771044188 >Currently, we can keep it. But when we provide more services in the future, maybe it will be a bit limited? Agree. We can rename as we add more and more. To clarify, I

[jira] [Commented] (HUDI-1550) Incorrect query result for MOR table when merge base data with log

2021-02-01 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276535#comment-17276535 ] sivabalan narayanan commented on HUDI-1550: --- Sure, I will look into this sometim

[GitHub] [hudi] n3nash commented on issue #2515: [SUPPORT] ERROR HoodieTimelineArchiveLog: Failed to archive commits

2021-02-01 Thread GitBox
n3nash commented on issue #2515: URL: https://github.com/apache/hudi/issues/2515#issuecomment-771031947 @rubenssoto Do you know which commits were being archived ? If yes, can you paste the contents of those commits ?

[GitHub] [hudi] n3nash commented on a change in pull request #2514: [HUDI-1571] Adding commit_show_records_info to display record sizes for commit

2021-02-01 Thread GitBox
n3nash commented on a change in pull request #2514: URL: https://github.com/apache/hudi/pull/2514#discussion_r568012140 ## File path: hudi-cli/src/main/java/org/apache/hudi/cli/commands/CommitsCommand.java ## @@ -314,6 +314,45 @@ public String showCommitPartitions( li

[GitHub] [hudi] stackfun commented on issue #2367: [SUPPORT] Seek error when querying MOR Tables in GCP

2021-02-01 Thread GitBox
stackfun commented on issue #2367: URL: https://github.com/apache/hudi/issues/2367#issuecomment-771012749 Hope to have time to test this specific issue today. I'll be happy to help test release candidates on GCP. Are there automated tests that run on AWS infrastructure? The integrati

[GitHub] [hudi] nsivabalan commented on issue #2367: [SUPPORT] Seek error when querying MOR Tables in GCP

2021-02-01 Thread GitBox
nsivabalan commented on issue #2367: URL: https://github.com/apache/hudi/issues/2367#issuecomment-770999681 I am not very sure if we do any testing w/ GCP. @bvaradar @n3nash @vinothchandar for reference. But if you can help us test GCP before every release, that would really be awesome.

[GitHub] [hudi] codecov-io edited a comment on pull request #2514: [HUDI-1571] Adding commit_show_records_info to display record sizes for commit

2021-02-01 Thread GitBox
codecov-io edited a comment on pull request #2514: URL: https://github.com/apache/hudi/pull/2514#issuecomment-770955295 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2514?src=pr&el=h1) Report > Merging [#2514](https://codecov.io/gh/apache/hudi/pull/2514?src=pr&el=desc) (5492b60) in

[GitHub] [hudi] codecov-io commented on pull request #2514: [HUDI-1571] Adding commit_show_records_info to display record sizes for commit

2021-02-01 Thread GitBox
codecov-io commented on pull request #2514: URL: https://github.com/apache/hudi/pull/2514#issuecomment-770955295 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2514?src=pr&el=h1) Report > Merging [#2514](https://codecov.io/gh/apache/hudi/pull/2514?src=pr&el=desc) (5492b60) into [ma

[jira] [Resolved] (HUDI-1550) Incorrect query result for MOR table when merge base data with log

2021-02-01 Thread Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Li resolved HUDI-1550. --- Resolution: Resolved > Incorrect query result for MOR table when merge base data with log > --

[jira] [Closed] (HUDI-1550) Incorrect query result for MOR table when merge base data with log

2021-02-01 Thread Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Li closed HUDI-1550. - > Incorrect query result for MOR table when merge base data with log > ---

[GitHub] [hudi] garyli1019 merged pull request #2497: [HUDI-1550] Incorrect query result for MOR table when merge base data…

2021-02-01 Thread GitBox
garyli1019 merged pull request #2497: URL: https://github.com/apache/hudi/pull/2497

[hudi] branch master updated (f159c0c -> 0d8a4d0)

2021-02-01 Thread garyli
This is an automated email from the ASF dual-hosted git repository. garyli pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from f159c0c [HUDI-1519] Improve minKey/maxKey computation in HoodieHFileWriter (#2427) add 0d8a4d0 [HUDI-1550] Hono

[hudi] branch master updated (5d053b4 -> f159c0c)

2021-02-01 Thread sivabalan
This is an automated email from the ASF dual-hosted git repository. sivabalan pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hudi.git. from 5d053b4 [MINOR] Quickstart.generateUpdates method add check (#2505) add f159c0c [HUDI-1519] Improve minKey/ma

[GitHub] [hudi] nsivabalan merged pull request #2427: [HUDI-1519] Improve minKey/maxKey compute in HoodieHFileWriter

2021-02-01 Thread GitBox
nsivabalan merged pull request #2427: URL: https://github.com/apache/hudi/pull/2427

[GitHub] [hudi] nsivabalan commented on pull request #2497: [HUDI-1550] Incorrect query result for MOR table when merge base data…

2021-02-01 Thread GitBox
nsivabalan commented on pull request #2497: URL: https://github.com/apache/hudi/pull/2497#issuecomment-770832362 @garyli1019 : I am good w/ the PR. Can you land it. Please fill in the right commit message while you squash & merge. -

[GitHub] [hudi] nsivabalan commented on pull request #2497: [HUDI-1550] Incorrect query result for MOR table when merge base data…

2021-02-01 Thread GitBox
nsivabalan commented on pull request #2497: URL: https://github.com/apache/hudi/pull/2497#issuecomment-770830774 Yes, sounds good @garyli1019 . LGTM. Feel free to land it if you are good.

[GitHub] [hudi] nsivabalan commented on a change in pull request #2443: [HUDI-1269] Make whether the failure of connect hive affects hudi ingest process configurable

2021-02-01 Thread GitBox
nsivabalan commented on a change in pull request #2443: URL: https://github.com/apache/hudi/pull/2443#discussion_r567795367 ## File path: hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/HiveSyncTool.java ## @@ -57,13 +57,23 @@ public static final String SUFFIX_R

[GitHub] [hudi] rubenssoto commented on issue #2515: [SUPPORT] ERROR HoodieTimelineArchiveLog: Failed to archive commits

2021-02-01 Thread GitBox
rubenssoto commented on issue #2515: URL: https://github.com/apache/hudi/issues/2515#issuecomment-770818099 I have another pipeline with a partitioned table and the problem didn't happen, is there any difference?

[GitHub] [hudi] danny0405 commented on a change in pull request #2506: [HUDI-1557] Make Flink write pipeline write task scalable

2021-02-01 Thread GitBox
danny0405 commented on a change in pull request #2506: URL: https://github.com/apache/hudi/pull/2506#discussion_r567776880 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/HoodieFlinkWriteClient.java ## @@ -249,7 +250,17 @@ public String getLast

[GitHub] [hudi] nsivabalan edited a comment on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

2021-02-01 Thread GitBox
nsivabalan edited a comment on pull request #2486: URL: https://github.com/apache/hudi/pull/2486#issuecomment-770809175 Few suggestions: - Can you rebase and also fix the CI issue please. We will review once these are done and the patch is ready. - Also, suggest to create a jira and

[GitHub] [hudi] nsivabalan commented on pull request #2486: Filtering abnormal data which the recordKeyField or precombineField is null in avro format

2021-02-01 Thread GitBox
nsivabalan commented on pull request #2486: URL: https://github.com/apache/hudi/pull/2486#issuecomment-770809175 Few suggestions: - Can you rebase and also fix the CI issue please. We will review once these are done and the patch is ready. - Also, suggest to create a jira and link i

[GitHub] [hudi] danny0405 commented on a change in pull request #2506: [HUDI-1557] Make Flink write pipeline write task scalable

2021-02-01 Thread GitBox
danny0405 commented on a change in pull request #2506: URL: https://github.com/apache/hudi/pull/2506#discussion_r567774280 ## File path: hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java ## @@ -250,4 +259,37 @@ public static void checkRequiredProperties(TypedPrope

[GitHub] [hudi] rubenssoto opened a new issue #2515: [SUPPORT] ERROR HoodieTimelineArchiveLog: Failed to archive commits

2021-02-01 Thread GitBox
rubenssoto opened a new issue #2515: URL: https://github.com/apache/hudi/issues/2515 Hello, Hudi version: 0.7 Emr version: 6.2 Spark version: 3.0.1 Hudi Options: Map(hoodie.datasource.hive_sync.database -> raw_courier_api_hudi, hoodie.parquet.small.file.limit -> 67

[GitHub] [hudi] danny0405 commented on a change in pull request #2506: [HUDI-1557] Make Flink write pipeline write task scalable

2021-02-01 Thread GitBox
danny0405 commented on a change in pull request #2506: URL: https://github.com/apache/hudi/pull/2506#discussion_r567769737 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/index/state/FlinkInMemoryStateIndex.java ## @@ -62,47 +61,14 @@ public FlinkInMe

[GitHub] [hudi] nsivabalan opened a new pull request #2514: [HUDI-1571] Adding commit_show_records_info to display record sizes for commit

2021-02-01 Thread GitBox
nsivabalan opened a new pull request #2514: URL: https://github.com/apache/hudi/pull/2514 ## What is the purpose of the pull request *Adding commit_show_records_info to display record sizes for commit* ## Brief change log - *Adding commit_show_records_info to display r

[jira] [Updated] (HUDI-1571) Expose record size info for commits w/ hudi-cli

2021-02-01 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1571: - Labels: pull-request-available (was: ) > Expose record size info for commits w/ hudi-cli > --

[jira] [Created] (HUDI-1571) Expose record size info for commits w/ hudi-cli

2021-02-01 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1571: - Summary: Expose record size info for commits w/ hudi-cli Key: HUDI-1571 URL: https://issues.apache.org/jira/browse/HUDI-1571 Project: Apache Hudi I

[jira] [Updated] (HUDI-1571) Expose record size info for commits w/ hudi-cli

2021-02-01 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-1571: -- Fix Version/s: 0.8.0 > Expose record size info for commits w/ hudi-cli > ---

[GitHub] [hudi] wangxianghu commented on a change in pull request #2506: [HUDI-1557] Make Flink write pipeline write task scalable

2021-02-01 Thread GitBox
wangxianghu commented on a change in pull request #2506: URL: https://github.com/apache/hudi/pull/2506#discussion_r567765521 ## File path: hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/index/state/FlinkInMemoryStateIndex.java ## @@ -62,47 +61,14 @@ public FlinkIn

[GitHub] [hudi] nsivabalan commented on a change in pull request #2400: Some fixes and enhancements to test suite framework

2021-02-01 Thread GitBox
nsivabalan commented on a change in pull request #2400: URL: https://github.com/apache/hudi/pull/2400#discussion_r567760070 ## File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/dag/nodes/ValidateAsyncOperations.java ## @@ -0,0 +1,119 @@ +/* + * Licensed

[GitHub] [hudi] zafer-sahin edited a comment on issue #2498: [SUPPORT] Hudi MERGE_ON_READ load to dataframe fails for the versions [0.6.0],[0.7.0] and runs for [0.5.3]

2021-02-01 Thread GitBox
zafer-sahin edited a comment on issue #2498: URL: https://github.com/apache/hudi/issues/2498#issuecomment-770769437 @nsivabalan I was able to execute all steps successfully in the [quick start](https://hudi.apache.org/docs/quick-start-guide.html) and I could reproduce the issue by changing

[GitHub] [hudi] zafer-sahin commented on issue #2498: [SUPPORT] Hudi MERGE_ON_READ load to dataframe fails for the versions [0.6.0],[0.7.0] and runs for [0.5.3]

2021-02-01 Thread GitBox
zafer-sahin commented on issue #2498: URL: https://github.com/apache/hudi/issues/2498#issuecomment-770769437 @nsivabalan I was able to execute all steps successfully in the [quick start](https://hudi.apache.org/docs/quick-start-guide.html) and I could reproduce the issue by changing the st

[GitHub] [hudi] yui2010 commented on a change in pull request #2427: [HUDI-1519] Improve minKey/maxKey compute in HoodieHFileWriter

2021-02-01 Thread GitBox
yui2010 commented on a change in pull request #2427: URL: https://github.com/apache/hudi/pull/2427#discussion_r567726161 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileWriter.java ## @@ -121,17 +121,10 @@ public void writeAvro(

[GitHub] [hudi] yanghua commented on a change in pull request #2506: [HUDI-1557] Make Flink write pipeline write task scalable

2021-02-01 Thread GitBox
yanghua commented on a change in pull request #2506: URL: https://github.com/apache/hudi/pull/2506#discussion_r567513264 ## File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/InsertBucket.java ## @@ -26,9 +26,9 @@ */ public class Ins

[GitHub] [hudi] danny0405 commented on a change in pull request #2496: [HUDI-1554] Introduced buffering for streams in HUDI.

2021-02-01 Thread GitBox
danny0405 commented on a change in pull request #2496: URL: https://github.com/apache/hudi/pull/2496#discussion_r567627556 ## File path: hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieEngineContext.java ## @@ -54,6 +54,10 @@ public TaskContextSupplier getTaskCon

[GitHub] [hudi] codecov-io edited a comment on pull request #2510: [HUDI-1534]HiveSyncTool-It is not necessary to use JDBC and MetaStoreClient at the same time

2021-02-01 Thread GitBox
codecov-io edited a comment on pull request #2510: URL: https://github.com/apache/hudi/pull/2510#issuecomment-770686131 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2510?src=pr&el=h1) Report > Merging [#2510](https://codecov.io/gh/apache/hudi/pull/2510?src=pr&el=desc) (3bd625d) in

[GitHub] [hudi] codecov-io commented on pull request #2510: [HUDI-1534]HiveSyncTool-It is not necessary to use JDBC and MetaStoreClient at the same time

2021-02-01 Thread GitBox
codecov-io commented on pull request #2510: URL: https://github.com/apache/hudi/pull/2510#issuecomment-770686131 # [Codecov](https://codecov.io/gh/apache/hudi/pull/2510?src=pr&el=h1) Report > Merging [#2510](https://codecov.io/gh/apache/hudi/pull/2510?src=pr&el=desc) (3bd625d) into [ma