[GitHub] [incubator-hudi] hddong commented on pull request #1622: [HUDI-888] fix NullPointerException

2020-05-15 Thread GitBox
hddong commented on pull request #1622: URL: https://github.com/apache/incubator-hudi/pull/1622#issuecomment-629598373 @vinothchandar as @leesf said, it occurs occasionally. I think it may be due to miniDFS, which has services that use random ports (like the NameNode RPC) when starting.

[jira] [Resolved] (HUDI-902) Avoid exception for getting SchemaProvider when no new input data

2020-05-15 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu resolved HUDI-902. - Resolution: Done > Avoid exception for getting SchemaProvider when no new input data >

[GitHub] [incubator-hudi] xushiyan commented on pull request #1623: [MINOR] Increase heap space for surefire

2020-05-15 Thread GitBox
xushiyan commented on pull request #1623: URL: https://github.com/apache/incubator-hudi/pull/1623#issuecomment-629587320 @bvaradar I found this doc saying the Linux system has 7.5 GB of memory: https://docs.travis-ci.com/user/reference/overview/#virtualisation-environment-vs-operating-system

[incubator-hudi] branch asf-site updated: Travis CI build asf-site

2020-05-15 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 4b419ad Travis CI build asf-site 4b

[GitHub] [incubator-hudi] lamber-ken merged pull request #1635: [MINOR] Remove logos on home page

2020-05-15 Thread GitBox
lamber-ken merged pull request #1635: URL: https://github.com/apache/incubator-hudi/pull/1635 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[incubator-hudi] branch asf-site updated: [MINOR] Remove logos on home page (#1635)

2020-05-15 Thread lamberken
This is an automated email from the ASF dual-hosted git repository. lamberken pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 05a8ddb [MINOR] Remove logos on

[incubator-hudi] branch master updated: [HUDI-902] Avoid exception when getSchemaProvider (#1584)

2020-05-15 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new 2ada2ef [HUDI-902] Avoid exception whe

[GitHub] [incubator-hudi] bvaradar merged pull request #1584: [HUDI-902] Avoid exception when getSchemaProvider

2020-05-15 Thread GitBox
bvaradar merged pull request #1584: URL: https://github.com/apache/incubator-hudi/pull/1584 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1635: [MINOR] Remove logos on home page

2020-05-15 Thread GitBox
vinothchandar commented on pull request #1635: URL: https://github.com/apache/incubator-hudi/pull/1635#issuecomment-629586033 Please land.. This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [incubator-hudi] lamber-ken opened a new pull request #1635: [MINOR] Remove logos on home page

2020-05-15 Thread GitBox
lamber-ken opened a new pull request #1635: URL: https://github.com/apache/incubator-hudi/pull/1635 ## What is the purpose of the pull request - Remove logos on home page - Move logos to powered by page. **Sync** https://lamber-ken.github.io https://lamber-ken.github.i

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #279

2020-05-15 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.38 KB...] /home/jenkins/tools/maven/apache-maven-3.5.4/conf: logging settings.xml toolchains.xml /home/jenkins/tools/maven/apache-maven-3.5.

[GitHub] [incubator-hudi] xushiyan commented on pull request #1584: [HUDI-902] Avoid exception when getSchemaProvider

2020-05-15 Thread GitBox
xushiyan commented on pull request #1584: URL: https://github.com/apache/incubator-hudi/pull/1584#issuecomment-629576325 @bvaradar The CI passed. It's ready for review now. Thanks. This is an automated message from the Apache

[jira] [Updated] (HUDI-902) Avoid exception for getting SchemaProvider when no new input data

2020-05-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-902: Labels: pull-request-available (was: ) > Avoid exception for getting SchemaProvider when no new inpu

[jira] [Assigned] (HUDI-895) Reduce listing .hoodie folder when using timeline server

2020-05-15 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan reassigned HUDI-895: --- Assignee: Balaji Varadarajan > Reduce listing .hoodie folder when using timeline serve

[jira] [Updated] (HUDI-858) Allow multiple operations to be executed within a single commit

2020-05-15 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-858: Fix Version/s: (was: 0.5.2) (was: 0.5.1) 0.5.3

[jira] [Updated] (HUDI-846) Turn on incremental cleaning bu default in 0.6.0

2020-05-15 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-846: Fix Version/s: 0.5.3 > Turn on incremental cleaning bu default in 0.6.0 > ---

[jira] [Updated] (HUDI-848) Turn on embedded timeline server by default for all writes

2020-05-15 Thread Balaji Varadarajan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balaji Varadarajan updated HUDI-848: Fix Version/s: 0.5.3 > Turn on embedded timeline server by default for all writes > -

[GitHub] [incubator-hudi] bvaradar commented on pull request #1634: [WIP] [HUDI-846][HUDI-848] Enable Incremental cleaning and embedded timeline-server by default

2020-05-15 Thread GitBox
bvaradar commented on pull request #1634: URL: https://github.com/apache/incubator-hudi/pull/1634#issuecomment-629572248 @bhasudha : This is another important config change for 0.5.3. I am marking the PR as WIP till I get the tests to succeed. After that, I will make the PR active. -

[jira] [Updated] (HUDI-846) Turn on incremental cleaning bu default in 0.6.0

2020-05-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-846: Labels: pull-request-available (was: ) > Turn on incremental cleaning bu default in 0.6.0 >

[GitHub] [incubator-hudi] bvaradar opened a new pull request #1634: [WIP] [HUDI-846][HUDI-848] Enable Incremental cleaning and embedded timeline-server by default

2020-05-15 Thread GitBox
bvaradar opened a new pull request #1634: URL: https://github.com/apache/incubator-hudi/pull/1634 This is to enable timeline-server and incremental cleaning by default This is an automated message from the Apache Git Service.

[jira] [Updated] (HUDI-858) Allow multiple operations to be executed within a single commit

2020-05-15 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-858: Labels: pull-request-available (was: ) > Allow multiple operations to be executed within a single co

[GitHub] [incubator-hudi] bvaradar commented on pull request #1633: [HUDI-858] Allow multiple operations to be executed within a single commit

2020-05-15 Thread GitBox
bvaradar commented on pull request #1633: URL: https://github.com/apache/incubator-hudi/pull/1633#issuecomment-629571160 @bhasudha : FYI: This is needed for 0.5.3 (cc @vinothchandar ) This is an automated message from the Apa

[GitHub] [incubator-hudi] bvaradar opened a new pull request #1633: [HUDI-858] Allow multiple operations to be executed within a single commit

2020-05-15 Thread GitBox
bvaradar opened a new pull request #1633: URL: https://github.com/apache/incubator-hudi/pull/1633 There are users who had been directly using RDD APIs and have relied on a behavior in 0.4.x to allow multiple write operations (upsert/bulk-insert/...) to be executed within a single commit.

[jira] [Updated] (HUDI-902) Avoid exception for getting SchemaProvider when no new input data

2020-05-15 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-902: Status: In Progress (was: Open) > Avoid exception for getting SchemaProvider when no new input data > --

[jira] [Created] (HUDI-902) Avoid exception for getting SchemaProvider when no new input data

2020-05-15 Thread Raymond Xu (Jira)
Raymond Xu created HUDI-902: --- Summary: Avoid exception for getting SchemaProvider when no new input data Key: HUDI-902 URL: https://issues.apache.org/jira/browse/HUDI-902 Project: Apache Hudi (incubating)

[jira] [Updated] (HUDI-902) Avoid exception for getting SchemaProvider when no new input data

2020-05-15 Thread Raymond Xu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raymond Xu updated HUDI-902: Status: Open (was: New) > Avoid exception for getting SchemaProvider when no new input data > --

[GitHub] [incubator-hudi] umehrot2 commented on pull request #1514: [HUDI-774] Addressing incorrect Spark to Avro schema generation

2020-05-15 Thread GitBox
umehrot2 commented on pull request #1514: URL: https://github.com/apache/incubator-hudi/pull/1514#issuecomment-629568357 @afilipchik Seems like the spark-avro schema converter itself generates an incorrect schema when we want to have **default value** as **null**. Is that the main concern address

[jira] [Commented] (HUDI-112) Supporting a Collapse type of operation

2020-05-15 Thread liwei (Jira)
[ https://issues.apache.org/jira/browse/HUDI-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108792#comment-17108792 ] liwei commented on HUDI-112: Hi, Nishith Agarwal [~nishith29] we  also meet this issue, in RFC

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1402: [HUDI-407] Adding Simple Index

2020-05-15 Thread GitBox
codecov-io edited a comment on pull request #1402: URL: https://github.com/apache/incubator-hudi/pull/1402#issuecomment-619680608 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1402?src=pr&el=h1) Report > Merging [#1402](https://codecov.io/gh/apache/incubator-hudi/pull/14

[GitHub] [incubator-hudi] nsivabalan commented on pull request #1402: [HUDI-407] Adding Simple Index

2020-05-15 Thread GitBox
nsivabalan commented on pull request #1402: URL: https://github.com/apache/incubator-hudi/pull/1402#issuecomment-629558212 Squashed all commits to one @vinothchandar This is an automated message from the Apache Git Service.

[GitHub] [incubator-hudi] nsivabalan commented on pull request #1514: [HUDI-774] Addressing incorrect Spark to Avro schema generation

2020-05-15 Thread GitBox
nsivabalan commented on pull request #1514: URL: https://github.com/apache/incubator-hudi/pull/1514#issuecomment-629554509 @bvaradar @vinothchandar : adding null and default logic looks good to me. Do you folks suggest to create a new Schema altogether to have a neat solution or do it in p

[GitHub] [incubator-hudi] leesf commented on pull request #1622: [HUDI-888] fix NullPointerException

2020-05-15 Thread GitBox
leesf commented on pull request #1622: URL: https://github.com/apache/incubator-hudi/pull/1622#issuecomment-629554166 > cc @hddong in case this is a sign of some hardcoded ports etc. I did look into the CLI test code and did not find any hardcoded ports, and restarted the Travis three tim

[GitHub] [incubator-hudi] bvaradar commented on pull request #1524: [HUDI-801] Adding a way to post process schema after it is fetched

2020-05-15 Thread GitBox
bvaradar commented on pull request #1524: URL: https://github.com/apache/incubator-hudi/pull/1524#issuecomment-629547495 @afilipchik : Please take a look. This is an automated message from the Apache Git Service. To respond t

[GitHub] [incubator-hudi] afilipchik commented on a change in pull request #1566: [HUDI-603]: DeltaStreamer can now fetch schema before every run in continuous mode

2020-05-15 Thread GitBox
afilipchik commented on a change in pull request #1566: URL: https://github.com/apache/incubator-hudi/pull/1566#discussion_r426074416 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/SchemaSet.java ## @@ -0,0 +1,44 @@ +/* + * Licensed to the Apache S

[GitHub] [incubator-hudi] vingov commented on pull request #1632: [HUDI-783] Added python3 to the spark_base docker image to support pyspark

2020-05-15 Thread GitBox
vingov commented on pull request #1632: URL: https://github.com/apache/incubator-hudi/pull/1632#issuecomment-629521740 @bhasudha - As we discussed, I've followed the steps to test the docker images using local registry, check out the detailed testing report [here](https://gist.github.com/v

[GitHub] [incubator-hudi] vingov opened a new pull request #1632: Added python3 to the spark_base docker image to support pyspark

2020-05-15 Thread GitBox
vingov opened a new pull request #1632: URL: https://github.com/apache/incubator-hudi/pull/1632 ## *Tips* - *Thank you very much for contributing to Apache Hudi.* - *Please review https://hudi.apache.org/contributing.html before opening a pull request.* ## What is the purpose o

[jira] [Commented] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-15 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108663#comment-17108663 ] Bhavani Sudha commented on HUDI-110: [~garyli1019] all yours. Re-assigned it to you. >

[jira] [Assigned] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-15 Thread Bhavani Sudha (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bhavani Sudha reassigned HUDI-110: -- Assignee: Yanjia Gary Li (was: Bhavani Sudha Saktheeswaran) > Better defaults for Partition ext

[GitHub] [incubator-hudi] bvaradar commented on pull request #1514: [HUDI-774] Addressing incorrect Spark to Avro schema generation

2020-05-15 Thread GitBox
bvaradar commented on pull request #1514: URL: https://github.com/apache/incubator-hudi/pull/1514#issuecomment-629480819 Also pinging @umehrot2 to get your help in reviewing this as you are familiar with this part. This is a

[GitHub] [incubator-hudi] afeldman1 commented on issue #933: Support for multiple level partitioning in Hudi

2020-05-15 Thread GitBox
afeldman1 commented on issue #933: URL: https://github.com/apache/incubator-hudi/issues/933#issuecomment-629478392 Similarly on whether we should add for the hive configuration, val HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY = "hoodie.datasource.hive_sync.partition_extractor_class"

[GitHub] [incubator-hudi] xushiyan commented on pull request #1584: fix schema provider issue

2020-05-15 Thread GitBox
xushiyan commented on pull request #1584: URL: https://github.com/apache/incubator-hudi/pull/1584#issuecomment-629472943 Ok @bvaradar thanks for checking. I shall be able to do it in the late afternoon. This is an automated

[GitHub] [incubator-hudi] bvaradar commented on pull request #1584: fix schema provider issue

2020-05-15 Thread GitBox
bvaradar commented on pull request #1584: URL: https://github.com/apache/incubator-hudi/pull/1584#issuecomment-629467290 @xushiyan : The idea and code changes look good to me. Can you add a Jira ticket and a unit test to cover this change? It would be great if you could get this to

[GitHub] [incubator-hudi] afeldman1 commented on issue #933: Support for multiple level partitioning in Hudi

2020-05-15 Thread GitBox
afeldman1 commented on issue #933: URL: https://github.com/apache/incubator-hudi/issues/933#issuecomment-629455257 Thank you! That works. Should this be added to the org.apache.hudi DataSourceOptions.scala? Right now it has: /** * Key generator class, that implements will e

[GitHub] [incubator-hudi] bvaradar commented on pull request #1524: [HUDI-801] Adding a way to post process schema after it is fetched

2020-05-15 Thread GitBox
bvaradar commented on pull request #1524: URL: https://github.com/apache/incubator-hudi/pull/1524#issuecomment-629454559 Rebased to get the correct view of the diff This is an automated message from the Apache Git Service. To

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1566: [HUDI-603]: DeltaStreamer can now fetch schema before every run in continuous mode

2020-05-15 Thread GitBox
codecov-io edited a comment on pull request #1566: URL: https://github.com/apache/incubator-hudi/pull/1566#issuecomment-619623233 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1566?src=pr&el=h1) Report > Merging [#1566](https://codecov.io/gh/apache/incubator-hudi/pull/15

[GitHub] [incubator-hudi] bvaradar merged pull request #1518: [HUDI-723] Register avro schema if infered from SQL transformation

2020-05-15 Thread GitBox
bvaradar merged pull request #1518: URL: https://github.com/apache/incubator-hudi/pull/1518 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

[GitHub] [incubator-hudi] bvaradar commented on pull request #1518: [HUDI-723] Register avro schema if infered from SQL transformation

2020-05-15 Thread GitBox
bvaradar commented on pull request #1518: URL: https://github.com/apache/incubator-hudi/pull/1518#issuecomment-629445679 Going ahead and merging this change. This is an automated message from the Apache Git Service. To respo

[incubator-hudi] branch master updated: [HUDI-723] Register avro schema if infered from SQL transformation (#1518)

2020-05-15 Thread vbalaji
This is an automated email from the ASF dual-hosted git repository. vbalaji pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git The following commit(s) were added to refs/heads/master by this push: new 25e0b75 [HUDI-723] Register avro schem

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1518: [HUDI-723] Register avro schema if infered from SQL transformation

2020-05-15 Thread GitBox
bvaradar commented on a change in pull request #1518: URL: https://github.com/apache/incubator-hudi/pull/1518#discussion_r426010971 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java ## @@ -296,10 +296,21 @@ private void refreshTi

[GitHub] [incubator-hudi] bvaradar commented on pull request #1566: [HUDI-603]: DeltaStreamer can now fetch schema before every run in continuous mode

2020-05-15 Thread GitBox
bvaradar commented on pull request #1566: URL: https://github.com/apache/incubator-hudi/pull/1566#issuecomment-629443388 @pratyakshsharma : Just rebased and did some cleanup. This is an automated message from the Apache Git S

[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1566: [HUDI-603]: DeltaStreamer can now fetch schema before every run in continuous mode

2020-05-15 Thread GitBox
pratyakshsharma commented on pull request #1566: URL: https://github.com/apache/incubator-hudi/pull/1566#issuecomment-629430390 > @pratyakshsharma : I updated this PR to address comments in the interest of reducing the review cycle time. I went through the changes. Looks good. I gues

[GitHub] [incubator-hudi] pratyakshsharma commented on pull request #1538: [HUDI-803]: added more test cases in TestHoodieAvroUtils.class

2020-05-15 Thread GitBox
pratyakshsharma commented on pull request #1538: URL: https://github.com/apache/incubator-hudi/pull/1538#issuecomment-629419001 @vinothchandar We can close this now :) This is an automated message from the Apache Git Service

[GitHub] [incubator-hudi] vinothchandar commented on issue #933: Support for multiple level partitioning in Hudi

2020-05-15 Thread GitBox
vinothchandar commented on issue #933: URL: https://github.com/apache/incubator-hudi/issues/933#issuecomment-629410586 @afeldman1 We changed package names in 0.5.2. Can you please try `org.apache.hudi.keygen.ComplexKeyGenerator`
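
For readers following issue #933, here is a minimal sketch of how the relocated key generator class might be wired into a set of write options for multi-column partitioning. Only the class name `org.apache.hudi.keygen.ComplexKeyGenerator` comes from the thread; the three `hoodie.datasource.write.*` option keys are assumed from the Hudi DataSource options and should be checked against `DataSourceOptions.scala` for the release in use. The class `HudiMultiPartitionOptions`, the method `buildOptions`, and the column names `uuid`, `region`, and `dt` are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: assembles write options for a table that is
// partitioned on more than one column.
public class HudiMultiPartitionOptions {

    // Class name as given in the thread (package moved in 0.5.2).
    static final String KEYGEN_CLASS = "org.apache.hudi.keygen.ComplexKeyGenerator";

    // Both arguments are comma-separated column lists.
    // The option keys below are assumptions; verify them for your Hudi version.
    static Map<String, String> buildOptions(String recordKeyFields, String partitionPathFields) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("hoodie.datasource.write.recordkey.field", recordKeyFields);
        opts.put("hoodie.datasource.write.partitionpath.field", partitionPathFields);
        opts.put("hoodie.datasource.write.keygenerator.class", KEYGEN_CLASS);
        return opts;
    }

    public static void main(String[] args) {
        // Example column names are made up for this sketch.
        buildOptions("uuid", "region,dt")
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The resulting map could then be handed to a Spark writer through its `options(...)` call; that step is left out so the sketch stays free of Spark and Hudi dependencies.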

[GitHub] [incubator-hudi] afeldman1 commented on issue #933: Support for multiple level partitioning in Hudi

2020-05-15 Thread GitBox
afeldman1 commented on issue #933: URL: https://github.com/apache/incubator-hudi/issues/933#issuecomment-629409990 It looks like org.apache.hudi.ComplexKeyGenerator no longer exists. How can multiple columns be used as the partition columns now? ---

[jira] [Commented] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-15 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108528#comment-17108528 ] Yanjia Gary Li commented on HUDI-110: - Hi [~bhasudha] , I can pick up this ticket if no

[jira] [Resolved] (HUDI-528) Incremental Pull fails when latest commit is empty

2020-05-15 Thread Yanjia Gary Li (Jira)
[ https://issues.apache.org/jira/browse/HUDI-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanjia Gary Li resolved HUDI-528. - Resolution: Fixed > Incremental Pull fails when latest commit is empty > --

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1596: [HUDI-863] get decimal properties from derived spark DataType

2020-05-15 Thread GitBox
vinothchandar commented on pull request #1596: URL: https://github.com/apache/incubator-hudi/pull/1596#issuecomment-629286244 @bhasudha this is a good 0.5.3 candidate.. if we are crunched on time, I can also write the test and push/merge tonight :) ---

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1622: [HUDI-888] fix NullPointerException

2020-05-15 Thread GitBox
vinothchandar commented on pull request #1622: URL: https://github.com/apache/incubator-hudi/pull/1622#issuecomment-629284848 cc @hddong in case this is a sign of some hardcoded ports etc. This is an automated message from the

[GitHub] [incubator-hudi] yanghua commented on pull request #1558: [HUDI-796]: added deduping logic for upserts case

2020-05-15 Thread GitBox
yanghua commented on pull request #1558: URL: https://github.com/apache/incubator-hudi/pull/1558#issuecomment-629263967 @pratyakshsharma still conflicting files This is an automated message from the Apache Git Service. To res

[GitHub] [incubator-hudi] EdwinGuo closed issue #1630: [SUPPORT] Latest commit does not have any schema in commit metadata

2020-05-15 Thread GitBox
EdwinGuo closed issue #1630: URL: https://github.com/apache/incubator-hudi/issues/1630 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [incubator-hudi] EdwinGuo commented on issue #1630: [SUPPORT] Latest commit does not have any schema in commit metadata

2020-05-15 Thread GitBox
EdwinGuo commented on issue #1630: URL: https://github.com/apache/incubator-hudi/issues/1630#issuecomment-629221132 Ok, thanks. This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [incubator-hudi] wangxianghu commented on a change in pull request #1409: [HUDI-714]Add javadoc and comments to hudi write method link

2020-05-15 Thread GitBox
wangxianghu commented on a change in pull request #1409: URL: https://github.com/apache/incubator-hudi/pull/1409#discussion_r425775385 ## File path: hudi-spark/src/main/java/org/apache/hudi/DataSourceUtils.java ## @@ -241,6 +241,13 @@ public static HoodieRecord createHoodieRec

[jira] [Updated] (HUDI-901) Bug Bash 0.6.0 Tracking Ticket

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan updated HUDI-901: - Fix Version/s: 0.6.0 > Bug Bash 0.6.0 Tracking Ticket > -- > >

[jira] [Assigned] (HUDI-901) Bug Bash 0.6.0 Tracking Ticket

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-901: Assignee: sivabalan narayanan > Bug Bash 0.6.0 Tracking Ticket > ---

[jira] [Created] (HUDI-901) Bug Bash 0.6.0 Tracking Ticket

2020-05-15 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-901: Summary: Bug Bash 0.6.0 Tracking Ticket Key: HUDI-901 URL: https://issues.apache.org/jira/browse/HUDI-901 Project: Apache Hudi (incubating) Issue Typ

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1518: [HUDI-723] Register avro schema if infered from SQL transformation

2020-05-15 Thread GitBox
codecov-io edited a comment on pull request #1518: URL: https://github.com/apache/incubator-hudi/pull/1518#issuecomment-629195380 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1518?src=pr&el=h1) Report > Merging [#1518](https://codecov.io/gh/apache/incubator-hudi/pull/15

[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1409: [HUDI-714]Add javadoc and comments to hudi write method link

2020-05-15 Thread GitBox
codecov-io edited a comment on pull request #1409: URL: https://github.com/apache/incubator-hudi/pull/1409#issuecomment-599323873 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1409?src=pr&el=h1) Report > Merging [#1409](https://codecov.io/gh/apache/incubator-hudi/pull/14

[GitHub] [incubator-hudi] codecov-io commented on pull request #1518: [HUDI-723] Register avro schema if infered from SQL transformation

2020-05-15 Thread GitBox
codecov-io commented on pull request #1518: URL: https://github.com/apache/incubator-hudi/pull/1518#issuecomment-629195380 # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1518?src=pr&el=h1) Report > Merging [#1518](https://codecov.io/gh/apache/incubator-hudi/pull/1518?src=

[jira] [Assigned] (HUDI-863) nested structs containing decimal types lead to null pointer exception

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-863: Assignee: Roland Johann > nested structs containing decimal types lead to null point

[jira] [Assigned] (HUDI-722) IndexOutOfBoundsException in MessageColumnIORecordConsumer.addBinary when writing parquet

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-722: Assignee: sivabalan narayanan (was: lamber-ken) > IndexOutOfBoundsException in Mess

[jira] [Assigned] (HUDI-767) Support transformation when export to Hudi

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-767: Assignee: Raymond Xu > Support transformation when export to Hudi >

[jira] [Commented] (HUDI-859) Improve documentation around key generators

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108208#comment-17108208 ] sivabalan narayanan commented on HUDI-859: -- [~hongdongdong]: discuss with [~Pratya

[jira] [Assigned] (HUDI-859) Improve documentation around key generators

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-859: Assignee: hong dongdong (was: Pratyaksh Sharma) > Improve documentation around key

[jira] [Assigned] (HUDI-13) Clarify whether the hoodie-hadoop-mr jars need to be rolled out across Hive cluster #553

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-13: --- Assignee: sivabalan narayanan > Clarify whether the hoodie-hadoop-mr jars need to be ro

[jira] [Assigned] (HUDI-4) Support for writing to EMRFS

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-4?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-4: -- Assignee: vinoyang > Support for writing to EMRFS > > >

[jira] [Assigned] (HUDI-13) Clarify whether the hoodie-hadoop-mr jars need to be rolled out across Hive cluster #553

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-13: --- Assignee: vinoyang (was: sivabalan narayanan) > Clarify whether the hoodie-hadoop-mr j

[jira] [Assigned] (HUDI-395) hudi does not support scheme s3n when wrtiing to S3

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-395: Assignee: sivabalan narayanan (was: Raymond Xu) > hudi does not support scheme s3n

[jira] [Assigned] (HUDI-303) Avro schema case sensitivity testing

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-303: Assignee: Udit Mehrotra > Avro schema case sensitivity testing > ---

[jira] [Assigned] (HUDI-307) Dataframe written with Date,Timestamp, Decimal is read with same types

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-307: Assignee: Udit Mehrotra > Dataframe written with Date,Timestamp, Decimal is read wit

[jira] [Assigned] (HUDI-110) Better defaults for Partition extractor for Spark DataSOurce and DeltaStreamer

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-110: Assignee: Bhavani Sudha Saktheeswaran > Better defaults for Partition extractor for

[jira] [Commented] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108204#comment-17108204 ] sivabalan narayanan commented on HUDI-494: -- Assigning the ticket to lamber ken. Bu

[jira] [Assigned] (HUDI-473) IllegalArgumentException in QuickstartUtils

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-473: Assignee: Bhavani Sudha Saktheeswaran > IllegalArgumentException in QuickstartUtils

[jira] [Assigned] (HUDI-494) [DEBUGGING] Huge amount of tasks when writing files into HDFS

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-494: Assignee: lamber-ken (was: Yanjia Gary Li) > [DEBUGGING] Huge amount of tasks when

[jira] [Assigned] (HUDI-723) SqlTransformer's schema sometimes is not registered.

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-723: Assignee: hong dongdong > SqlTransformer's schema sometimes is not registered. > --

[jira] [Assigned] (HUDI-867) Graphite metrics are throwing IllegalArgumentException on continuous mode

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-867: Assignee: Raymond Xu > Graphite metrics are throwing IllegalArgumentException on con

[jira] [Assigned] (HUDI-395) hudi does not support scheme s3n when wrtiing to S3

2020-05-15 Thread sivabalan narayanan (Jira)
[ https://issues.apache.org/jira/browse/HUDI-395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sivabalan narayanan reassigned HUDI-395: Assignee: Raymond Xu (was: leesf) > hudi does not support scheme s3n when wrtiing t

[GitHub] [incubator-hudi] umehrot2 commented on pull request #1596: [HUDI-863] get decimal properties from derived spark DataType

2020-05-15 Thread GitBox
umehrot2 commented on pull request #1596: URL: https://github.com/apache/incubator-hudi/pull/1596#issuecomment-629173116 @rolandjohann The fix makes sense to me. Let's add a test for decimal type handling, and make it nested within another type as well. Recently we have tried to exhaustivel

[GitHub] [incubator-hudi] rolandjohann edited a comment on pull request #1596: [HUDI-863] get decimal properties from derived spark DataType

2020-05-15 Thread GitBox
rolandjohann edited a comment on pull request #1596: URL: https://github.com/apache/incubator-hudi/pull/1596#issuecomment-629170944 @vinothchandar @umehrot2 is right: only when the field is not at level. This happened because the avro schema has been passed to each recursion of the metho

[GitHub] [incubator-hudi] rolandjohann commented on pull request #1596: [HUDI-863] get decimal properties from derived spark DataType

2020-05-15 Thread GitBox
rolandjohann commented on pull request #1596: URL: https://github.com/apache/incubator-hudi/pull/1596#issuecomment-629170944 @vinothchandar @umehrot2 is right: only when the field is not at level. This happened because the avro schema has been passed to each recursion of the method, but

[GitHub] [incubator-hudi] umehrot2 commented on pull request #1596: [HUDI-863] get decimal properties from derived spark DataType

2020-05-15 Thread GitBox
umehrot2 commented on pull request #1596: URL: https://github.com/apache/incubator-hudi/pull/1596#issuecomment-629169708 > LGTM overall.. If you can throw in a test, like you mentioned, that'd be great. > > Also trying to understand the scope of the issue.. without this, does every

[GitHub] [incubator-hudi] bvaradar commented on pull request #1566: [HUDI-603]: DeltaStreamer can now fetch schema before every run in continuous mode

2020-05-15 Thread GitBox
bvaradar commented on pull request #1566: URL: https://github.com/apache/incubator-hudi/pull/1566#issuecomment-629167984 @pratyakshsharma : I updated this PR to address comments in the interest of reducing the review cycle time.

[GitHub] [incubator-hudi] rolandjohann commented on pull request #1622: [HUDI-888] fix NullPointerException

2020-05-15 Thread GitBox
rolandjohann commented on pull request #1622: URL: https://github.com/apache/incubator-hudi/pull/1622#issuecomment-629165710 Is it possible that this is a test infrastructure related issue? ``` [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 7.416 s <<< FAILURE

[GitHub] [incubator-hudi] yanghua commented on pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-05-15 Thread GitBox
yanghua commented on pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#issuecomment-629096996 @n3nash conflicting files.

[GitHub] [incubator-hudi] rolandjohann commented on issue #1625: [SUPPORT] MOR upsert table grows in size when ingesting same records

2020-05-15 Thread GitBox
rolandjohann commented on issue #1625: URL: https://github.com/apache/incubator-hudi/issues/1625#issuecomment-629095378 @bvaradar After 15 runs the filesystem looks like this: ```bash $ tree -a /tmp/visitors_hudi_mor/

[GitHub] [incubator-hudi] bvaradar commented on a change in pull request #1518: [HUDI-723] Register avro schema if infered from SQL transformation

2020-05-15 Thread GitBox
bvaradar commented on a change in pull request #1518: URL: https://github.com/apache/incubator-hudi/pull/1518#discussion_r425626686 ## File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java ## @@ -460,8 +471,17 @@ private void syncHive()

[GitHub] [incubator-hudi] lamber-ken commented on pull request #1151: [HUDI-476] Add hudi-examples module

2020-05-15 Thread GitBox
lamber-ken commented on pull request #1151: URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-629076102 > > Can you confirm if you have run these examples locally once and verified the instructions work? > > @vinothchandar , I ran these examples locally and ensured

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-05-15 Thread GitBox
n3nash commented on a change in pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#discussion_r425613654 ## File path: hudi-test-suite/src/main/java/org/apache/hudi/testsuite/DeltaWriterFactory.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache So

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-05-15 Thread GitBox
n3nash commented on a change in pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#discussion_r425613491 ## File path: hudi-test-suite/src/main/java/org/apache/hudi/testsuite/dag/nodes/InsertNode.java ## @@ -0,0 +1,66 @@ +/* + * Licensed to the Apache

[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1100: [HUDI-289] Implement a test suite to support long running test for Hudi writing and querying end-end

2020-05-15 Thread GitBox
n3nash commented on a change in pull request #1100: URL: https://github.com/apache/incubator-hudi/pull/1100#discussion_r425613413 ## File path: hudi-test-suite/src/main/java/org/apache/hudi/testsuite/converter/UpdateConverter.java ## @@ -0,0 +1,56 @@ +/* + * Licensed to the Ap
