Build failed in Jenkins: hudi-snapshot-deployment-0.5 #369

2020-08-13 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.58 KB...] cdi-api-1.0.jar cdi-api.license commons-cli-1.4.jar commons-cli.license commons-io-2.5.jar commons-io.license

[jira] [Updated] (HUDI-1190) Annotate all public APIs classes with stability indication

2020-08-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1190: - Labels: pull-request-available (was: ) > Annotate all public APIs classes with stability

[GitHub] [hudi] vinothchandar opened a new pull request #1965: [HUDI-1190] Introduce @PublicAPIClass and @PublicAPIMethod annotations to mark public APIs

2020-08-13 Thread GitBox
vinothchandar opened a new pull request #1965: URL: https://github.com/apache/hudi/pull/1965 - Maturity levels one of : evolving, stable, deprecated - Took a pass and marked out most of the existing public API ## *Tips* - *Thank you very much for contributing to Apache Hudi.*

[GitHub] [hudi] bhasudha commented on issue #1961: [SUPPORT] Jetty Not able to find method java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V on Databricks c

2020-08-13 Thread GitBox
bhasudha commented on issue #1961: URL: https://github.com/apache/hudi/issues/1961#issuecomment-673892377 > Not able to join the channel as I don't have any email id with the mentioned domain. Can you help me to get in? @nsivabalan

[GitHub] [hudi] saumyasuhagiya commented on issue #1961: [SUPPORT] Jetty Not able to find method java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V on Databr

2020-08-13 Thread GitBox
saumyasuhagiya commented on issue #1961: URL: https://github.com/apache/hudi/issues/1961#issuecomment-673890895 Not able to join the channel as I don't have any email id with the mentioned domain. Can you help me to get in? @nsivabalan

[GitHub] [hudi] shenh062326 commented on pull request #1868: [HUDI-1083] Optimization in determining insert bucket location for a given key

2020-08-13 Thread GitBox
shenh062326 commented on pull request #1868: URL: https://github.com/apache/hudi/pull/1868#issuecomment-673855323 @nsivabalan Can you take a look at this PR? This is an automated message from the Apache Git Service. To

[GitHub] [hudi] satishkotha opened a new pull request #1964: [HUDI-1191] Add incremental meta client API to query partitions changed

2020-08-13 Thread GitBox
satishkotha opened a new pull request #1964: URL: https://github.com/apache/hudi/pull/1964 ## What is the purpose of the pull request Add IncrementalMetaClient as separate class to query partitions affected in a specified time window ## Brief change log - Add

[jira] [Updated] (HUDI-1191) create incremental meta client abstraction to query modified partitions

2020-08-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1191: - Labels: pull-request-available (was: ) > create incremental meta client abstraction to query

[jira] [Assigned] (HUDI-1191) create incremental meta client abstraction to query modified partitions

2020-08-13 Thread satish (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] satish reassigned HUDI-1191: Assignee: satish > create incremental meta client abstraction to query modified partitions >

[jira] [Created] (HUDI-1191) create incremental meta client abstraction to query modified partitions

2020-08-13 Thread satish (Jira)
satish created HUDI-1191: Summary: create incremental meta client abstraction to query modified partitions Key: HUDI-1191 URL: https://issues.apache.org/jira/browse/HUDI-1191 Project: Apache Hudi

[jira] [Updated] (HUDI-1190) Annotate all public APIs classes with stability indication

2020-08-13 Thread Vinoth Chandar (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Chandar updated HUDI-1190: - Status: Open (was: New) > Annotate all public APIs classes with stability indication >

[jira] [Created] (HUDI-1190) Annotate all public APIs classes with stability indication

2020-08-13 Thread Vinoth Chandar (Jira)
Vinoth Chandar created HUDI-1190: Summary: Annotate all public APIs classes with stability indication Key: HUDI-1190 URL: https://issues.apache.org/jira/browse/HUDI-1190 Project: Apache Hudi

[jira] [Assigned] (HUDI-1184) Support updatePartitionPath for HBaseIndex

2020-08-13 Thread Udit Mehrotra (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udit Mehrotra reassigned HUDI-1184: --- Assignee: Ryan Pifer > Support updatePartitionPath for HBaseIndex >

[jira] [Updated] (HUDI-1188) MOR hbase index tables not deduplicating records

2020-08-13 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HUDI-1188: - Labels: pull-request-available (was: ) > MOR hbase index tables not deduplicating records >

[GitHub] [hudi] rmpifer opened a new pull request #1963: [HUDI-1188] Hbase index MOR tables records not being deduplicated

2020-08-13 Thread GitBox
rmpifer opened a new pull request #1963: URL: https://github.com/apache/hudi/pull/1963 ## What is the purpose of the pull request After fetching hbase index for a record, Hudi performs validation that the commit timestamp stored in hbase for that record is a `commit` on the

[jira] [Created] (HUDI-1189) Change in UserDefinedBulkInsertPartitioner

2020-08-13 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1189: - Summary: Change in UserDefinedBulkInsertPartitioner Key: HUDI-1189 URL: https://issues.apache.org/jira/browse/HUDI-1189 Project: Apache Hudi Issue

[GitHub] [hudi] nsivabalan commented on issue #1911: [SUPPORT] GLOBAL_BLOOM index errors on Upsert operation

2020-08-13 Thread GitBox
nsivabalan commented on issue #1911: URL: https://github.com/apache/hudi/issues/1911#issuecomment-673747835 Can you try w/ spark datasource, as you see in quick start utils and let us know if you could reproduce the issue.

[GitHub] [hudi] nsivabalan edited a comment on issue #1961: [SUPPORT] Jetty Not able to find method java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V on Dat

2020-08-13 Thread GitBox
nsivabalan edited a comment on issue #1961: URL: https://github.com/apache/hudi/issues/1961#issuecomment-673696662 I don't have any exp in Azure databricks. Can you post it in [hudi's slack channel](https://github.com/apache/hudi/issues/1961#issuecomment-673696662). someone with

[GitHub] [hudi] nsivabalan commented on issue #1961: [SUPPORT] Jetty Not able to find method java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V on Databricks

2020-08-13 Thread GitBox
nsivabalan commented on issue #1961: URL: https://github.com/apache/hudi/issues/1961#issuecomment-673696662 I don't have any experience with Azure Databricks. Can you post it in Hudi's Slack channel? Someone with experience might help.

[GitHub] [hudi] nsivabalan commented on issue #1962: [SUPPORT] Unable to filter hudi table in hive on partition column

2020-08-13 Thread GitBox
nsivabalan commented on issue #1962: URL: https://github.com/apache/hudi/issues/1962#issuecomment-673694878 Did you set the Hive input format? Also, can you confirm that the settings given [here](https://hudi.apache.org/docs/docker_demo.html#step-4-a-run-hive-queries) are set?

[GitHub] [hudi] brandon-stanley edited a comment on issue #1960: How do you change the 'hoodie.datasource.write.payload.class' configuration property?

2020-08-13 Thread GitBox
brandon-stanley edited a comment on issue #1960: URL: https://github.com/apache/hudi/issues/1960#issuecomment-673462785 @bhasudha Thanks for the response. Does the precombine field have to be a non-nullable field/column as well? My dataset may have duplicates but I have implemented custom

[jira] [Created] (HUDI-1188) MOR hbase index tables not deduplicating records

2020-08-13 Thread Ryan Pifer (Jira)
Ryan Pifer created HUDI-1188: Summary: MOR hbase index tables not deduplicating records Key: HUDI-1188 URL: https://issues.apache.org/jira/browse/HUDI-1188 Project: Apache Hudi Issue Type: Bug

[GitHub] [hudi] tooptoop4 commented on issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

2020-08-13 Thread GitBox
tooptoop4 commented on issue #1948: URL: https://github.com/apache/hudi/issues/1948#issuecomment-673519517 I use hoodie-conf as shown in the description, but is property file mandatory? This is an automated message from the

[GitHub] [hudi] sassai opened a new issue #1962: [SUPPORT] Unable to filter hudi table in hive on partition column

2020-08-13 Thread GitBox
sassai opened a new issue #1962: URL: https://github.com/apache/hudi/issues/1962 **Describe the problem you faced** I'm running a spark structured streaming application that reads data from kafka and saves it to a partitioned Hudi MERGE_ON_READ table. Hive sync is enabled and I'm

[GitHub] [hudi] saumyasuhagiya opened a new issue #1961: [SUPPORT] Jetty Not able to find method java.lang.NoSuchMethodError: org.eclipse.jetty.server.session.SessionHandler.setHttpOnly(Z)V on Databr

2020-08-13 Thread GitBox
saumyasuhagiya opened a new issue #1961: URL: https://github.com/apache/hudi/issues/1961 **_Tips before filing an issue_** - Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)? - Join the mailing list to engage in conversations and get

[jira] [Created] (HUDI-1187) Improvements/Follow up on Bulk Insert V2

2020-08-13 Thread sivabalan narayanan (Jira)
sivabalan narayanan created HUDI-1187: - Summary: Improvements/Follow up on Bulk Insert V2 Key: HUDI-1187 URL: https://issues.apache.org/jira/browse/HUDI-1187 Project: Apache Hudi Issue

[GitHub] [hudi] brandon-stanley commented on issue #1960: How do you change the 'hoodie.datasource.write.payload.class' configuration property?

2020-08-13 Thread GitBox
brandon-stanley commented on issue #1960: URL: https://github.com/apache/hudi/issues/1960#issuecomment-673462785 @bhasudha Thanks for the response. Does the precombine field have to be a non-nullable field/column as well? My dataset may have duplicates but I have implemented custom logic

[GitHub] [hudi] nsivabalan commented on a change in pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-13 Thread GitBox
nsivabalan commented on a change in pull request #1834: URL: https://github.com/apache/hudi/pull/1834#discussion_r469922118 ## File path: hudi-spark/src/main/java/org/apache/hudi/keygen/BuiltinKeyGenerator.java ## @@ -85,71 +84,40 @@ public final HoodieKey

[jira] [Updated] (HUDI-909) Introduce hudi-client-flink module to support flink engine

2020-08-13 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-909: - Fix Version/s: (was: 0.6.0) 0.6.1 > Introduce hudi-client-flink module to support

[jira] [Updated] (HUDI-1150) Fix unable to parse input partition field :1 exception when using TimestampBasedKeyGenerator

2020-08-13 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1150: -- Status: Open (was: New) > Fix unable to parse input partition field :1 exception when using >

[jira] [Updated] (HUDI-1150) Fix unable to parse input partition field :1 exception when using TimestampBasedKeyGenerator

2020-08-13 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1150: -- Status: In Progress (was: Open) > Fix unable to parse input partition field :1 exception when using >

[jira] [Updated] (HUDI-1089) Refactor hudi-client to support multi-engine

2020-08-13 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1089: -- Fix Version/s: (was: 0.6.0) 0.6.1 > Refactor hudi-client to support multi-engine

[jira] [Updated] (HUDI-1150) Fix unable to parse input partition field :1 exception when using TimestampBasedKeyGenerator

2020-08-13 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1150: -- Fix Version/s: (was: 0.6.0) 0.6.1 > Fix unable to parse input partition field :1

[jira] [Created] (HUDI-1186) Add description of write commit callback by kafka to document

2020-08-13 Thread wangxianghu (Jira)
wangxianghu created HUDI-1186: - Summary: Add description of write commit callback by kafka to document Key: HUDI-1186 URL: https://issues.apache.org/jira/browse/HUDI-1186 Project: Apache Hudi

[jira] [Updated] (HUDI-1122) Introduce a kafka implementation of hoodie write commit callback

2020-08-13 Thread wangxianghu (Jira)
[ https://issues.apache.org/jira/browse/HUDI-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangxianghu updated HUDI-1122: -- Fix Version/s: (was: 0.6.0) 0.6.1 > Introduce a kafka implementation of hoodie

[GitHub] [hudi] bhasudha commented on issue #1948: [SUPPORT] DMS example complains about dfs-source.properties

2020-08-13 Thread GitBox
bhasudha commented on issue #1948: URL: https://github.com/apache/hudi/issues/1948#issuecomment-673347421 You would need to set the `--props` config for DeltaStreamer with a valid property file -

[GitHub] [hudi] bhasudha commented on issue #1956: [SUPPORT] DMS for table without PK

2020-08-13 Thread GitBox
bhasudha commented on issue #1956: URL: https://github.com/apache/hudi/issues/1956#issuecomment-673341610 > so on single column table it was https://github.com/apache/hudi/blob/release-0.5.3/hudi-spark/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java#L58 > > can I use

[GitHub] [hudi] bhasudha edited a comment on issue #1960: How do you change the 'hoodie.datasource.write.payload.class' configuration property?

2020-08-13 Thread GitBox
bhasudha edited a comment on issue #1960: URL: https://github.com/apache/hudi/issues/1960#issuecomment-673336838 @brandon-stanley the `hoodie.datasource.write.precombine.field` is a mandatory field. If not specified a default field name `ts` is assumed. Since your table does not have

[GitHub] [hudi] bhasudha commented on issue #1960: How do you change the 'hoodie.datasource.write.payload.class' configuration property?

2020-08-13 Thread GitBox
bhasudha commented on issue #1960: URL: https://github.com/apache/hudi/issues/1960#issuecomment-673336838 @brandon-stanley the `hoodie.datasource.write.precombine.field` is a mandatory field. If not specified a default field name `ts` is assumed. Since your table does not have this field
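
The role of the precombine field mentioned in this comment can be illustrated with a minimal sketch. It is plain Python, not Hudi code: the helper name `precombine`, the record layout, and the key field `id` are hypothetical, and the sketch assumes that among records sharing a key, the one with the largest precombine value is kept. Only the default field name `ts` comes from the comment itself.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def precombine(records: Iterable[Record], key_field: str = "id",
               precombine_field: str = "ts") -> List[Record]:
    """Keep one record per key: the one with the largest precombine value.

    Hypothetical helper for illustration only; not part of Hudi.
    """
    latest: Dict[Any, Record] = {}
    for record in records:
        if precombine_field not in record:
            # The comment calls the field mandatory, so a missing value is an error here.
            raise KeyError(f"record is missing precombine field {precombine_field!r}")
        key = record[key_field]
        current = latest.get(key)
        # Strictly greater: on ties the first record seen is kept (an arbitrary choice).
        if current is None or record[precombine_field] > current[precombine_field]:
            latest[key] = record
    return list(latest.values())


batch = [
    {"id": 1, "ts": 10, "value": "old"},
    {"id": 1, "ts": 20, "value": "new"},
    {"id": 2, "ts": 5, "value": "only"},
]
deduplicated = precombine(batch)
```

In this sketch a record without the `ts` field raises an error instead of being silently dropped, which is one way to read "mandatory"; how Hudi itself reports a missing field is not shown in the snippet above.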

[hudi] branch asf-site updated: Travis CI build asf-site

2020-08-13 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new ac4e0c3 Travis CI build asf-site ac4e0c3 is

[hudi] branch asf-site updated: [HUDI-859]: Added section for key generation in writing data docs (#1816)

2020-08-13 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new 3df2674 [HUDI-859]: Added section for

[GitHub] [hudi] bhasudha merged pull request #1816: [HUDI-859]: Added section for key generation in writing data docs

2020-08-13 Thread GitBox
bhasudha merged pull request #1816: URL: https://github.com/apache/hudi/pull/1816 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] xushiyan commented on issue #1947: datadog monitor hudi

2020-08-13 Thread GitBox
xushiyan commented on issue #1947: URL: https://github.com/apache/hudi/issues/1947#issuecomment-673322896 hi i haven't tried this myself but a cursory look gives that `option("hoodie.metrics.on",true)` may be a problem as it takes a `boolean` for value. can you try

[GitHub] [hudi] vinothchandar merged pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-13 Thread GitBox
vinothchandar merged pull request #1834: URL: https://github.com/apache/hudi/pull/1834 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [hudi] vinothchandar commented on pull request #1834: [HUDI-1013] Adding Bulk Insert V2 implementation

2020-08-13 Thread GitBox
vinothchandar commented on pull request #1834: URL: https://github.com/apache/hudi/pull/1834#issuecomment-673310825 @nsivabalan this is ready. I am going ahead and merging. I also re-ran the benchmark. It seems to clock the same 30 mins against spark.write.parquet. Please

[GitHub] [hudi] saumyasuhagiya commented on issue #827: java.lang.ClassNotFoundException: com.uber.hoodie.hadoop.HoodieInputFormat

2020-08-13 Thread GitBox
saumyasuhagiya commented on issue #827: URL: https://github.com/apache/hudi/issues/827#issuecomment-673310364 Thanks @bvaradar. Currently I have resolved it by putting as external jar using --jars. I will open new issue if required

[GitHub] [hudi] bhasudha commented on a change in pull request #1816: [HUDI-859]: Added section for key generation in writing data docs

2020-08-13 Thread GitBox
bhasudha commented on a change in pull request #1816: URL: https://github.com/apache/hudi/pull/1816#discussion_r469733389 ## File path: docs/_docs/2_2_writing_data.md ## @@ -28,6 +28,58 @@ can be chosen/changed across each commit/deltacommit issued against the table. of

[hudi] branch asf-site updated: Travis CI build asf-site

2020-08-13 Thread vinoth
This is an automated email from the ASF dual-hosted git repository. vinoth pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new b3d0f15 Travis CI build asf-site b3d0f15 is

Build failed in Jenkins: hudi-snapshot-deployment-0.5 #368

2020-08-13 Thread Apache Jenkins Server
See Changes: -- [...truncated 2.59 KB...] cdi-api-1.0.jar cdi-api.license commons-cli-1.4.jar commons-cli.license commons-io-2.5.jar commons-io.license

[hudi] branch asf-site updated: [HUDI-766]: added section for HoodieMultiTableDeltaStreamer (#1822)

2020-08-13 Thread bhavanisudha
This is an automated email from the ASF dual-hosted git repository. bhavanisudha pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/hudi.git The following commit(s) were added to refs/heads/asf-site by this push: new bdad1bf [HUDI-766]: added section for

[GitHub] [hudi] bhasudha merged pull request #1822: [HUDI-766]: added section for HoodieMultiTableDeltaStreamer

2020-08-13 Thread GitBox
bhasudha merged pull request #1822: URL: https://github.com/apache/hudi/pull/1822 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [hudi] bhasudha commented on a change in pull request #1822: [HUDI-766]: added section for HoodieMultiTableDeltaStreamer

2020-08-13 Thread GitBox
bhasudha commented on a change in pull request #1822: URL: https://github.com/apache/hudi/pull/1822#discussion_r469724927 ## File path: docs/_docs/2_2_writing_data.md ## @@ -174,6 +174,42 @@ and then ingest it as follows. In some cases, you may want to migrate your existing

[GitHub] [hudi] tooptoop4 commented on issue #1956: [SUPPORT] DMS for table without PK

2020-08-13 Thread GitBox
tooptoop4 commented on issue #1956: URL: https://github.com/apache/hudi/issues/1956#issuecomment-673284248 so on single column table it was https://github.com/apache/hudi/blob/release-0.5.3/hudi-spark/src/main/java/org/apache/hudi/keygen/SimpleKeyGenerator.java#L58 can I use