[jira] [Created] (HUDI-6009) Let the jetty server in TimelineService create daemon threads

2023-03-30 Thread dzcxzl (Jira)
dzcxzl created HUDI-6009:


 Summary: Let the jetty server in TimelineService create daemon 
threads
 Key: HUDI-6009
 URL: https://issues.apache.org/jira/browse/HUDI-6009
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: dzcxzl
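No description was attached; the summary itself carries the rationale: non-daemon server threads keep the JVM alive after the main application finishes. A minimal JDK-only sketch of the behavior the change targets (the Jetty-specific fix would presumably go through the server thread pool's daemon flag, e.g. QueuedThreadPool.setDaemon(true) — an assumption, not stated in this email):

```java
// Demonstrates why daemon threads matter: the JVM exits once all
// non-daemon threads finish, so marking background server threads
// as daemon prevents them from blocking shutdown.
public class DaemonThreadDemo {
    public static void main(String[] args) {
        Thread worker = new Thread(() -> {
            try {
                // Stands in for a long-lived server accept/selector loop.
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
            }
        });
        worker.setDaemon(true); // without this, the JVM would linger until the sleep ends
        worker.start();
        System.out.println("worker isDaemon = " + worker.isDaemon());
        // main returns here; because the worker is a daemon, the JVM exits immediately.
    }
}
```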






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HUDI-5731) Add guava dependency to Spark and MR bundle

2023-02-08 Thread dzcxzl (Jira)
dzcxzl created HUDI-5731:


 Summary: Add guava dependency to Spark and MR bundle
 Key: HUDI-5731
 URL: https://issues.apache.org/jira/browse/HUDI-5731
 Project: Apache Hudi
  Issue Type: Bug
Affects Versions: 0.12.1
Reporter: dzcxzl


The Spark and MR bundle pom.xml files configure a guava relocation but declare 
no guava dependency, so guava-related classes fail to load at runtime.
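For context, a shade-plugin relocation only rewrites package names inside classes that are already bundled, so the artifact must also be declared as a dependency. A hypothetical pom.xml sketch (coordinates are the standard guava ones; the relocated package name is assumed by analogy with Hudi's other relocations, such as the shaded Kryo seen in its stack traces):

```xml
<!-- 1) Declare the dependency so the shade plugin has classes to bundle. -->
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>${guava.version}</version>
</dependency>

<!-- 2) The existing relocation then rewrites the bundled classes. -->
<relocation>
  <pattern>com.google.common.</pattern>
  <shadedPattern>org.apache.hudi.com.google.common.</shadedPattern>
</relocation>
```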





[jira] [Created] (HUDI-5487) Reduce duplicate Logs in ExternalSpillableMap

2022-12-28 Thread dzcxzl (Jira)
dzcxzl created HUDI-5487:


 Summary: Reduce duplicate Logs in ExternalSpillableMap
 Key: HUDI-5487
 URL: https://issues.apache.org/jira/browse/HUDI-5487
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: dzcxzl


We see hundreds of thousands of duplicated log lines in the executor log.
{code:java}
22/12/26 21:13:40,864 [Executor task launch worker for task 0.0 in stage 480.0 (TID 211376)] INFO ExternalSpillableMap: Update Estimated Payload size to => 4567
22/12/26 21:13:40,864 [Executor task launch worker for task 0.0 in stage 480.0 (TID 211376)] INFO ExternalSpillableMap: Update Estimated Payload size to => 4567
22/12/26 21:13:40,864 [Executor task launch worker for task 0.0 in stage 480.0 (TID 211376)] INFO ExternalSpillableMap: Update Estimated Payload size to => 4567
22/12/26 21:13:40,864 [Executor task launch worker for task 0.0 in stage 480.0 (TID 211376)] INFO ExternalSpillableMap: Update Estimated Payload size to => 4567
22/12/26 21:13:40,864 [Executor task launch worker for task 0.0 in stage 480.0 (TID 211376)] INFO ExternalSpillableMap: Update Estimated Payload size to => 4567
22/12/26 21:13:40,864 [Executor task launch worker for task 0.0 in stage 480.0 (TID 211376)] INFO ExternalSpillableMap: Update Estimated Payload size to => 4567
22/12/26 21:13:40,864 [Executor task launch worker for task 0.0 in stage 480.0 (TID 211376)] INFO ExternalSpillableMap: Update Estimated Payload size to => 4567
{code}
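One common way to cut this repetition (a sketch of the general technique, not necessarily the fix that was merged) is to remember the last value logged and stay quiet while the estimate is unchanged:

```java
// Sketch: suppress repeated "Update Estimated Payload size" log lines
// by logging only when the estimate actually changes.
public class PayloadSizeLogger {
    private long lastLoggedSize = -1;

    /** Returns true when a log line should be emitted. */
    public boolean shouldLog(long newEstimate) {
        if (newEstimate == lastLoggedSize) {
            return false; // same estimate as last time: stay quiet
        }
        lastLoggedSize = newEstimate;
        return true;
    }

    public static void main(String[] args) {
        PayloadSizeLogger logger = new PayloadSizeLogger();
        long[] estimates = {4567, 4567, 4567, 4567, 5000, 5000};
        int emitted = 0;
        for (long e : estimates) {
            if (logger.shouldLog(e)) {
                emitted++;
            }
        }
        // Logs once for 4567 and once for 5000 instead of six times.
        System.out.println("emitted " + emitted + " of " + estimates.length);
    }
}
```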





[jira] [Updated] (HUDI-5484) Avoid using GenericRecord in ColumnStatMetadata

2022-12-28 Thread dzcxzl (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dzcxzl updated HUDI-5484:
-
Description: 
 

 
{code:java}
org.apache.hudi.com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
Serialization trace:
reserved (org.apache.avro.Schema$Field)
fieldMap (org.apache.avro.Schema$RecordSchema)
schema (org.apache.avro.generic.GenericData$Record)
maxValue (org.apache.hudi.avro.model.HoodieMetadataColumnStats)
columnStatMetadata (org.apache.hudi.metadata.HoodieMetadataPayload)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
    at org.apache.hudi.common.model.HoodieAvroRecord.readRecordPayload(HoodieAvroRecord.java:232)
    at org.apache.hudi.common.model.HoodieAvroRecord.readRecordPayload(HoodieAvroRecord.java:45)
    at org.apache.hudi.common.model.HoodieRecord.read(HoodieRecord.java:339)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.DefaultSerializers$KryoSerializableSerializer.read(DefaultSerializers.java:520)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.DefaultSerializers$KryoSerializableSerializer.read(DefaultSerializers.java:512)
    at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
    at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:101)
    at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:75)
    at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:210)
    at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:203)
    at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:199)
    at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:68)
    at org.apache.hudi.common.util.collection.ExternalSpillableMap.get(ExternalSpillableMap.java:195)
    at org.apache.hudi.common.util.collection.ExternalSpillableMap.get(ExternalSpillableMap.java:54)
    at org.apache.hudi.io.HoodieCreateHandle.write(HoodieCreateHandle.java:188)
    at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.handleInsert(HoodieSparkCopyOnWriteTable.java:257)
    at org.apache.hudi.table.action.compact.CompactionExecutionHelper.writeFileAndGetWriteStats(CompactionExecutionHelper.java:68)
    at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:231)
    at org.apache.hudi.table.action.compact.HoodieCompactor.lambda$compact$9cd4b1be$1(HoodieCompactor.java:129)
    at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
Caused by: java.lang.UnsupportedOperationException
    at java.util.Collections$UnmodifiableCollection.add(Collections.java:1055)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
    at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
{code}
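The root cause is visible in the final frames: Kryo's CollectionSerializer calls add() on a collection that Avro's Schema exposes as unmodifiable, which is what the title proposes to avoid by not keeping a GenericRecord (and hence its Schema) inside ColumnStatMetadata. A minimal JDK-only reproduction of that failure mode:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UnmodifiableAddDemo {
    public static void main(String[] args) {
        // Avro's Schema$Field holds internal collections wrapped like this;
        // a generic deserializer that rebuilds the object via add() hits
        // UnsupportedOperationException, exactly as in the trace above.
        List<String> reserved = Collections.unmodifiableList(new ArrayList<>());
        try {
            reserved.add("doc");
            System.out.println("add succeeded (unexpected)");
        } catch (UnsupportedOperationException e) {
            System.out.println("UnsupportedOperationException, as in the trace");
        }
    }
}
```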
 

[jira] [Created] (HUDI-5484) Avoid using GenericRecord in ColumnStatMetadata

2022-12-28 Thread dzcxzl (Jira)
dzcxzl created HUDI-5484:


 Summary: Avoid using GenericRecord in ColumnStatMetadata
 Key: HUDI-5484
 URL: https://issues.apache.org/jira/browse/HUDI-5484
 Project: Apache Hudi
  Issue Type: Bug
Reporter: dzcxzl


 

 
{code:java}
org.apache.hudi.com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
Serialization trace:
reserved (org.apache.avro.Schema$Field)
fieldMap (org.apache.avro.Schema$RecordSchema)
schema (org.apache.avro.generic.GenericData$Record)
maxValue (org.apache.hudi.avro.model.HoodieMetadataColumnStats)
columnStatMetadata (org.apache.hudi.metadata.HoodieMetadataPayload)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144)
    at org.apache.hudi.common.model.HoodieAvroRecord.readRecordPayload(HoodieAvroRecord.java:232)
    at org.apache.hudi.common.model.HoodieAvroRecord.readRecordPayload(HoodieAvroRecord.java:45)
    at org.apache.hudi.common.model.HoodieRecord.read(HoodieRecord.java:339)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.DefaultSerializers$KryoSerializableSerializer.read(DefaultSerializers.java:520)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.DefaultSerializers$KryoSerializableSerializer.read(DefaultSerializers.java:512)
    at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
    at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:101)
    at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:75)
    at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:210)
    at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:203)
    at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:199)
    at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:68)
    at org.apache.hudi.common.util.collection.ExternalSpillableMap.get(ExternalSpillableMap.java:195)
    at org.apache.hudi.common.util.collection.ExternalSpillableMap.get(ExternalSpillableMap.java:54)
    at org.apache.hudi.io.HoodieCreateHandle.write(HoodieCreateHandle.java:188)
    at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.handleInsert(HoodieSparkCopyOnWriteTable.java:257)
    at org.apache.hudi.table.action.compact.CompactionExecutionHelper.writeFileAndGetWriteStats(CompactionExecutionHelper.java:68)
    at org.apache.hudi.table.action.compact.HoodieCompactor.compact(HoodieCompactor.java:231)
    at org.apache.hudi.table.action.compact.HoodieCompactor.lambda$compact$9cd4b1be$1(HoodieCompactor.java:129)
    at org.apache.spark.api.java.JavaPairRDD$.$anonfun$toScalaFunction$1(JavaPairRDD.scala:1070)
Caused by: java.lang.UnsupportedOperationException
    at java.util.Collections$UnmodifiableCollection.add(Collections.java:1055)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
    at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
    at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
{code}
 





[jira] [Created] (HUDI-5378) Remove minlog.Log

2022-12-12 Thread dzcxzl (Jira)
dzcxzl created HUDI-5378:


 Summary: Remove minlog.Log
 Key: HUDI-5378
 URL: https://issues.apache.org/jira/browse/HUDI-5378
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: dzcxzl








[jira] [Created] (HUDI-5356) Call close on SparkRDDWriteClient several places

2022-12-09 Thread dzcxzl (Jira)
dzcxzl created HUDI-5356:


 Summary: Call close on SparkRDDWriteClient several places
 Key: HUDI-5356
 URL: https://issues.apache.org/jira/browse/HUDI-5356
 Project: Apache Hudi
  Issue Type: Bug
Reporter: dzcxzl








[jira] [Created] (HUDI-5332) HiveSyncTool can avoid initializing all permanent custom functions of Hive

2022-12-05 Thread dzcxzl (Jira)
dzcxzl created HUDI-5332:


 Summary: HiveSyncTool can avoid initializing all permanent custom 
functions of Hive
 Key: HUDI-5332
 URL: https://issues.apache.org/jira/browse/HUDI-5332
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: dzcxzl


Currently, initializing the Hive client loads all permanent custom functions 
registered in the HMS, which HiveSyncTool does not need.





[jira] [Created] (HUDI-4316) Support for spillable diskmap configuration when constructing HoodieMergedLogRecordScanner

2022-06-24 Thread dzcxzl (Jira)
dzcxzl created HUDI-4316:


 Summary: Support for spillable diskmap configuration when 
constructing HoodieMergedLogRecordScanner
 Key: HUDI-4316
 URL: https://issues.apache.org/jira/browse/HUDI-4316
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: dzcxzl


Several PRs have added support for using the configuration 
hoodie.common.spillable.diskmap.type to decide which disk map to use when 
constructing the HoodieMergedLogRecordScanner, but there are still several 
places where this configuration is not honored.

 

https://issues.apache.org/jira/browse/HUDI-2044

https://issues.apache.org/jira/browse/HUDI-4151
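For context, the configuration in question is set like any other Hudi property; the sketch below selects the RocksDB-backed map (the value names BITCASK and ROCKS_DB are assumed from the DiskMapType options, not restated in this email):

```properties
# Choose the spillable map implementation used when merged log records
# no longer fit in memory. BITCASK is the default; ROCKS_DB is the
# alternative this issue wants honored everywhere the scanner is built.
hoodie.common.spillable.diskmap.type=ROCKS_DB
```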

 

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (HUDI-4309) Spark3.2 custom parser should not throw exception

2022-06-23 Thread dzcxzl (Jira)
dzcxzl created HUDI-4309:


 Summary: Spark3.2 custom parser should not throw exception
 Key: HUDI-4309
 URL: https://issues.apache.org/jira/browse/HUDI-4309
 Project: Apache Hudi
  Issue Type: Bug
Reporter: dzcxzl


In HoodieSpark3_2ExtendedSqlAstBuilder, three visitor methods reject the 
syntax as unsupported, which causes SQL to fail with an error during the 
parse stage:

 

visitInsertOverwriteDir

visitInsertOverwriteHiveDir

withRepartitionByExpression
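The usual pattern for an "extended" parser (a generic sketch, not the actual HoodieSpark3_2ExtendedSqlAstBuilder code) is to delegate syntax it does not handle to the session's fallback parser instead of throwing during the parse stage:

```java
// Sketch: an extension parser that handles only its own syntax and
// delegates everything else, so unsupported statements still parse.
public class ExtendedParser {
    interface Parser {
        Object parse(String sql);
    }

    private final Parser delegate;

    ExtendedParser(Parser delegate) {
        this.delegate = delegate;
    }

    Object parse(String sql) {
        // Handle only the extended syntax (MERGE INTO used as a stand-in).
        if (sql.trim().toUpperCase().startsWith("MERGE INTO")) {
            return "handled-by-extension";
        }
        // Do not throw "unsupported syntax" here: delegating keeps
        // statements like INSERT OVERWRITE DIRECTORY working.
        return delegate.parse(sql);
    }

    public static void main(String[] args) {
        ExtendedParser p = new ExtendedParser(sql -> "handled-by-fallback");
        System.out.println(p.parse("MERGE INTO t USING s ON t.id = s.id"));
        System.out.println(p.parse("INSERT OVERWRITE DIRECTORY '/tmp' SELECT 1"));
    }
}
```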





[jira] [Updated] (HUDI-4111) Bump ANTLR runtime version in Spark 3.x

2022-05-18 Thread dzcxzl (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dzcxzl updated HUDI-4111:
-
Summary: Bump ANTLR runtime version in Spark 3.x  (was: Bump ANTLR runtime 
version to 4.8 in Spark 3.2)

> Bump ANTLR runtime version in Spark 3.x
> ---
>
> Key: HUDI-4111
> URL: https://issues.apache.org/jira/browse/HUDI-4111
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: dzcxzl
>Priority: Trivial
>  Labels: pull-request-available
>
> Spark 3.2 uses ANTLR runtime 4.8 while Hudi uses 4.7; using the same 
> version avoids ANTLR's version-mismatch warning logs.





[jira] [Created] (HUDI-4111) Bump ANTLR runtime version to 4.8 in Spark 3.2

2022-05-17 Thread dzcxzl (Jira)
dzcxzl created HUDI-4111:


 Summary: Bump ANTLR runtime version to 4.8 in Spark 3.2
 Key: HUDI-4111
 URL: https://issues.apache.org/jira/browse/HUDI-4111
 Project: Apache Hudi
  Issue Type: Improvement
Reporter: dzcxzl


Spark 3.2 uses ANTLR runtime 4.8 while Hudi uses 4.7; using the same version 
avoids ANTLR's version-mismatch warning logs.





[jira] [Created] (HUDI-3849) AvroDeserializer supports AVRO_REBASE_MODE_IN_READ configuration

2022-04-11 Thread dzcxzl (Jira)
dzcxzl created HUDI-3849:


 Summary: AvroDeserializer supports AVRO_REBASE_MODE_IN_READ 
configuration
 Key: HUDI-3849
 URL: https://issues.apache.org/jira/browse/HUDI-3849
 Project: Apache Hudi
  Issue Type: Improvement
  Components: spark
Reporter: dzcxzl


Currently, the datetimeRebaseMode of AvroDeserializer is hardcoded to 
"EXCEPTION".
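Spark exposes this behavior as a session configuration; presumably the improvement is for AvroDeserializer to honor it rather than always behaving as if it were EXCEPTION. An illustrative setting (the key name is taken from Spark's SQL configuration for Avro reads and is an assumption, not quoted from this email):

```properties
# Rebase mode for ancient dates/timestamps when reading Avro data.
# Allowed values: EXCEPTION (fail, the hardcoded behavior described
# above), CORRECTED, LEGACY.
spark.sql.avro.datetimeRebaseModeInRead=CORRECTED
```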



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HUDI-973) RemoteHoodieTableFileSystemView supports non-partitioned table queries

2020-05-27 Thread dzcxzl (Jira)
dzcxzl created HUDI-973:
---

 Summary: RemoteHoodieTableFileSystemView supports non-partitioned 
table queries
 Key: HUDI-973
 URL: https://issues.apache.org/jira/browse/HUDI-973
 Project: Apache Hudi
  Issue Type: Bug
Reporter: dzcxzl


When hoodie.embed.timeline.server = true and the written table is a 
non-partitioned table, writes fail with an exception.

 
{code:java}
io.javalin.BadRequestResponse: Query parameter 'partition' with value '' cannot be null or empty
    at io.javalin.validation.TypedValidator.getOrThrow(Validator.kt:25)
    at org.apache.hudi.timeline.service.FileSystemViewHandler.lambda$registerDataFilesAPI$3(FileSystemViewHandler.java:172)
{code}
 

This happens because the API validates that the 'partition' query parameter 
is neither null nor empty.
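A sketch of the server-side direction (hypothetical helper, not the actual FileSystemViewHandler change): treat an absent or empty 'partition' parameter as the non-partitioned case instead of rejecting the request.

```java
// Sketch: non-partitioned tables legitimately send partition="", so map
// null/empty to the empty relative partition path instead of rejecting
// the request with a BadRequestResponse.
public class PartitionParam {
    public static String resolve(String partitionParam) {
        return (partitionParam == null) ? "" : partitionParam;
    }

    public static void main(String[] args) {
        System.out.println("[" + resolve(null) + "]");            // non-partitioned table
        System.out.println("[" + resolve("dt=2020-05-27") + "]"); // partitioned table
    }
}
```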



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HUDI-889) Writer supports useJdbc configuration when hive synchronization is enabled

2020-05-13 Thread dzcxzl (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dzcxzl updated HUDI-889:

Status: In Progress  (was: Open)

> Writer supports useJdbc configuration when hive synchronization is enabled
> --
>
> Key: HUDI-889
> URL: https://issues.apache.org/jira/browse/HUDI-889
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Writer Core
>Reporter: dzcxzl
>Priority: Trivial
>
> hudi-hive-sync supports the useJdbc = false configuration, but the writer 
> does not yet expose this configuration





[jira] [Updated] (HUDI-889) Writer supports useJdbc configuration when hive synchronization is enabled

2020-05-13 Thread dzcxzl (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dzcxzl updated HUDI-889:

Status: Closed  (was: Patch Available)

> Writer supports useJdbc configuration when hive synchronization is enabled
> --
>
> Key: HUDI-889
> URL: https://issues.apache.org/jira/browse/HUDI-889
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Writer Core
>Reporter: dzcxzl
>Priority: Trivial
>
> hudi-hive-sync supports the useJdbc = false configuration, but the writer 
> does not yet expose this configuration





[jira] [Updated] (HUDI-889) Writer supports useJdbc configuration when hive synchronization is enabled

2020-05-13 Thread dzcxzl (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dzcxzl updated HUDI-889:

Status: Patch Available  (was: In Progress)

> Writer supports useJdbc configuration when hive synchronization is enabled
> --
>
> Key: HUDI-889
> URL: https://issues.apache.org/jira/browse/HUDI-889
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Writer Core
>Reporter: dzcxzl
>Priority: Trivial
>
> hudi-hive-sync supports the useJdbc = false configuration, but the writer 
> does not yet expose this configuration





[jira] [Updated] (HUDI-889) Writer supports useJdbc configuration when hive synchronization is enabled

2020-05-13 Thread dzcxzl (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dzcxzl updated HUDI-889:

Status: Open  (was: New)

> Writer supports useJdbc configuration when hive synchronization is enabled
> --
>
> Key: HUDI-889
> URL: https://issues.apache.org/jira/browse/HUDI-889
> Project: Apache Hudi (incubating)
>  Issue Type: Improvement
>  Components: Writer Core
>Reporter: dzcxzl
>Priority: Trivial
>
> hudi-hive-sync supports the useJdbc = false configuration, but the writer 
> does not yet expose this configuration





[jira] [Created] (HUDI-889) Writer supports useJdbc configuration when hive synchronization is enabled

2020-05-12 Thread dzcxzl (Jira)
dzcxzl created HUDI-889:
---

 Summary: Writer supports useJdbc configuration when hive 
synchronization is enabled
 Key: HUDI-889
 URL: https://issues.apache.org/jira/browse/HUDI-889
 Project: Apache Hudi (incubating)
  Issue Type: Improvement
  Components: Writer Core
Reporter: dzcxzl


hudi-hive-sync supports the useJdbc = false configuration, but the writer 
does not yet expose this configuration.
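The corresponding sync option is a boolean; a hypothetical writer-side configuration sketch (the key name below matches the option as it appears in later Hudi releases and is an assumption, since this issue is precisely about exposing it):

```properties
# Sync table metadata to Hive via the metastore client rather than
# through a HiveServer2 JDBC connection.
hoodie.datasource.hive_sync.use_jdbc=false
```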


