[ https://issues.apache.org/jira/browse/HUDI-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aditya Goenka updated HUDI-7211:
Description:
[https://github.com/apache/hudi/issues/10233]
```
NOW=$(date '+%Y%m%dt%H%M%S')
${SPARK_HOME}/bin/spark-submit \
--jars ${path_prefix}/jars/${SPARK_V}/hudi-spark${SPARK_VERSION}-bundle_2.12-${HUDI_VERSION}.jar \
--class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
${path_prefix}/jars/${SPARK_V}/hudi-utilities-slim-bundle_2.12-${HUDI_VERSION}.jar \
--target-base-path ${path_prefix}/testcases/stocks/data/target/${NOW} \
--target-table stocks${NOW} \
--table-type COPY_ON_WRITE \
--base-file-format PARQUET \
--props ${path_prefix}/testcases/stocks/configs/hoodie.properties \
--source-class org.apache.hudi.utilities.sources.JsonDFSSource \
--schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
--hoodie-conf hoodie.deltastreamer.schemaprovider.source.schema.file=${path_prefix}/testcases/stocks/data/schema_without_ts.avsc \
--hoodie-conf hoodie.deltastreamer.schemaprovider.target.schema.file=${path_prefix}/testcases/stocks/data/schema_without_ts.avsc \
--op UPSERT \
--spark-master yarn \
--hoodie-conf hoodie.deltastreamer.source.dfs.root=${path_prefix}/testcases/stocks/data/source_without_ts \
--hoodie-conf hoodie.datasource.write.partitionpath.field=date \
--hoodie-conf hoodie.datasource.write.keygenerator.type=SIMPLE \
--hoodie-conf hoodie.datasource.write.hive_style_partitioning=false \
--hoodie-conf hoodie.metadata.enable=true
```
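The command above depends on several shell variables that the report does not define (`path_prefix`, `SPARK_V`, `SPARK_VERSION`, `HUDI_VERSION`), and it also assumes `SPARK_HOME` is already set. A minimal sketch of preparing them before running it might look like the following. Every value is a hypothetical placeholder rather than something stated in the issue, and the helper names `spark_bundle` and `utilities_bundle` are introduced only for illustration.

```shell
# All values below are hypothetical placeholders; the issue does not state them.
path_prefix="/tmp/hudi-repro"   # assumed root holding jars/ and testcases/
SPARK_V="spark3.4"              # assumed name of the per-Spark jar directory
SPARK_VERSION="3.4"             # assumed Spark line used in the bundle name
HUDI_VERSION="0.14.0"           # assumed Hudi release under test

# Same timestamp format as the command in the report (date, literal "t", time).
NOW=$(date '+%Y%m%dt%H%M%S')

# Illustrative helper variables: the two jar paths the command expands to.
spark_bundle="${path_prefix}/jars/${SPARK_V}/hudi-spark${SPARK_VERSION}-bundle_2.12-${HUDI_VERSION}.jar"
utilities_bundle="${path_prefix}/jars/${SPARK_V}/hudi-utilities-slim-bundle_2.12-${HUDI_VERSION}.jar"

# Print the resolved paths so a missing jar is easy to spot before submitting.
echo "spark bundle:     ${spark_bundle}"
echo "utilities bundle: ${utilities_bundle}"
echo "target base path: ${path_prefix}/testcases/stocks/data/target/${NOW}"
```

Because `NOW` is embedded in both the target base path and the table name, each run of the command writes to a fresh table.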
was:
[https://github.com/apache/hudi/issues/10233]
Reproducible code -
https://github.com/apache/hudi/issues/10233#issuecomment-1849561433
> Relax need of ordering/precombine field for tables with autogenerated record keys for DeltaStreamer
> ---
>
> Key: HUDI-7211
> URL: https://issues.apache.org/jira/browse/HUDI-7211
> Project: Apache Hudi
> Issue Type: Bug
> Components: writer-core
> Reporter: Aditya Goenka
> Priority: Critical
> Fix For: 0.14.1