[ https://issues.apache.org/jira/browse/HUDI-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aditya Goenka updated HUDI-7211:
--------------------------------
    Description: 
[https://github.com/apache/hudi/issues/10233]

 

```
NOW=$(date '+%Y%m%dt%H%M%S')
${SPARK_HOME}/bin/spark-submit \
--jars ${path_prefix}/jars/${SPARK_V}/hudi-spark${SPARK_VERSION}-bundle_2.12-${HUDI_VERSION}.jar \
--class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
${path_prefix}/jars/${SPARK_V}/hudi-utilities-slim-bundle_2.12-${HUDI_VERSION}.jar \
--target-base-path ${path_prefix}/testcases/stocks/data/target/${NOW} \
--target-table stocks${NOW} \
--table-type COPY_ON_WRITE \
--base-file-format PARQUET \
--props ${path_prefix}/testcases/stocks/configs/hoodie.properties \
--source-class org.apache.hudi.utilities.sources.JsonDFSSource \
--schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
--hoodie-conf hoodie.deltastreamer.schemaprovider.source.schema.file=${path_prefix}/testcases/stocks/data/schema_without_ts.avsc \
--hoodie-conf hoodie.deltastreamer.schemaprovider.target.schema.file=${path_prefix}/testcases/stocks/data/schema_without_ts.avsc \
--op UPSERT \
--spark-master yarn \
--hoodie-conf hoodie.deltastreamer.source.dfs.root=${path_prefix}/testcases/stocks/data/source_without_ts \
--hoodie-conf hoodie.datasource.write.partitionpath.field=date \
--hoodie-conf hoodie.datasource.write.keygenerator.type=SIMPLE \
--hoodie-conf hoodie.datasource.write.hive_style_partitioning=false \
--hoodie-conf hoodie.metadata.enable=true
```

  was:
[https://github.com/apache/hudi/issues/10233]

 

Reproducible code - 
https://github.com/apache/hudi/issues/10233#issuecomment-1849561433


> Relax need of ordering/precombine field for tables with autogenerated record keys for DeltaStreamer
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-7211
>                 URL: https://issues.apache.org/jira/browse/HUDI-7211
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: writer-core
>            Reporter: Aditya Goenka
>            Priority: Critical
>             Fix For: 0.14.1
>
>
> [https://github.com/apache/hudi/issues/10233]
>  
> ```
> NOW=$(date '+%Y%m%dt%H%M%S')
> ${SPARK_HOME}/bin/spark-submit \
> --jars ${path_prefix}/jars/${SPARK_V}/hudi-spark${SPARK_VERSION}-bundle_2.12-${HUDI_VERSION}.jar \
> --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
> ${path_prefix}/jars/${SPARK_V}/hudi-utilities-slim-bundle_2.12-${HUDI_VERSION}.jar \
> --target-base-path ${path_prefix}/testcases/stocks/data/target/${NOW} \
> --target-table stocks${NOW} \
> --table-type COPY_ON_WRITE \
> --base-file-format PARQUET \
> --props ${path_prefix}/testcases/stocks/configs/hoodie.properties \
> --source-class org.apache.hudi.utilities.sources.JsonDFSSource \
> --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
> --hoodie-conf hoodie.deltastreamer.schemaprovider.source.schema.file=${path_prefix}/testcases/stocks/data/schema_without_ts.avsc \
> --hoodie-conf hoodie.deltastreamer.schemaprovider.target.schema.file=${path_prefix}/testcases/stocks/data/schema_without_ts.avsc \
> --op UPSERT \
> --spark-master yarn \
> --hoodie-conf hoodie.deltastreamer.source.dfs.root=${path_prefix}/testcases/stocks/data/source_without_ts \
> --hoodie-conf hoodie.datasource.write.partitionpath.field=date \
> --hoodie-conf hoodie.datasource.write.keygenerator.type=SIMPLE \
> --hoodie-conf hoodie.datasource.write.hive_style_partitioning=false \
> --hoodie-conf hoodie.metadata.enable=true
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
