Hi Team,
I am an architect at a reputable product-based IT firm.
I am evaluating Hudi for building a refreshable data lake.
I am currently running the setup on my local machine and using the Spark
datasource to write to and read from the Hudi temp table.
I have evaluated the CoW and MoR write mechanisms, but when I try to read
the Hudi table using the read-optimized query type, I get the exception below:
Exception in thread "main" org.apache.hudi.exception.HoodieException: Invalid query type :read_optimized
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:81)
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:46)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:332)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:242)
Below is how I am trying to read from the Hudi location:
import org.apache.hudi.DataSourceReadOptions

spark.read
  .format("hudi")
  .option(DataSourceReadOptions.QUERY_TYPE_OPT_KEY,
    DataSourceReadOptions.QUERY_TYPE_READ_OPTIMIZED_OPT_VAL)
  .load(s"$basePath/$tableName")
  .show(50, false)
Could you please advise whether I am doing anything wrong?
These are the versions I am currently using:
Apache Hudi 0.7.0
Spark 2.4.7
Scala 2.12
Kindly let me know if any more details are required.
Regards,
Ambarish Giri
9951742695