[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-10-04 Thread Prashant Wason (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Wason updated HUDI-5609:
-
Fix Version/s: 0.14.1
   (was: 0.14.0)

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: spark-sql
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.14.1
>
>
> Customer: I’ve tried this with 0.12.2 and still receive the same error. Does 
> the table format version also need to be updated? We’re writing with 
> Hudi 0.11.1 on EMR but reading from Databricks with Hudi 0.12.2 and Spark 
> 3.3.
>  
> What has been tried so far on 0.12.2:
> 1. Spark SQL: does not work (a different issue).
> SET hoodie.file.index.enable=false
> select count(*) from validated_sales;
> Returns a count of 0 with no errors.
> 2. PySpark with a wildcard path:
> %python
> df = spark.read.format('hudi')\
>     .load('s3:///validated_sales/*/*/*')
> df.count()
> This works with Hudi 0.12.2 on Databricks 11.3 (Spark 3.3).
> 3. PySpark without the wildcard:
> %python
> df = spark.read.format('hudi')\
>     .load('s3:///validated_sales')
> df.count()
> count = 0
> 4. PySpark without the wildcard but with the recursive option set:
> %python
> df = spark.read.format('hudi')\
>     .option("recursiveFileLookup","true")\
>     .load('s3:///validated_sales')
> df.count()
> count = 250k
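
The three PySpark attempts in the report can be run side by side to make the discrepancy easier to see. The following is only a minimal sketch, not part of the original report: the function name, the dictionary keys, and the default glob suffix are assumptions for illustration, and `spark` is assumed to be an existing SparkSession with the Hudi bundle on the classpath.

```python
def count_with_each_read_path(spark, base_path, glob_suffix="/*/*/*"):
    """Return record counts for the three DataFrame read variants in the report.

    `spark` is assumed to be an active SparkSession; `base_path` is the
    table's base path. The names used here are hypothetical.
    """
    def reader():
        # spark.read yields a fresh DataFrameReader, so options set for
        # one variant do not carry over to the next.
        return spark.read.format("hudi")

    return {
        # Variant 2 in the report: wildcard path under the base path.
        "glob_path": reader().load(base_path + glob_suffix).count(),
        # Variant 3: the bare base path.
        "base_path": reader().load(base_path).count(),
        # Variant 4: the bare base path with recursive file lookup enabled.
        "base_path_recursive": (
            reader().option("recursiveFileLookup", "true").load(base_path).count()
        ),
    }
```

In a notebook this could be called as `count_with_each_read_path(spark, base_path)` with the table's own base path. In the situation the report describes, the base-path count would be 0 while the other counts would not.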



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-05-22 Thread Yue Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yue Zhang updated HUDI-5609:

Fix Version/s: 0.14.0
   (was: 0.13.1)

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: spark-sql
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.14.0
>
>





[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-04-23 Thread sivabalan narayanan (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-5609:
--
Fix Version/s: (was: 0.12.3)

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: spark-sql
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.1
>
>





[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-03-09 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5609:
-
Fix Version/s: 0.12.3

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.1, 0.12.3
>
>





[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-03-09 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5609:
-
Component/s: spark-sql

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Improvement
>  Components: spark-sql
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.1, 0.12.3
>
>





[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-03-09 Thread Raymond Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raymond Xu updated HUDI-5609:
-
Issue Type: Bug  (was: Improvement)

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Bug
>  Components: spark-sql
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.1, 0.12.3
>
>





[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-01-24 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5609:

Description: 
Customer: I’ve tried this with 0.12.2 and still receive the same error. Does 
the table format version also need to be updated? We’re writing with Hudi 
0.11.1 on EMR but reading from Databricks with Hudi 0.12.2 and Spark 3.3.

What has been tried so far on 0.12.2:
1. Spark SQL: does not work (a different issue).
SET hoodie.file.index.enable=false
select count(*) from validated_sales;
Returns a count of 0 with no errors.
2. PySpark with a wildcard path:
%python
df = spark.read.format('hudi')\
    .load('s3:///validated_sales/*/*/*')
df.count()
This works with Hudi 0.12.2 on Databricks 11.3 (Spark 3.3).
3. PySpark without the wildcard:
%python
df = spark.read.format('hudi')\
    .load('s3:///validated_sales')
df.count()
count = 0
4. PySpark without the wildcard but with the recursive option set:
%python
df = spark.read.format('hudi')\
    .option("recursiveFileLookup","true")\
    .load('s3:///validated_sales')
df.count()
count = 250k

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.1
>
>





[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-01-24 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5609:

Priority: Blocker  (was: Major)

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Assignee: Ethan Guo
>Priority: Blocker
> Fix For: 0.13.1
>
>






[jira] [Updated] (HUDI-5609) Hudi table not queryable by SQL on Databricks Spark

2023-01-24 Thread Ethan Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/HUDI-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5609:

Fix Version/s: 0.13.1

> Hudi table not queryable by SQL on Databricks Spark
> ---
>
> Key: HUDI-5609
> URL: https://issues.apache.org/jira/browse/HUDI-5609
> Project: Apache Hudi
>  Issue Type: Improvement
>Reporter: Ethan Guo
>Priority: Major
> Fix For: 0.13.1
>
>



