[jira] [Resolved] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR

2021-08-31 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4273.
-
Fix Version/s: 2.3.0
 Assignee: Indhumathi Muthumurugesh
   Resolution: Fixed

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.2.0
> Environment: Release label:emr-5.24.1
> Hadoop distribution:Amazon 2.8.5
> Applications:
> Hive 2.3.4, Pig 0.17.0, Hue 4.4.0, Flink 1.8.0, Spark 2.4.2, Presto 0.219, 
> JupyterHub 0.9.6
> Jar complied with:
> apache-carbondata:2.2.0
> spark:2.4.5
> hadoop:2.8.3
>Reporter: Bigicecream
>Assignee: Indhumathi Muthumurugesh
>Priority: Critical
>  Labels: EMR, spark
> Fix For: 2.3.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
>  
> When trying to create a table like this:
> {code:sql}
> CREATE TABLE IF NOT EXISTS will_not_work(
> timestamp string,
> name string
> )
> PARTITIONED BY (dt string, hr string)
> STORED AS carbondata
> LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work
> {code}
> The folder 's3a://my-bucket/CarbonDataTests/will_not_work' is a not existing 
> folder
> I get the following error:
> {noformat}
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Partition is not supported for external table
>   at 
> org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
>   at 
> org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
>   at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
>   ... 64 elided
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4271) Support DPP for carbon filters

2021-08-31 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-4271.
--
Fix Version/s: 2.3.0
   Resolution: Fixed

> Support DPP for carbon filters
> --
>
> Key: CARBONDATA-4271
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4271
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Indhumathi
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR

2021-08-31 Thread Bigicecream (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bigicecream updated CARBONDATA-4273:

Description: 
 

When trying to create a table like this:
{code:sql}
CREATE TABLE IF NOT EXISTS will_not_work(
timestamp string,
name string
)
PARTITIONED BY (dt string, hr string)
STORED AS carbondata
LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work
{code}
The folder 's3a://my-bucket/CarbonDataTests/will_not_work' is a not existing 
folder
I get the following error:
{noformat}
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Partition is not supported for external table
  at 
org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
  at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
  at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
  at 
org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
  at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
  at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
  at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
  at 
org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
  at 
org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
  at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
  at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
  at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
  at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
  at org.apache.spark.sql.Dataset.(Dataset.scala:194)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
  ... 64 elided
{noformat}

  was:
 

When trying to create a table like this:
{code:sql}
CREATE TABLE IF NOT EXISTS will_not_work(
timestamp string,
name string
)
PARTITIONED BY (dt string, hr string)
STORED AS carbondata
LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work
{code}
I get the following error:
{noformat}
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Partition is not supported for external table
  at 
org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
  at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
  at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
  at 
org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
  at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
  at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
  at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
  at 
org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
  at 
org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
  at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
  at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
  at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
  at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
  at org.apache.spark.sql.Dataset.(Dataset.scala:194)
  at

[jira] [Commented] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR

2021-08-31 Thread Bigicecream (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407323#comment-17407323
 ] 

Bigicecream commented on CARBONDATA-4273:
-

[~indhumuthumurugesh]

The folder is not existing, when the folder is empty the error is the same.

I specify the location because I want the table files to be written to the 
location I specify.

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.2.0
> Environment: Release label:emr-5.24.1
> Hadoop distribution:Amazon 2.8.5
> Applications:
> Hive 2.3.4, Pig 0.17.0, Hue 4.4.0, Flink 1.8.0, Spark 2.4.2, Presto 0.219, 
> JupyterHub 0.9.6
> Jar complied with:
> apache-carbondata:2.2.0
> spark:2.4.5
> hadoop:2.8.3
>Reporter: Bigicecream
>Priority: Critical
>  Labels: EMR, spark
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>  
> When trying to create a table like this:
> {code:sql}
> CREATE TABLE IF NOT EXISTS will_not_work(
> timestamp string,
> name string
> )
> PARTITIONED BY (dt string, hr string)
> STORED AS carbondata
> LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work
> {code}
> I get the following error:
> {noformat}
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Partition is not supported for external table
>   at 
> org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
>   at 
> org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
>   at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
>   ... 64 elided
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR

2021-08-31 Thread Indhumathi (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407301#comment-17407301
 ] 

Indhumathi edited comment on CARBONDATA-4273 at 8/31/21, 12:47 PM:
---

[~bigicecream] what kind of data is present in location 
s3a://my-bucket/CarbonDataTests/will_not_work ?

 
is the location is empty (or) it has some partition folder which holds carbon 
data and index files ?


was (Author: indhumuthumurugesh):
[~bigicecream] what kind of data is present in location 
s3a://my-bucket/CarbonDataTests/will_not_work ?

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.2.0
> Environment: Release label:emr-5.24.1
> Hadoop distribution:Amazon 2.8.5
> Applications:
> Hive 2.3.4, Pig 0.17.0, Hue 4.4.0, Flink 1.8.0, Spark 2.4.2, Presto 0.219, 
> JupyterHub 0.9.6
> Jar complied with:
> apache-carbondata:2.2.0
> spark:2.4.5
> hadoop:2.8.3
>Reporter: Bigicecream
>Priority: Critical
>  Labels: EMR, spark
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>  
> When trying to create a table like this:
> {code:sql}
> CREATE TABLE IF NOT EXISTS will_not_work(
> timestamp string,
> name string
> )
> PARTITIONED BY (dt string, hr string)
> STORED AS carbondata
> LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work
> {code}
> I get the following error:
> {noformat}
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Partition is not supported for external table
>   at 
> org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
>   at 
> org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
>   at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
>   ... 64 elided
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Issue Comment Deleted] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR

2021-08-31 Thread Indhumathi Muthumurugesh (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi Muthumurugesh updated CARBONDATA-4273:
-
Comment: was deleted

(was: is the location is empty (or) it has some partition folder which holds 
carbon data and index files ?)

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.2.0
> Environment: Release label:emr-5.24.1
> Hadoop distribution:Amazon 2.8.5
> Applications:
> Hive 2.3.4, Pig 0.17.0, Hue 4.4.0, Flink 1.8.0, Spark 2.4.2, Presto 0.219, 
> JupyterHub 0.9.6
> Jar complied with:
> apache-carbondata:2.2.0
> spark:2.4.5
> hadoop:2.8.3
>Reporter: Bigicecream
>Priority: Critical
>  Labels: EMR, spark
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>  
> When trying to create a table like this:
> {code:sql}
> CREATE TABLE IF NOT EXISTS will_not_work(
> timestamp string,
> name string
> )
> PARTITIONED BY (dt string, hr string)
> STORED AS carbondata
> LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work
> {code}
> I get the following error:
> {noformat}
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Partition is not supported for external table
>   at 
> org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
>   at 
> org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
>   at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
>   ... 64 elided
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR

2021-08-31 Thread Indhumathi Muthumurugesh (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407306#comment-17407306
 ] 

Indhumathi Muthumurugesh commented on CARBONDATA-4273:
--

is the location is empty (or) it has some partition folder which holds carbon 
data and index files ?

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.2.0
> Environment: Release label:emr-5.24.1
> Hadoop distribution:Amazon 2.8.5
> Applications:
> Hive 2.3.4, Pig 0.17.0, Hue 4.4.0, Flink 1.8.0, Spark 2.4.2, Presto 0.219, 
> JupyterHub 0.9.6
> Jar complied with:
> apache-carbondata:2.2.0
> spark:2.4.5
> hadoop:2.8.3
>Reporter: Bigicecream
>Priority: Critical
>  Labels: EMR, spark
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>  
> When trying to create a table like this:
> {code:sql}
> CREATE TABLE IF NOT EXISTS will_not_work(
> timestamp string,
> name string
> )
> PARTITIONED BY (dt string, hr string)
> STORED AS carbondata
> LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work
> {code}
> I get the following error:
> {noformat}
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Partition is not supported for external table
>   at 
> org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
>   at 
> org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
>   at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
>   ... 64 elided
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4273) Cannot create table with partitions in Spark in EMR

2021-08-31 Thread Indhumathi (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407301#comment-17407301
 ] 

Indhumathi commented on CARBONDATA-4273:


[~bigicecream] what kind of data is present in location 
s3a://my-bucket/CarbonDataTests/will_not_work ?

> Cannot create table with partitions in Spark in EMR
> ---
>
> Key: CARBONDATA-4273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4273
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.2.0
> Environment: Release label:emr-5.24.1
> Hadoop distribution:Amazon 2.8.5
> Applications:
> Hive 2.3.4, Pig 0.17.0, Hue 4.4.0, Flink 1.8.0, Spark 2.4.2, Presto 0.219, 
> JupyterHub 0.9.6
> Jar complied with:
> apache-carbondata:2.2.0
> spark:2.4.5
> hadoop:2.8.3
>Reporter: Bigicecream
>Priority: Critical
>  Labels: EMR, spark
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
>  
> When trying to create a table like this:
> {code:sql}
> CREATE TABLE IF NOT EXISTS will_not_work(
> timestamp string,
> name string
> )
> PARTITIONED BY (dt string, hr string)
> STORED AS carbondata
> LOCATION 's3a://my-bucket/CarbonDataTests/will_not_work
> {code}
> I get the following error:
> {noformat}
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Partition is not supported for external table
>   at 
> org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:219)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235)
>   at 
> org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:394)
>   at 
> org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:118)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:134)
>   at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:137)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3364)
>   at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3363)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:194)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:643)
>   ... 64 elided
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4274) Create partition table error with spark 3.1

2021-08-31 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-4274.
--
Fix Version/s: 2.3.0
   Resolution: Fixed

>  Create partition table error with spark 3.1
> 
>
> Key: CARBONDATA-4274
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4274
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> With spark 3.1, we can create a partition table by giving partition columns 
> from schema.
> Like below example:
> {{create table partitionTable(c1 int, c2 int, v1 string, v2 string) stored as 
> carbondata partitioned by (v2,c2)}}
> When the table is created by SparkSession with CarbonExtension, catalog table 
> is created with the specified partitions.
> But in cluster/ with carbon session, when we create partition table with 
> above syntax it is creating normal table with no partitions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)