Re: Dimension column of integer type - to exclude from dictionary
SORT_COLUMNS can add a numeric-type column as a dimension without dictionary encoding. The SORT_COLUMNS feature was implemented in the 12-dev branch.

Best Regards,
David QiangCai
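For illustration, a minimal sketch of such a table definition (the table and column names are hypothetical, and it assumes a CarbonSession named carbon as in the quick-start example further down in this thread):

    // Hypothetical table: the INT column order_id is listed in SORT_COLUMNS,
    // so it becomes a sort/dimension column without dictionary encoding.
    carbon.sql("""
      CREATE TABLE IF NOT EXISTS sales_by_city (
        order_id INT,
        city STRING,
        amount DOUBLE
      )
      STORED BY 'carbondata'
      TBLPROPERTIES ('SORT_COLUMNS'='order_id,city')
    """)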
Re: Problem with creating a table in Spark 2.
Hi Marek,

Can you please check the permissions of the /user/hive/warehouse/carbon directory?

Regards,
Bhavya

On Mon, Apr 3, 2017 at 6:05 PM, Marek Wiewiorka wrote:
> Hi All - I'm trying to follow an example from the quick start guide and in
> spark-shell trying to create a carbondata table in the following way:
>
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.CarbonSession._
> val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("/user/hive/warehouse/carbon", "/mnt/hadoop/data/ssd001/tmp/mwiewior")
>
> and then just copy paste from the documentation:
>
> scala> carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, city string, age Int) STORED BY 'carbondata'")
> AUDIT 03-04 13:20:25,534 - [c01][hive][Thread-1]Creating Table with Database name [default] and Table name [test_table]
> java.io.FileNotFoundException: /user/hive/warehouse/carbon/default/test_table/Metadata/schema (No such file or directory)
> (full stack trace as in the original message below)
>
> Could you please help me to troubleshoot this problem?
>
> Thanks!
> Marek
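One quick way to check the permissions of that directory from the same spark-shell session is sketched below (a minimal sketch using only the Hadoop FileSystem API that Spark already ships with; the path is the store location from the message above):

    import org.apache.hadoop.fs.{FileSystem, Path}

    val fs = FileSystem.get(sc.hadoopConfiguration)
    val storePath = new Path("/user/hive/warehouse/carbon")
    // Report whether the store directory exists and, if so, its owner and permissions.
    if (fs.exists(storePath)) {
      val status = fs.getFileStatus(storePath)
      println(s"owner=${status.getOwner}, permissions=${status.getPermission}")
    } else {
      println(s"$storePath does not exist")
    }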
Re: Problem with creating a table in Spark 2.
Hi,

Please check if the below path is correct on your machine: /user/hive/warehouse/carbon/

Regards,
Liang

2017-04-03 18:05 GMT+05:30 Marek Wiewiorka:
> Hi All - I'm trying to follow an example from the quick start guide and in
> spark-shell trying to create a carbondata table in the following way:
>
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.CarbonSession._
> val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("/user/hive/warehouse/carbon", "/mnt/hadoop/data/ssd001/tmp/mwiewior")
>
> and then just copy paste from the documentation:
>
> scala> carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, city string, age Int) STORED BY 'carbondata'")
> AUDIT 03-04 13:20:25,534 - [c01][hive][Thread-1]Creating Table with Database name [default] and Table name [test_table]
> java.io.FileNotFoundException: /user/hive/warehouse/carbon/default/test_table/Metadata/schema (No such file or directory)
> (full stack trace as in the original message below)
>
> Could you please help me to troubleshoot this problem?
>
> Thanks!
> Marek
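If the store is meant to live on HDFS, one option is to pass the store path with an explicit scheme so it cannot be resolved against the local file system by accident. A sketch only, with a placeholder namenode address:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.CarbonSession._

    // "namenode:8020" is a placeholder; substitute the actual HDFS address.
    val carbon = SparkSession.builder()
      .config(sc.getConf)
      .getOrCreateCarbonSession(
        "hdfs://namenode:8020/user/hive/warehouse/carbon",
        "/mnt/hadoop/data/ssd001/tmp/mwiewior")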
[jira] [Created] (CARBONDATA-847) Select query not working properly after alter.
SWATI RAO created CARBONDATA-847:

Summary: Select query not working properly after alter.
Key: CARBONDATA-847
URL: https://issues.apache.org/jira/browse/CARBONDATA-847
Project: CarbonData
Issue Type: Bug
Affects Versions: 1.1.0-incubating
Environment: Spark 2.1
Reporter: SWATI RAO
Attachments: 2000_UniqData.csv

Execute the following queries:

CREATE TABLE uniqdata (CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint, DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10), Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"="256 MB");

LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' INTO TABLE uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'BAD_RECORDS_ACTION'='FORCE', 'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

ALTER TABLE uniqdata RENAME TO uniqdata1;

ALTER TABLE uniqdata1 ADD COLUMNS(dict int) TBLPROPERTIES('DICTIONARY_INCLUDE'='dict', 'DEFAULT.VALUE.dict'='');

select distinct(dict) from uniqdata2;

This displays the result, but when we perform:

select * from uniqdata1;

it displays an error message:

Job aborted due to stage failure: Task 3 in stage 59.0 failed 1 times, most recent failure: Lost task 3.0 in stage 59.0 (TID 714, localhost, executor driver): java.lang.NullPointerException
Problem with creating a table in Spark 2.
Hi All - I'm trying to follow an example from the quick start guide and in spark-shell trying to create a carbondata table in the following way:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._
val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("/user/hive/warehouse/carbon", "/mnt/hadoop/data/ssd001/tmp/mwiewior")

and then just copy paste from the documentation:

scala> carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name string, city string, age Int) STORED BY 'carbondata'")
AUDIT 03-04 13:20:25,534 - [c01][hive][Thread-1]Creating Table with Database name [default] and Table name [test_table]
java.io.FileNotFoundException: /user/hive/warehouse/carbon/default/test_table/Metadata/schema (No such file or directory)
  at java.io.FileOutputStream.open0(Native Method)
  at java.io.FileOutputStream.open(FileOutputStream.java:270)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
  at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
  at org.apache.carbondata.core.datastore.impl.FileFactory.getDataOutputStream(FileFactory.java:207)
  at org.apache.carbondata.core.writer.ThriftWriter.open(ThriftWriter.java:76)
  at org.apache.spark.sql.hive.CarbonMetastore.createTableFromThrift(CarbonMetastore.scala:330)
  at org.apache.spark.sql.execution.command.CreateTable.run(carbonTableSchema.scala:162)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
  ... 50 elided

Could you please help me to troubleshoot this problem?

Thanks!
Marek
[jira] [Created] (CARBONDATA-846) Add support to revert changes to alter table commands if there is a failure while executing the changes on hive.
Kunal Kapoor created CARBONDATA-846:

Summary: Add support to revert changes to alter table commands if there is a failure while executing the changes on hive.
Key: CARBONDATA-846
URL: https://issues.apache.org/jira/browse/CARBONDATA-846
Project: CarbonData
Issue Type: Improvement
Reporter: Kunal Kapoor
Assignee: Kunal Kapoor
[jira] [Created] (CARBONDATA-845) Insert Select into same table is not working
sounak chakraborty created CARBONDATA-845:

Summary: Insert Select into same table is not working
Key: CARBONDATA-845
URL: https://issues.apache.org/jira/browse/CARBONDATA-845
Project: CarbonData
Issue Type: Bug
Affects Versions: 1.1.0-incubating
Reporter: sounak chakraborty
Priority: Minor

Insert-select from the same table is not working in Spark 2.1.

insert into table1 select * from table1

gives the error:

Error: org.apache.spark.sql.AnalysisException: Cannot insert overwrite into table that is also being read from.
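One possible workaround, sketched here with a hypothetical staging table and schema (and assuming a CarbonSession named carbon as in the quick-start example above), is to materialize the rows into an intermediate table first, so the same table is not read and written in one statement:

    // Hypothetical schema; adjust to match table1.
    carbon.sql("CREATE TABLE table1_staging (id INT, name STRING) STORED BY 'carbondata'")
    // Copy the rows out, then insert them back from the staging table.
    carbon.sql("INSERT INTO table1_staging SELECT * FROM table1")
    carbon.sql("INSERT INTO table1 SELECT * FROM table1_staging")
    carbon.sql("DROP TABLE table1_staging")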