[jira] [Created] (CARBONDATA-2222) Update the FAQ doc for some mistakes
chenerlu created CARBONDATA-2222: Summary: Update the FAQ doc for some mistakes Key: CARBONDATA-2222 URL: https://issues.apache.org/jira/browse/CARBONDATA-2222 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-1895) Fix issue of create table if not exits
chenerlu created CARBONDATA-1895: Summary: Fix issue of create table if not exits Key: CARBONDATA-1895 URL: https://issues.apache.org/jira/browse/CARBONDATA-1895 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu
[jira] [Created] (CARBONDATA-1835) Fix null exception when get table details
chenerlu created CARBONDATA-1835: Summary: Fix null exception when get table details Key: CARBONDATA-1835 URL: https://issues.apache.org/jira/browse/CARBONDATA-1835 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu
[jira] [Commented] (CARBONDATA-1778) Support clean garbage segments for all
[ https://issues.apache.org/jira/browse/CARBONDATA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258972#comment-16258972 ] chenerlu commented on CARBONDATA-1778: -- Currently Carbon only supports cleaning garbage segments for a specified table. Carbon should provide the ability to clean all garbage segments without specifying the database name and table name. > Support clean garbage segments for all > -- > > Key: CARBONDATA-1778 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1778 > Project: CarbonData > Issue Type: Improvement > Reporter: chenerlu > Assignee: chenerlu > Priority: Minor
[jira] [Created] (CARBONDATA-1778) Support clean garbage segments for all
chenerlu created CARBONDATA-1778: Summary: Support clean garbage segments for all Key: CARBONDATA-1778 URL: https://issues.apache.org/jira/browse/CARBONDATA-1778 Project: CarbonData Issue Type: Improvement Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Created] (CARBONDATA-1618) Fix issue of not supporting table comment
chenerlu created CARBONDATA-1618: Summary: Fix issue of not supporting table comment Key: CARBONDATA-1618 URL: https://issues.apache.org/jira/browse/CARBONDATA-1618 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu
[jira] [Updated] (CARBONDATA-1438) Unify the sort column and sort scope in create table command
[ https://issues.apache.org/jira/browse/CARBONDATA-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1438: - Description:

1 Requirement
Currently, users can specify sort columns in table properties when creating a table, and can also specify the sort scope in load options when loading data. To improve ease of use, it is better to specify all sort-related parameters in the create table command. Once the sort scope is specified in the create table command, it will be used during data load even if users have also specified it in load options.

2 Detailed design

2.1 Task-01
Requirement: create table supports specifying the sort scope.
Implementation: using table properties (a Map), the sort scope is stored as a key/value pair, and the existing interface is called to write this pair into the metastore. Global Sort, Local Sort and No Sort will be supported, and the scope can be specified in the SQL command:

CREATE TABLE tableWithGlobalSort (
  shortField SHORT,
  intField INT,
  bigintField LONG,
  doubleField DOUBLE,
  stringField STRING,
  timestampField TIMESTAMP,
  decimalField DECIMAL(18,2),
  dateField DATE,
  charField CHAR(5)
)
STORED BY 'carbondata'
TBLPROPERTIES('SORT_COLUMNS'='stringField', 'SORT_SCOPE'='GLOBAL_SORT')

Tips: if the sort scope is Global Sort, users should specify GLOBAL_SORT_PARTITIONS; if they do not, the number of map tasks is used. GLOBAL_SORT_PARTITIONS must be an Integer in the range [1, Integer.MaxValue] and is only used when the sort scope is Global Sort.
Global Sort: uses the orderBy operator in Spark; data is ordered at segment level.
Local Sort: node-ordered; a carbondata file is ordered if it is written by one task.
No Sort: no sorting.
Tips: keys and values are case-insensitive.

2.2 Task-02
Requirement: data load supports Local Sort, No Sort and Global Sort; the sort scope specified in load options is ignored and the parameter specified in create table is used.
Currently, users can specify the sort scope and global sort partitions in load options. After this modification, the sort scope specified in load options is ignored and the sort scope is taken from table properties.

Current logic (sort scope comes from load options):
1. isSortTable is true && sort scope is Global Sort: Global Sort (first check)
2. isSortTable is false: No Sort
3. isSortTable is true: Local Sort
Tips: isSortTable is true when the table contains sort columns or contains dimensions (except complex types), such as string columns. For example:
create table xxx1 (col1 string, col2 int) stored by 'carbondata' --- sort table
create table xx1 (col1 int, col2 int) stored by 'carbondata' --- not a sort table
create table xx (col1 int, col2 string) stored by 'carbondata' tblproperties ('sort_columns'='col1') --- sort table

New logic (sort scope comes from create table):
1. isSortTable is true && sort scope is Global Sort: Global Sort (first check)
2. isSortTable is false || sort scope is No Sort: No Sort
3. isSortTable is true && sort scope is Local Sort: Local Sort
4. isSortTable is true, without a specified sort scope: Local Sort (keeps current logic)

3 Acceptance standard
1. Users can specify the sort scope (global, local, no sort) when creating a carbon table in SQL.
2. Data load ignores the sort scope specified in load options and uses the parameter specified in the create table command. If a user still specifies the sort scope in load options, a warning informs the user that the sort scope specified in create table will be used.

4 Feature restrictions: NA
5 Dependencies: NA
6 Technical risk: NA
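The new sort-scope resolution rules proposed for CARBONDATA-1438 can be sketched as a pure function. This is an illustrative sketch only, with hypothetical names, not CarbonData's actual classes or code paths:

```scala
// Illustrative sketch of the proposed sort-scope resolution; all names are hypothetical.
sealed trait SortScope
case object GlobalSort extends SortScope
case object LocalSort extends SortScope
case object NoSort extends SortScope

// isSortTable: the table has sort columns or eligible dimensions (e.g. string columns).
// tableScope: the SORT_SCOPE value from TBLPROPERTIES, if any (case-insensitive).
def resolveSortScope(isSortTable: Boolean, tableScope: Option[String]): SortScope = {
  val scope = tableScope.map(_.toUpperCase)
  if (isSortTable && scope.contains("GLOBAL_SORT")) GlobalSort // rule 1 (first check)
  else if (!isSortTable || scope.contains("NO_SORT")) NoSort   // rule 2
  else LocalSort                                               // rules 3 and 4 (default for sort tables)
}
```

Note that rule 4 (a sort table with no SORT_SCOPE specified) falls through to Local Sort, which keeps the current behavior.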
[jira] [Created] (CARBONDATA-1438) Unify the sort column and sort scope in create table command
chenerlu created CARBONDATA-1438: Summary: Unify the sort column and sort scope in create table command Key: CARBONDATA-1438 URL: https://issues.apache.org/jira/browse/CARBONDATA-1438 Project: CarbonData Issue Type: Bug Reporter: chenerlu
[jira] [Created] (CARBONDATA-1403) Compaction log is not correct
chenerlu created CARBONDATA-1403: Summary: Compaction log is not correct Key: CARBONDATA-1403 URL: https://issues.apache.org/jira/browse/CARBONDATA-1403 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Assigned] (CARBONDATA-1376) Fix warn message when setting LOCK_TYPE to HDFSLOCK
[ https://issues.apache.org/jira/browse/CARBONDATA-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu reassigned CARBONDATA-1376: Assignee: chenerlu > Fix warn message when setting LOCK_TYPE to HDFSLOCK > --- > > Key: CARBONDATA-1376 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1376 > Project: CarbonData > Issue Type: Improvement > Components: core > Reporter: Liang Chen > Assignee: chenerlu > Priority: Minor > > scala> CarbonProperties.getInstance().addProperty(CarbonCommonConstants.LOCK_TYPE, "HDFSLOCK") > 17/08/13 20:21:38 WARN CarbonProperties: main The value "null" configured for key carbon.lock.type" is invalid. Using the default value "LOCALLOCK > res0: org.apache.carbondata.core.util.CarbonProperties = org.apache.carbondata.core.util.CarbonProperties@7730da00 > The WARN message above is not correct and needs to be optimized.
[jira] [Created] (CARBONDATA-1265) Fix AllDictionaryExample because it is only supported when single_pass is true
chenerlu created CARBONDATA-1265: Summary: Fix AllDictionaryExample because it is only supported when single_pass is true Key: CARBONDATA-1265 URL: https://issues.apache.org/jira/browse/CARBONDATA-1265 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Created] (CARBONDATA-1264) Fix AllDictionaryExample because it is only supported when single_pass is true
chenerlu created CARBONDATA-1264: Summary: Fix AllDictionaryExample because it is only supported when single_pass is true Key: CARBONDATA-1264 URL: https://issues.apache.org/jira/browse/CARBONDATA-1264 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Created] (CARBONDATA-1251) Add test cases for IUD feature
chenerlu created CARBONDATA-1251: Summary: Add test cases for IUD feature Key: CARBONDATA-1251 URL: https://issues.apache.org/jira/browse/CARBONDATA-1251 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Commented] (CARBONDATA-995) Incorrect result displays while using variance aggregate function in presto integration
[ https://issues.apache.org/jira/browse/CARBONDATA-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16068075#comment-16068075 ] chenerlu commented on CARBONDATA-995: - Hi, what is the behavior of the same operation in Hive? > Incorrect result displays while using variance aggregate function in presto > integration > --- > > Key: CARBONDATA-995 > URL: https://issues.apache.org/jira/browse/CARBONDATA-995 > Project: CarbonData > Issue Type: Bug > Components: data-query, presto-integration > Affects Versions: 1.1.0 > Environment: spark 2.1, presto 0.166 > Reporter: Vandana Yadav > Priority: Minor > Attachments: 2000_UniqData.csv > > Incorrect result displays while using the variance aggregate function in presto integration. > Steps to reproduce: > 1. In CarbonData: > a) Create table: > CREATE TABLE uniqdata (CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint, DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10), Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB"); > b) Load data: > LOAD DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'BAD_RECORDS_ACTION'='FORCE', 'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > 2. In presto: > a) Execute the query: > select variance(DECIMAL_COLUMN1) as a from (select DECIMAL_COLUMN1 from UNIQDATA order by DECIMAL_COLUMN1) t > Actual result: > In CarbonData: a = 333832.4983039884 (1 row selected, 0.695 seconds) > In presto: a = 333832.3010442859 (1 row; Query 20170420_082837_00062_hd7jy, FINISHED, 1 node; Splits: 35 total, 35 done (100.00%); 0:00 [2.01K rows, 1.97KB] [8.09K rows/s, 7.91KB/s]) > Expected result: it should display the same result as shown in CarbonData.
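One plausible source of small variance discrepancies between engines, offered here only as a hedged illustration and not a confirmed diagnosis of CARBONDATA-995, is that "variance" has two common definitions: population variance divides by n, sample variance by n - 1, and they give slightly different results on the same data:

```scala
// Hedged illustration: the two common variance definitions engines may disagree on.
def mean(xs: Seq[Double]): Double = xs.sum / xs.size

// Population variance: average squared deviation, divided by n.
def populationVariance(xs: Seq[Double]): Double = {
  val m = mean(xs)
  xs.map(x => (x - m) * (x - m)).sum / xs.size
}

// Sample variance: sum of squared deviations divided by n - 1 (Bessel's correction).
def sampleVariance(xs: Seq[Double]): Double = {
  val m = mean(xs)
  xs.map(x => (x - m) * (x - m)).sum / (xs.size - 1)
}

val data = Seq(1.0, 2.0, 3.0, 4.0, 5.0)
// populationVariance(data) == 2.0, sampleVariance(data) == 2.5
```

Comparing which definition each engine uses (and how each handles DECIMAL-to-double conversion) would be a reasonable first step in narrowing down the mismatch.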
[jira] [Created] (CARBONDATA-1227) Remove useless TableCreator
chenerlu created CARBONDATA-1227: Summary: Remove useless TableCreator Key: CARBONDATA-1227 URL: https://issues.apache.org/jira/browse/CARBONDATA-1227 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Comment Edited] (CARBONDATA-1203) insert data caused many duplicated data on spark 1.6.2
[ https://issues.apache.org/jira/browse/CARBONDATA-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16060325#comment-16060325 ] chenerlu edited comment on CARBONDATA-1203 at 6/23/17 2:36 AM: --- Hi, I encountered the same problem. The issue can be summarized as follows. Step 1: create a carbon table. cc.sql("CREATE TABLE IF NOT EXISTS t3 (id Int, name String) STORED BY 'carbondata'") Step 2: load data; t3 will then have 10 records. cc.sql("LOAD DATA LOCAL INPATH 'mypathofdata' INTO TABLE t3") Step 3: insert a constant into table t3. cc.sql("INSERT INTO TABLE t3 SELECT 1, 'jack' FROM t3") Step 4: count table t3. cc.sql("SELECT count(*) FROM t3") Actual result: t3 has 20 records. (20 = 10 + 10; the second 10 appears because t3 has 10 records. If we change t3 to a t4 with 5 records, the result is 15, so I think CarbonData handles the constant like '*'. Not sure; this should be confirmed.) Expected result: t3 should have 11 records, or a sql.AnalysisException should be thrown (the same as for a Hive table, I think). Any idea about this issue, and which solution is better? [~ravi.pesala] [~chenliang613] > insert data caused many duplicated data on spark 1.6.2 > --- > > Key: CARBONDATA-1203 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1203 > Project: CarbonData > Issue Type: Bug > Reporter: Jarck > > I used branch-1.1 to run an insert test on Spark 1.6.2 on my local machine. > I tried to run the SQL below to insert data: > spark.sql(s""" > insert into $tableName select $id,'$date','$country','$testName','$phoneType','$serialname',$salary from $tableName > """).show() > The data was inserted successfully, but many duplicated rows were inserted.
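One hedged observation on the count above: a SELECT with a FROM clause normally yields one output row per source row, even when every projected expression is a constant, which is consistent with the observed 10 + 10 = 20. This can be simulated on a plain Scala collection, no Spark required (illustrative only; it says nothing about what CarbonData should do):

```scala
// Simulate "INSERT INTO t3 SELECT 1, 'jack' FROM t3" on a 10-row stand-in table.
val t3 = (1 to 10).map(i => (i, s"name$i"))  // stand-in for the 10-row table t3
val selected = t3.map(_ => (1, "jack"))      // SELECT 1, 'jack' FROM t3: one row per source row
val afterInsert = t3 ++ selected             // after the insert: 10 + 10 = 20 rows
```

Under these semantics, getting 11 rows would require omitting the FROM clause (or an INSERT ... VALUES form) rather than selecting constants from t3.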
[jira] [Commented] (CARBONDATA-1201) don't support insert syntax "insert into table select constants" on spark 1.6.2
[ https://issues.apache.org/jira/browse/CARBONDATA-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056904#comment-16056904 ] chenerlu commented on CARBONDATA-1201: -- I remember this syntax may not be supported on Spark 1.6.2, while Spark 2.1 supports it. So first we should confirm whether this is a Spark issue. > don't support insert syntax "insert into table select constants" on spark > 1.6.2 > -- > > Key: CARBONDATA-1201 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1201 > Project: CarbonData > Issue Type: Bug > Reporter: Jarck > > I used branch-1.1 to run an insert test on Spark 1.6.2 on my local machine. > I tried to run SQL like "insert into table select constants", but it failed; it works on Spark 2.1. > Example SQL: > spark.sql(s""" > insert into $tableName select $id,'$date','$country','$testName','$phoneType','$serialname',$salary > """).show() > Error log: > FailedPredicateException(regularBody,{$s.tree.getChild(1) !=null}?) > at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:41238) > at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:40413) > at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:40283)
[jira] [Created] (CARBONDATA-1204) Update operation fail and generate extra records when test with big data
chenerlu created CARBONDATA-1204: Summary: Update operation fail and generate extra records when test with big data Key: CARBONDATA-1204 URL: https://issues.apache.org/jira/browse/CARBONDATA-1204 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: Ravindra Pesala
[jira] [Updated] (CARBONDATA-1197) Update related docs which still use incubating such as presto integration
[ https://issues.apache.org/jira/browse/CARBONDATA-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1197: - Description: Update related docs which still use incubating. Just update the reference links, file names, directory names, etc. Summary: Update related docs which still use incubating such as presto integration (was: Update related docs which still use incubating such as presto integra) > Update related docs which still use incubating such as presto integration > - > > Key: CARBONDATA-1197 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1197 > Project: CarbonData > Issue Type: Bug > Reporter: chenerlu > Assignee: chenerlu > Priority: Minor > > Update related docs which still use incubating. > Just update the reference links, file names, directory names, etc.
[jira] [Assigned] (CARBONDATA-1197) Update related docs which still use incubating such as presto integra
[ https://issues.apache.org/jira/browse/CARBONDATA-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu reassigned CARBONDATA-1197: Assignee: chenerlu > Update related docs which still use incubating such as presto integra > - > > Key: CARBONDATA-1197 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1197 > Project: CarbonData > Issue Type: Bug > Reporter: chenerlu > Assignee: chenerlu > Priority: Minor >
[jira] [Created] (CARBONDATA-1197) Update related docs which still use incubating such as presto integra
chenerlu created CARBONDATA-1197: Summary: Update related docs which still use incubating such as presto integra Key: CARBONDATA-1197 URL: https://issues.apache.org/jira/browse/CARBONDATA-1197 Project: CarbonData Issue Type: Bug Reporter: chenerlu Priority: Minor
[jira] [Commented] (CARBONDATA-1180) loading data failed for dictionary file id is locked for updation
[ https://issues.apache.org/jira/browse/CARBONDATA-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053607#comment-16053607 ] chenerlu commented on CARBONDATA-1180: -- Does this always happen? Could you please remove the carbondata_test related metafiles and retry? > loading data failed for dictionary file id is locked for updation > --- > > Key: CARBONDATA-1180 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1180 > Project: CarbonData > Issue Type: Bug > Components: data-load > Affects Versions: 1.2.0 > Reporter: Liu Shaohui > > Using Spark 2.1 in yarn-client mode, querying from beeline against the Spark SQL thriftserver: > {code} > CREATE TABLE IF NOT EXISTS carbondata_test(id string, name string, city string, age Int) STORED BY 'carbondata'; > LOAD DATA INPATH 'hdfs:///user/sample-data/sample.csv' INTO TABLE carbondata_test; > {code} > Data load fails with the following exception: > {code} > java.lang.RuntimeException: Dictionary file id is locked for updation. Please try after some time > at scala.sys.package$.error(package.scala:27) > at org.apache.carbondata.spark.rdd.CarbonGlobalDictionaryGenerateRDD$$anon$1.(CarbonGlobalDictionaryRDD.scala:407) > at org.apache.carbondata.spark.rdd.CarbonGlobalDictionaryGenerateRDD.compute(CarbonGlobalDictionaryRDD.scala:345) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:99) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282) > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {code} > 1.2.0 contains the fix in CARBONDATA-614. > Any suggestion about this problem? Thanks~
[jira] [Created] (CARBONDATA-1191) Remove carbon-spark-shell script
chenerlu created CARBONDATA-1191: Summary: Remove carbon-spark-shell script Key: CARBONDATA-1191 URL: https://issues.apache.org/jira/browse/CARBONDATA-1191 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1183) Update CarbonPartitionTable because partition columns should not be specified in the schema
[ https://issues.apache.org/jira/browse/CARBONDATA-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1183: - Summary: Update CarbonPartitionTable because partition columns should not be specified in the schema (was: Update CarbonPartitionTable Because partition columns should not be specified in the schema) > Update CarbonPartitionTable because partition columns should not be specified > in the schema > --- > > Key: CARBONDATA-1183 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1183 > Project: CarbonData > Issue Type: Bug >Reporter: chenerlu >Assignee: chenerlu >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1183) Update CarbonPartitionTable Because partition columns should not be specified in the schema
chenerlu created CARBONDATA-1183: Summary: Update CarbonPartitionTable Because partition columns should not be specified in the schema Key: CARBONDATA-1183 URL: https://issues.apache.org/jira/browse/CARBONDATA-1183 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1149) Fix issue of mismatch type of partition column when specify partition info and range info overlapping values issue
[ https://issues.apache.org/jira/browse/CARBONDATA-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1149: - Summary: Fix issue of mismatch type of partition column when specify partition info and range info overlapping values issue (was: Fix issue of mismatch type of partition column when specify partition info) > Fix issue of mismatch type of partition column when specify partition info > and range info overlapping values issue > -- > > Key: CARBONDATA-1149 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1149 > Project: CarbonData > Issue Type: Bug >Reporter: chenerlu >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1151) Update useful-tips-on-carbondata.md
chenerlu created CARBONDATA-1151: Summary: Update useful-tips-on-carbondata.md Key: CARBONDATA-1151 URL: https://issues.apache.org/jira/browse/CARBONDATA-1151 Project: CarbonData Issue Type: Bug Reporter: chenerlu Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (CARBONDATA-1149) Fix issue of mismatch type of partition column when specify partition info
chenerlu created CARBONDATA-1149: Summary: Fix issue of mismatch type of partition column when specify partition info Key: CARBONDATA-1149 URL: https://issues.apache.org/jira/browse/CARBONDATA-1149 Project: CarbonData Issue Type: Bug Reporter: chenerlu Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (CARBONDATA-1134) Generate redundant folders under integration model when run test cases with mvn command in spark1.6
chenerlu created CARBONDATA-1134: Summary: Generate redundant folders under integration model when run test cases with mvn command in spark1.6 Key: CARBONDATA-1134 URL: https://issues.apache.org/jira/browse/CARBONDATA-1134 Project: CarbonData Issue Type: Bug Reporter: chenerlu Priority: Minor When running mvn -Pspark-1.6 -Dspark.version=1.6.3 clean package, it generates redundant folders under the integration module. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CARBONDATA-1115) load csv data fail
[ https://issues.apache.org/jira/browse/CARBONDATA-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034242#comment-16034242 ] chenerlu commented on CARBONDATA-1115: -- Hi, make sure you specify the right carbon store path and that your sample.csv has a column header in the data file. > load csv data fail > -- > > Key: CARBONDATA-1115 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1115 > Project: CarbonData > Issue Type: Bug > Components: examples >Affects Versions: 1.2.0 > Environment: centos 7, spark2.1.0, hadoop 2.7 >Reporter: hyd > Fix For: 1.2.0 > > > Is it a bug, or does my environment have a problem? Can anyone help me? > [root@localhost spark-2.1.0-bin-hadoop2.7]# ls /home/carbondata/sample.csv > /home/carbondata/sample.csv > [root@localhost spark-2.1.0-bin-hadoop2.7]# ./bin/spark-shell --master > spark://192.168.32.114:7077 --total-executor-cores 2 --executor-memory 2G > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > Setting default log level to "WARN". > To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use > setLogLevel(newLevel). > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/opt/spark-2.1.0-bin-hadoop2.7/carbonlib/carbondata_2.11-1.1.0-shade-hadoop2.2.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/opt/spark-2.1.0-bin-hadoop2.7/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > 17/06/01 14:44:54 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... using builtin-java classes where applicable > 17/06/01 14:44:54 WARN SparkConf: > SPARK_CLASSPATH was detected (set to './carbonlib/*'). > This is deprecated in Spark 1.0+. 
> Please instead use: > - ./spark-submit with --driver-class-path to augment the driver classpath > - spark.executor.extraClassPath to augment the executor classpath > > 17/06/01 14:44:54 WARN SparkConf: Setting 'spark.executor.extraClassPath' to > './carbonlib/*' as a work-around. > 17/06/01 14:44:54 WARN SparkConf: Setting 'spark.driver.extraClassPath' to > './carbonlib/*' as a work-around. > 17/06/01 14:44:54 WARN Utils: Your hostname, localhost.localdomain resolves > to a loopback address: 127.0.0.1; using 192.168.32.114 instead (on interface > em1) > 17/06/01 14:44:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to > another address > 17/06/01 14:44:59 WARN ObjectStore: Failed to get database global_temp, > returning NoSuchObjectException > Spark context Web UI available at http://192.168.32.114:4040 > Spark context available as 'sc' (master = spark://192.168.32.114:7077, app id > = app-20170601144454-0001). > Spark session available as 'spark'. > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/___/ .__/\_,_/_/ /_/\_\ version 2.1.0 > /_/ > > Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121) > Type in expressions to have them evaluated. > Type :help for more information. > scala> import org.apache.spark.sql.SparkSession > import org.apache.spark.sql.SparkSession > scala> import org.apache.spark.sql.CarbonSession._ > import org.apache.spark.sql.CarbonSession._ > scala> val carbon = > SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://192.168.32.114/test") > 17/06/01 14:45:35 WARN SparkContext: Using an existing SparkContext; some > configuration may not take effect. 
> 17/06/01 14:45:38 WARN ObjectStore: Failed to get database global_temp, > returning NoSuchObjectException > carbon: org.apache.spark.sql.SparkSession = > org.apache.spark.sql.CarbonSession@2165b170 > scala> carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string, name > string, city string, age Int) STORED BY 'carbondata'") > 17/06/01 14:45:45 AUDIT CreateTable: > [localhost.localdomain][root][Thread-1]Creating Table with Database name > [default] and Table name [test_table] > res0: org.apache.spark.sql.DataFrame = [] > scala> carbon.sql("LOAD DATA LOCAL INPATH '/home/carbondata/sample.csv' INTO > TABLE test_table") > 17/06/01 14:45:54 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, > 192.168.32.114, executor 0): java.lang.ClassCastException: cannot assign > instance of scala.collection.immutable.List$SerializationProxy to field > org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type > scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD > at > java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133) > at > java.io.Ob
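The advice in the comment above is that the first line of sample.csv must be a column header matching the table schema. A minimal sketch of producing such a file follows; the path and data rows are made-up examples, not values from the issue:

```python
# Write a sample.csv whose first line is the column header row,
# matching the schema of test_table(id, name, city, age).
# The path and the two data rows are hypothetical.
import csv
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "sample.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "name", "city", "age"])  # header row the loader expects
    writer.writerow(["1", "david", "shenzhen", "31"])
    writer.writerow(["2", "eason", "shenzhen", "27"])

with open(path) as f:
    header = f.readline().strip()
print(header)  # id,name,city,age
```

A file like this could then be pointed at by the LOAD DATA LOCAL INPATH statement from the issue.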
[jira] [Commented] (CARBONDATA-1116) Not able to connect with Carbonsession while starting carbon spark shell and beeline
[ https://issues.apache.org/jira/browse/CARBONDATA-1116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032668#comment-16032668 ] chenerlu commented on CARBONDATA-1116: -- Hi, I met the same issue when I ran CarbonSessionExample on the latest master branch. This issue may be caused by creating a new SparkSqlParser with null as its parameter. Please help check whether it is the same problem. [~ravi.pesala] Thanks > Not able to connect with Carbonsession while starting carbon spark shell and > beeline > > > Key: CARBONDATA-1116 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1116 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.2.0 > Environment: spark 2.1 >Reporter: Vandana Yadav >Priority: Blocker > > Not able to connect with Carbonsession while starting carbon spark shell and > beeline > Steps to reproduce: > 1)Start thrift-server > a) cd $SPARK-HOME/bin > b) ./spark-submit --conf spark.sql.hive.thriftServer.singleSession=true > --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer > /opt/spark/spark-2.1/carbonlib/carbondata_2.11-1.1.0-SNAPSHOT-shade-hadoop2.7.3.jar > hdfs://localhost:54310/opt/prestocarbonStore > 2)Start Beeline > a) cd $SPARK-HOME/bin > b)./beeline > 3) Connect with carbondata via jdbc > !connect jdbc:hive2://localhost:1 > Enter username for jdbc:hive2://localhost:1: hduser > Enter password for jdbc:hive2://localhost:1: ** > 4) Actual Result: > Error: Could not establish connection to jdbc:hive2://localhost:1: null > (state=08S01,code=0) > 0: jdbc:hive2://localhost:1 (closed)> > 5) Expected result : it should connect successfully with carbondata > 6)console logs: > 17/06/01 13:03:27 INFO ThriftCLIService: Client protocol version: > HIVE_CLI_SERVICE_PROTOCOL_V8 > 17/06/01 13:03:27 INFO SessionState: Created local directory: > /tmp/addaba65-46c5-4467-a02f-2bbdfd54329a_resources > 17/06/01 13:03:27 INFO SessionState: Created HDFS directory: > 
/tmp/hive/hduser/addaba65-46c5-4467-a02f-2bbdfd54329a > 17/06/01 13:03:27 INFO SessionState: Created local directory: > /tmp/hduser/addaba65-46c5-4467-a02f-2bbdfd54329a > 17/06/01 13:03:27 INFO SessionState: Created HDFS directory: > /tmp/hive/hduser/addaba65-46c5-4467-a02f-2bbdfd54329a/_tmp_space.db > 17/06/01 13:03:27 INFO HiveSessionImpl: Operation log session directory is > created: /tmp/hduser/operation_logs/addaba65-46c5-4467-a02f-2bbdfd54329a > 17/06/01 13:03:27 INFO CarbonSparkSqlParser: Parsing command: use default > Exception in thread "HiveServer2-Handler-Pool: Thread-84" > java.lang.ExceptionInInitializerError > at > org.apache.spark.sql.hive.CarbonSessionState$$anon$1.(CarbonSessionState.scala:133) > at > org.apache.spark.sql.hive.CarbonSessionState.analyzer$lzycompute(CarbonSessionState.scala:128) > at > org.apache.spark.sql.hive.CarbonSessionState.analyzer(CarbonSessionState.scala:127) > at > org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:48) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:63) > at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592) > at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLSessionManager.openSession(SparkSQLSessionManager.scala:83) > at > org.apache.hive.service.cli.CLIService.openSessionWithImpersonation(CLIService.java:202) > at > org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:351) > at > org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:246) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1253) > at > org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1238) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at > org.apache.spark.sql.hive.CarbonIUDAnalysisRule$.(CarbonAnalysisRules.scala:90) > at > org.apache.spark.sql.hive.CarbonIUDAnalysisRule$.(CarbonAnalysisRules.scala) > ... 20 more
[jira] [Commented] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
[ https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16019445#comment-16019445 ] chenerlu commented on CARBONDATA-1076: -- Yes, I have reproduced this problem with a csv file. Data in the csv file is as follows: col1,col2,col3 1,2,3 4,5,6 7,8,9 > Join Issue caused by dictionary and shuffle exchange > > > Key: CARBONDATA-1076 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1076 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 0.1.1-incubating, 1.1.0 > Environment: Carbon + spark 2.1 >Reporter: chenerlu >Assignee: Ravindra Pesala > > We can reproduce this issue with the following steps: > Step1: create a carbon table > > carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 > int) STORED by 'carbondata' > TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')") > > Step2: load data > carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE > carbon_table") > data in file carbon_table as follows: > col1,col2,col3 > 1,2,3 > 4,5,6 > 7,8,9 > > Step3: do the query > carbon.sql("SELECT c1.col1,c2.col1,c2.col3 FROM (SELECT col1,col2 FROM > carbon_table GROUP BY col1,col2) c1 FULL JOIN (SELECT col1,count(col2) as > col3 FROM carbon_table GROUP BY col1) c2 ON c1.col1 = c2.col1").show() > [expected] Hive table and parquet table get the same result as below, which > should be correct. > |col1|col1|col3| > | 1| 1| 1| > | 4| 4| 1| > | 7| 7| 1| > [actually] carbon will get null because of a wrong match. > |col1|col1|col3| > | 1|null|null| > |null| 4| 1| > | 4|null|null| > |null| 7| 1| > | 7|null|null| > |null| 1| 1| > Root cause analysis: > > It is because this query has two subqueries, and one subquery does the decode > after the exchange and the other subquery does the decode before the exchange, and this > may lead to a wrong match when executing the full join. > > My idea: Can we move the decode before the exchange? 
Because I am not very familiar > with Carbon query, so any idea about this ? > Plan as follows: > > == Physical Plan == > SortMergeJoin [col1#3445], [col1#3460], FullOuter > :- Sort [col1#3445 ASC NULLS FIRST], false, 0 > : +- Exchange hashpartitioning(col1#3445, 200) > : +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> > col1#3445, col2#3446 -> col2#3446, col3#3447 -> > col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table > name :carbon_table, Schema > :Some(StructType(StructField(col1,IntegerType,true), > StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), > CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, > col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name > :tempdev, Table name :carbon_table, Schema > :Some(StructType(StructField(col1,IntegerType,true), > StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], > IncludeProfile(ArrayBuffer(col1#3445)), CarbonAliasDecoderRelation(), > org.apache.spark.sql.CarbonSession@69e87cbe > :+- HashAggregate(keys=[col1#3445, col2#3446], functions=[], > output=[col1#3445]) > : +- Exchange hashpartitioning(col1#3445, col2#3446, 200) > : +- HashAggregate(keys=[col1#3445, col2#3446], functions=[], > output=[col1#3445, col2#3446]) > : +- Scan CarbonDatasourceHadoopRelation [ Database name > :tempdev, Table name :carbon_table, Schema > :Some(StructType(StructField(col1,IntegerType,true), > StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ] > tempdev.carbon_table[col1#3445,col2#3446] > +- Sort [col1#3460 ASC NULLS FIRST], false, 0 >+- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> > col1#3445, col2#3446 -> col2#3446, col3#3447 -> > col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table > name :carbon_table, Schema > :Some(StructType(StructField(col1,IntegerType,true), > StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) 
]), > CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, > col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name > :tempdev, Table name :carbon_table, Schema > :Some(StructType(StructField(col1,IntegerType,true), > StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], > IncludeProfile(ArrayBuffer(col1#3460)), CarbonAliasDecoderRelation(), > org.apache.spark.sql.CarbonSession@69e87cbe > +- HashAggregate(keys=[col1#3460], functions=[count(col2#3461)], > output=[col1#3460, col3#3436L]) > +- Exchange hashpartitioning(col1#3460, 200) > +- HashAggregate(keys=[col1#3460], > functions=[partial_count(col
[jira] [Updated] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
[ https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1076: - Description: We can reproduce this issue with the following steps: Step1: create a carbon table carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED by 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')") Step2: load data carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table") data in file carbon_table as follows: col1,col2,col3 1,2,3 4,5,6 7,8,9 Step3: do the query carbon.sql("SELECT c1.col1,c2.col1,c2.col3 FROM (SELECT col1,col2 FROM carbon_table GROUP BY col1,col2) c1 FULL JOIN (SELECT col1,count(col2) as col3 FROM carbon_table GROUP BY col1) c2 ON c1.col1 = c2.col1").show() [expected] Hive table and parquet table get the same result as below, which should be correct. |col1|col1|col3| | 1| 1| 1| | 4| 4| 1| | 7| 7| 1| [actually] carbon will get null because of a wrong match. |col1|col1|col3| | 1|null|null| |null| 4| 1| | 4|null|null| |null| 7| 1| | 7|null|null| |null| 1| 1| Root cause analysis: It is because this query has two subqueries, and one subquery does the decode after the exchange and the other subquery does the decode before the exchange, and this may lead to a wrong match when executing the full join. My idea: Can we move the decode before the exchange? Because I am not very familiar with the Carbon query code, any ideas about this? 
Plan as follows: == Physical Plan == SortMergeJoin [col1#3445], [col1#3460], FullOuter :- Sort [col1#3445 ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(col1#3445, 200) : +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(col1#3445)), CarbonAliasDecoderRelation(), org.apache.spark.sql.CarbonSession@69e87cbe :+- HashAggregate(keys=[col1#3445, col2#3446], functions=[], output=[col1#3445]) : +- Exchange hashpartitioning(col1#3445, col2#3446, 200) : +- HashAggregate(keys=[col1#3445, col2#3446], functions=[], output=[col1#3445, col2#3446]) : +- Scan CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ] tempdev.carbon_table[col1#3445,col2#3446] +- Sort [col1#3460 ASC NULLS FIRST], false, 0 +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name 
:tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(col1#3460)), CarbonAliasDecoderRelation(), org.apache.spark.sql.CarbonSession@69e87cbe +- HashAggregate(keys=[col1#3460], functions=[count(col2#3461)], output=[col1#3460, col3#3436L]) +- Exchange hashpartitioning(col1#3460, 200) +- HashAggregate(keys=[col1#3460], functions=[partial_count(col2#3461)], output=[col1#3460, count#3472L]) +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(col2#3461)), CarbonAliasDecoderRelation(), org.apa
[jira] [Updated] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
[ https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1076: - Description: We can reproduce this issue with the following steps: Step1: create a carbon table carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED by 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')") Step2: load data carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table") data in file carbon_table as follows: col1,col2,col3 1,2,3 4,5,6 7,8,9 you can get the carbon_table file in the attachment. Step3: do the query carbon.sql("SELECT c1.col1,c2.col1,c2.col3 FROM (SELECT col1,col2 FROM carbon_table GROUP BY col1,col2) c1 FULL JOIN (SELECT col1,count(col2) as col3 FROM carbon_table GROUP BY col1) c2 ON c1.col1 = c2.col1").show() [expected] Hive table and parquet table get the same result as below, which should be correct. |col1|col1|col3| | 1| 1| 1| | 4| 4| 1| | 7| 7| 1| [actually] carbon will get null because of a wrong match. |col1|col1|col3| | 1|null|null| |null| 4| 1| | 4|null|null| |null| 7| 1| | 7|null|null| |null| 1| 1| Root cause analysis: It is because this query has two subqueries, and one subquery does the decode after the exchange and the other subquery does the decode before the exchange, and this may lead to a wrong match when executing the full join. My idea: Can we move the decode before the exchange? Because I am not very familiar with the Carbon query code, any ideas about this? 
Plan as follows: == Physical Plan == SortMergeJoin [col1#3445], [col1#3460], FullOuter :- Sort [col1#3445 ASC NULLS FIRST], false, 0 : +- Exchange hashpartitioning(col1#3445, 200) : +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(col1#3445)), CarbonAliasDecoderRelation(), org.apache.spark.sql.CarbonSession@69e87cbe :+- HashAggregate(keys=[col1#3445, col2#3446], functions=[], output=[col1#3445]) : +- Exchange hashpartitioning(col1#3445, col2#3446, 200) : +- HashAggregate(keys=[col1#3445, col2#3446], functions=[], output=[col1#3445, col2#3446]) : +- Scan CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ] tempdev.carbon_table[col1#3445,col2#3446] +- Sort [col1#3460 ASC NULLS FIRST], false, 0 +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name 
:tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(col1#3460)), CarbonAliasDecoderRelation(), org.apache.spark.sql.CarbonSession@69e87cbe +- HashAggregate(keys=[col1#3460], functions=[count(col2#3461)], output=[col1#3460, col3#3436L]) +- Exchange hashpartitioning(col1#3460, 200) +- HashAggregate(keys=[col1#3460], functions=[partial_count(col2#3461)], output=[col1#3460, count#3472L]) +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(c
[jira] [Updated] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
[ https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1076: - Description: We can reproduce this issue with the following steps: Step1: create a carbon table carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED by 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')") Step2: load data carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table") data in file carbon_table as follows: col1,col2,col3 1,2,3 4,5,6 7,8,9 Step3: do the query carbon.sql("SELECT c1.col1,c2.col1,c2.col3 FROM (SELECT col1,col2 FROM carbon_table GROUP BY col1,col2) c1 FULL JOIN (SELECT col1,count(col2) as col3 FROM carbon_table GROUP BY col1) c2 ON c1.col1 = c2.col1").show() [expected] Hive table and parquet table get the same result as below, which should be correct. |col1|col1|col3| | 1| 1| 1| | 4| 4| 1| | 7| 7| 1| [actually] carbon will get null because of a wrong match. |col1|col1|col3| | 1|null|null| |null| 4| 1| | 4|null|null| |null| 7| 1| | 7|null|null| |null| 1| 1| Root cause analysis: It is because this query has two subqueries, and one subquery does the decode after the exchange and the other subquery does the decode before the exchange, and this may lead to a wrong match when executing the full join. My idea: Can we move the decode before the exchange? Because I am not very familiar with the Carbon query code, any ideas about this? 
Plan as follows:

== Physical Plan ==
SortMergeJoin [col1#3445], [col1#3460], FullOuter
:- Sort [col1#3445 ASC NULLS FIRST], false, 0
:  +- Exchange hashpartitioning(col1#3445, 200)
:     +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(col1#3445)), CarbonAliasDecoderRelation(), org.apache.spark.sql.CarbonSession@69e87cbe
:        +- HashAggregate(keys=[col1#3445, col2#3446], functions=[], output=[col1#3445])
:           +- Exchange hashpartitioning(col1#3445, col2#3446, 200)
:              +- HashAggregate(keys=[col1#3445, col2#3446], functions=[], output=[col1#3445, col2#3446])
:                 +- Scan CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ] tempdev.carbon_table[col1#3445,col2#3446]
+- Sort [col1#3460 ASC NULLS FIRST], false, 0
   +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(col1#3460)), CarbonAliasDecoderRelation(), org.apache.spark.sql.CarbonSession@69e87cbe
      +- HashAggregate(keys=[col1#3460], functions=[count(col2#3461)], output=[col1#3460, col3#3436L])
         +- Exchange hashpartitioning(col1#3460, 200)
            +- HashAggregate(keys=[col1#3460], functions=[partial_count(col2#3461)], output=[col1#3460, count#3472L])
               +- CarbonDictionaryDecoder [CarbonDecoderRelation(Map(col1#3445 -> col1#3445, col2#3446 -> col2#3446, col3#3447 -> col3#3447),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ]), CarbonDecoderRelation(Map(col1#3460 -> col1#3460, col2#3461 -> col2#3461, col3#3462 -> col3#3462),CarbonDatasourceHadoopRelation [ Database name :tempdev, Table name :carbon_table, Schema :Some(StructType(StructField(col1,IntegerType,true), StructField(col2,IntegerType,true), StructField(col3,IntegerType,true))) ])], IncludeProfile(ArrayBuffer(col2#3461)), CarbonAliasDecoderRelation(), org.
[jira] [Updated] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
[ https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1076: - Affects Version/s: 0.1.1-incubating 1.1.0 Request participants: (was: ) Component/s: core
> Join Issue caused by dictionary and shuffle exchange
>
> Key: CARBONDATA-1076
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1076
> Project: CarbonData
> Issue Type: Bug
> Components: core
> Affects Versions: 0.1.1-incubating, 1.1.0
> Environment: Carbon + Spark 2.1
> Reporter: chenerlu
> Assignee: Ravindra Pesala
>
> We can reproduce this issue with the following steps:
> Step 1: create a Carbon table
> carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED BY 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')")
> Step 2: load data
> carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table")
> (the carbon_table file is in the attachment)
> Step 3: run the query
> [expected] Hive and Parquet tables return the same result, shown below, which is correct:
> |col1|col1|col3|
> |   1|null|null|
> |null|   4|   1|
> |   4|null|null|
> |null|   7|   1|
> |   7|null|null|
> |null|   1|   1|
> [actual] Carbon returns wrongly matched rows:
> |col1|col1|col3|
> |   1|   1|   1|
> |   4|   4|   1|
> |   7|   7|   1|
> Root cause analysis:
> The query contains two subqueries; one subquery performs the dictionary decode after the exchange while the other performs it before the exchange, which can lead to wrong matches when executing the full outer join.
> My idea: can we move the decode before the exchange? I am not very familiar with the Carbon query path, so any ideas are welcome.
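The root cause described above (one subquery decodes dictionary values before the exchange, the other only after) can be illustrated with a small sketch. This is toy Python, not CarbonData code; the dictionary contents and key values are made up for illustration:

```python
def full_outer_join(left, right):
    """Full outer join of two key lists; unmatched rows get None on the missing side."""
    rows = []
    right_matched = set()
    for l in left:
        hit = False
        for i, r in enumerate(right):
            if l == r:
                rows.append((l, r))
                right_matched.add(i)
                hit = True
        if not hit:
            rows.append((l, None))
    for i, r in enumerate(right):
        if i not in right_matched:
            rows.append((None, r))
    return rows

# Hypothetical dictionary: surrogate id -> actual column value.
dictionary = {1: 4, 2: 7, 3: 1}

def decode(ids):
    return [dictionary[i] for i in ids]

encoded_keys = [1, 2, 3]       # both subqueries scan the same encoded keys

left = decode(encoded_keys)    # this side decodes BEFORE the join: [4, 7, 1]
right = encoded_keys           # this side would decode only AFTER the join

# Mixing decoded values with raw ids: value 1 "matches" id 1, which really
# encodes value 4 -- a spurious match, plus spurious non-matches elsewhere.
wrong = full_outer_join(left, right)

# Decoding both inputs first makes the keys comparable again.
correct = full_outer_join(left, decode(encoded_keys))
```

Decoding both join inputs on the same side of the exchange (or neither) keeps the keys comparable; the mismatched CarbonDictionaryDecoder placement in the plan above mixes surrogate ids with decoded values, which is how the wrong full-join matches arise.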
[jira] [Updated] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
[ https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1076: - Description:
We can reproduce this issue with the following steps:
Step 1: create a Carbon table
carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED BY 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')")
Step 2: load data
carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table")
(the carbon_table file is in the attachment)
Step 3: run the query
[expected] Hive and Parquet tables return the same result, shown below, which is correct:
|col1|col1|col3|
|   1|null|null|
|null|   4|   1|
|   4|null|null|
|null|   7|   1|
|   7|null|null|
|null|   1|   1|
[actual] Carbon returns wrongly matched rows:
|col1|col1|col3|
|   1|   1|   1|
|   4|   4|   1|
|   7|   7|   1|
Root cause analysis:
The query contains two subqueries; one subquery performs the dictionary decode after the exchange while the other performs it before the exchange, which can lead to wrong matches when executing the full outer join.
My idea: can we move the decode before the exchange? I am not very familiar with the Carbon query path, so any ideas are welcome.
[jira] [Updated] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
[ https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1076: - Description:
We can reproduce this issue with the following steps:
Step 1: create a Carbon table
carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED BY 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')")
Step 2: load data
carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table")
(the carbon_table file is in the attachment)
Step 3: run the query
[expected] Hive and Parquet tables return the same result, shown below, which is correct:
+----+----+----+
|col1|col1|col3|
+----+----+----+
|   1|null|null|
|null|   4|   1|
|   4|null|null|
|null|   7|   1|
|   7|null|null|
|null|   1|   1|
+----+----+----+
[actual] Carbon returns wrongly matched rows:
+----+----+----+
|col1|col1|col3|
+----+----+----+
|   1|   1|   1|
|   4|   4|   1|
|   7|   7|   1|
+----+----+----+
Root cause analysis:
The query contains two subqueries; one subquery performs the dictionary decode after the exchange while the other performs it before the exchange, which can lead to wrong matches when executing the full outer join.
My idea: can we move the decode before the exchange? I am not very familiar with the Carbon query path, so any ideas are welcome.
[jira] [Updated] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
[ https://issues.apache.org/jira/browse/CARBONDATA-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu updated CARBONDATA-1076: - Description:
We can reproduce this issue with the following steps:
Step 1: create a Carbon table
carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED BY 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')")
Step 2: load data
carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table")
(the carbon_table file is in the attachment)
Step 3: run the query
[expected] Hive and Parquet tables return the same result, which is correct.
[actual] Carbon returns wrongly matched rows.
Root cause analysis:
The query contains two subqueries; one subquery performs the dictionary decode after the exchange while the other performs it before the exchange, which can lead to wrong matches when executing the full outer join.
My idea: can we move the decode before the exchange? I am not very familiar with the Carbon query path, so any ideas are welcome.
> Join Issue caused by dictionary and shuffle exchange
>
> Key: CARBONDATA-1076
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1076
> Project: CarbonData
> Issue Type: Bug
> Environment: Carbon + Spark 2.1
> Reporter: chenerlu
> Assignee: Ravindra Pesala
>
> We can reproduce this issue with the following steps:
> Step 1: create a Carbon table
> carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED BY 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')")
> Step 2: load data
> carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table")
> (the carbon_table file is in the attachment)
> Step 3: run the query
> [expected] Hive and Parquet tables return the same result, which is correct.
> [actual] Carbon returns wrongly matched rows.
> Root cause analysis:
> The query contains two subqueries; one subquery performs the dictionary decode after the exchange while the other performs it before the exchange, which can lead to wrong matches when executing the full outer join.
> My idea: can we move the decode before the exchange? I am not very familiar with the Carbon query path, so any ideas are welcome.
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (CARBONDATA-1076) Join Issue caused by dictionary and shuffle exchange
chenerlu created CARBONDATA-1076: Summary: Join Issue caused by dictionary and shuffle exchange Key: CARBONDATA-1076 URL: https://issues.apache.org/jira/browse/CARBONDATA-1076 Project: CarbonData Issue Type: Bug Environment: Carbon + Spark 2.1 Reporter: chenerlu Assignee: Ravindra Pesala
We can reproduce this issue with the following steps:
Step 1: create a Carbon table
carbon.sql("CREATE TABLE IF NOT EXISTS carbon_table (col1 int, col2 int, col3 int) STORED BY 'carbondata' TBLPROPERTIES('DICTIONARY_INCLUDE'='col1,col2,col3','TABLE_BLOCKSIZE'='4')")
Step 2: load data
carbon.sql("LOAD DATA LOCAL INPATH '/opt/carbon_table' INTO TABLE carbon_table")
(the carbon_table file is in the attachment)
Step 3: run the query
[expected] Hive and Parquet tables return the same result, which is correct.
[actual] Carbon returns wrongly matched rows.
Root cause analysis:
The query contains two subqueries; one subquery performs the dictionary decode after the exchange while the other performs it before the exchange, which can lead to wrong matches when executing the full outer join.
My idea: can we move the decode before the exchange? I am not very familiar with the Carbon query path, so any ideas are welcome.
[jira] [Created] (CARBONDATA-1040) Add description of carbon not support update table and delete records in spark2.1
chenerlu created CARBONDATA-1040: Summary: Add description of carbon not support update table and delete records in spark2.1 Key: CARBONDATA-1040 URL: https://issues.apache.org/jira/browse/CARBONDATA-1040 Project: CarbonData Issue Type: Improvement Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Created] (CARBONDATA-1021) Update compact for code style and unne
chenerlu created CARBONDATA-1021: Summary: Update compact for code style and unne Key: CARBONDATA-1021 URL: https://issues.apache.org/jira/browse/CARBONDATA-1021 Project: CarbonData Issue Type: Improvement Reporter: chenerlu
[jira] [Assigned] (CARBONDATA-1021) Update compact for code style and unnecessary operation
[ https://issues.apache.org/jira/browse/CARBONDATA-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu reassigned CARBONDATA-1021: Assignee: chenerlu Request participants: (was: ) Priority: Minor (was: Major) Summary: Update compact for code style and unnecessary operation (was: Update compact for code style and unne)
> Update compact for code style and unnecessary operation
> Key: CARBONDATA-1021
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1021
> Project: CarbonData
> Issue Type: Improvement
> Reporter: chenerlu
> Assignee: chenerlu
> Priority: Minor
[jira] [Created] (CARBONDATA-987) Can not delete lock file when drop table
chenerlu created CARBONDATA-987: Summary: Can not delete lock file when drop table Key: CARBONDATA-987 URL: https://issues.apache.org/jira/browse/CARBONDATA-987 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Created] (CARBONDATA-986) Add alter table example
chenerlu created CARBONDATA-986: Summary: Add alter table example Key: CARBONDATA-986 URL: https://issues.apache.org/jira/browse/CARBONDATA-986 Project: CarbonData Issue Type: Bug Reporter: chenerlu Priority: Minor
[jira] [Created] (CARBONDATA-964) How Carbon will behave when execute insert operation in abnormal scenarios?
chenerlu created CARBONDATA-964: Summary: How Carbon will behave when execute insert operation in abnormal scenarios? Key: CARBONDATA-964 URL: https://issues.apache.org/jira/browse/CARBONDATA-964 Project: CarbonData Issue Type: Bug Reporter: chenerlu Assignee: chenerlu Priority: Minor
[jira] [Resolved] (CARBONDATA-954) The driverExecutorCacheConfTest failed because of interaction between testcases in CacheProviderTest
[ https://issues.apache.org/jira/browse/CARBONDATA-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu resolved CARBONDATA-954. - Resolution: Resolved
> The driverExecutorCacheConfTest failed because of interaction between testcases in CacheProviderTest
>
> Key: CARBONDATA-954
> URL: https://issues.apache.org/jira/browse/CARBONDATA-954
> Project: CarbonData
> Issue Type: Bug
> Reporter: chenerlu
> Assignee: chenerlu
> Priority: Minor
>
> Problem: driverExecutorCacheConfTest fails when all test cases in CacheProviderTest are run together, but passes when run on its own.
> Solution: driverExecutorCacheConfTest fails after the second test case (createCache) runs, because CacheProvider.getInstance() returns the singleton instance that still holds the caches created in createCache. Suggest dropping all caches after the assertions in createCache.
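The interaction described above is the usual singleton-state pitfall: a process-wide provider keeps caches created by one test case alive in the next. A minimal Python sketch of the pattern and the suggested fix; the class and method names only mirror, and do not reproduce, the actual CarbonData code:

```python
class CacheProvider:
    """Toy process-wide singleton, analogous to CacheProvider.getInstance()."""
    _instance = None

    def __init__(self):
        self.caches = {}

    @classmethod
    def get_instance(cls):
        if cls._instance is None:
            cls._instance = CacheProvider()
        return cls._instance

    def create_cache(self, name):
        self.caches.setdefault(name, {})

    def drop_all_caches(self):
        self.caches.clear()

# Test case 1 ("createCache") creates a cache on the singleton...
CacheProvider.get_instance().create_cache("driver")

# ...so test case 2 ("driverExecutorCacheConfTest") observes leftover state.
leaked = len(CacheProvider.get_instance().caches)   # non-zero: state leaked

# Suggested fix: drop all caches after the assertions in the first test case,
# so each test starts from an empty provider.
CacheProvider.get_instance().drop_all_caches()
clean = len(CacheProvider.get_instance().caches)
```

Running the suite in a different order (or the second test alone) hides the bug, which is exactly why the test passed in isolation but failed in the full CacheProviderTest run.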
[jira] [Reopened] (CARBONDATA-954) The driverExecutorCacheConfTest failed because of interaction between testcases in CacheProviderTest
[ https://issues.apache.org/jira/browse/CARBONDATA-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu reopened CARBONDATA-954: -
> The driverExecutorCacheConfTest failed because of interaction between testcases in CacheProviderTest
>
> Key: CARBONDATA-954
> URL: https://issues.apache.org/jira/browse/CARBONDATA-954
> Project: CarbonData
> Issue Type: Bug
> Reporter: chenerlu
> Assignee: chenerlu
> Priority: Minor
[jira] [Resolved] (CARBONDATA-954) The driverExecutorCacheConfTest failed because of interaction between testcases in CacheProviderTest
[ https://issues.apache.org/jira/browse/CARBONDATA-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chenerlu resolved CARBONDATA-954. - Resolution: Duplicate
> The driverExecutorCacheConfTest failed because of interaction between testcases in CacheProviderTest
>
> Key: CARBONDATA-954
> URL: https://issues.apache.org/jira/browse/CARBONDATA-954
> Project: CarbonData
> Issue Type: Bug
> Reporter: chenerlu
> Assignee: chenerlu
> Priority: Minor