[GitHub] incubator-carbondata issue #525: [CARBONDATA-628] Fixed measure selection wi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/525 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/563/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (CARBONDATA-617) Insert query not working with UNION
[ https://issues.apache.org/jira/browse/CARBONDATA-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15820052#comment-15820052 ] QiangCai commented on CARBONDATA-617: I am working on this issue.

> Insert query not working with UNION
>
> Key: CARBONDATA-617
> URL: https://issues.apache.org/jira/browse/CARBONDATA-617
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6, Hadoop 2.6
> Reporter: Deepti Bhardwaj
> Assignee: QiangCai
> Priority: Minor
> Attachments: 2000_UniqData.csv, thrift-error-log-during-insert-with-union
>
> I created 3 tables, all with the same schema. Create table commands:
>
> CREATE TABLE uniqdata (CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint, DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10), Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
> CREATE TABLE student (CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint, DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10), Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
> CREATE TABLE department (CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint, DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10), Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format';
>
> I loaded the uniqdata and department tables with the attached CSV (2000_UniqData.csv). The insert query used to load data into the student table was:
>
> insert into student select CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1 from uniqdata UNION select CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1 from department;
>
> When I try to insert data into student with a UNION, it gives java.lang.Exception: DataLoad failure (log attached). The UNION query works well when used alone, but when insert is used with UNION it fails. Also, if I use hive tables instead of carbon tables, insert does not work.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CARBONDATA-617) Insert query not working with UNION
[ https://issues.apache.org/jira/browse/CARBONDATA-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] QiangCai reassigned CARBONDATA-617: Assignee: QiangCai

> Insert query not working with UNION
>
> Key: CARBONDATA-617
> URL: https://issues.apache.org/jira/browse/CARBONDATA-617
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6, Hadoop 2.6
> Reporter: Deepti Bhardwaj
> Assignee: QiangCai
> Priority: Minor
> Attachments: 2000_UniqData.csv, thrift-error-log-during-insert-with-union
[jira] [Assigned] (CARBONDATA-626) [Dataload] Dataloading is not working with delimiter set as "|"
[ https://issues.apache.org/jira/browse/CARBONDATA-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] QiangCai reassigned CARBONDATA-626: Assignee: QiangCai

> [Dataload] Dataloading is not working with delimiter set as "|"
>
> Key: CARBONDATA-626
> URL: https://issues.apache.org/jira/browse/CARBONDATA-626
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 1.0.0-incubating
> Environment: 3 node cluster
> Reporter: SOURYAKANTA DWIVEDY
> Assignee: QiangCai
>
> Description: Data loading fails with delimiter "|".
> Steps:
> 1. Create table
> 2. Load data into table
> Log:
> create table DIM_TERMINAL (ID int, TAC String, TER_BRAND_NAME String, TER_MODEL_NAME String, TER_MODENAME String, TER_TYPE_ID String, TER_TYPE_NAME_EN String, TER_TYPE_NAME_CHN String, TER_OSTYPE String, TER_OS_TYPE_NAME String, HSPASPEED String, LTESPEED String, VOLTE_FLAG String, flag String) stored by 'org.apache.carbondata.format' TBLPROPERTIES ('DICTIONARY_INCLUDE'='TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
>
> jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL.csv' INTO table DIM_TERMINAL1 OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'='ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
> Error: java.lang.RuntimeException: Data loading failed. table not found: default.dim_terminal1 (state=,code=0)
>
> 0: jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL1.csv' INTO table DIM_TERMINAL OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'='ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
> Error: org.apache.spark.sql.AnalysisException: Reference 'D' is ambiguous, could be: D#4893, D#4907, D#4920, D#4935, D#4952, D#5025, D#5034.; (state=,code=0)
>
> csv raw row: 103880|99000537|MI|2S H1SC 3C|2G/3G|0|SmartPhone|SmartPhone|4|Android|||1|
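A plausible root cause for the "Reference 'D' is ambiguous" failure above (an assumption on my part, not confirmed anywhere in this thread) is the classic Java pitfall of passing the "|" delimiter to a regex-based split without escaping it: as a regular expression, "|" is an alternation of two empty strings and matches between every character, so the row explodes into single-character fields. A minimal sketch, not CarbonData's actual parsing code:

```java
public class DelimiterDemo {
    public static void main(String[] args) {
        String row = "103880|99000537|MI";

        // Unescaped: "|" is regex alternation of two empty patterns,
        // so split() matches the empty string between every character
        // and returns one element per character (18 here, not 3).
        String[] wrong = row.split("|");

        // Escaped: "\\|" matches the literal pipe character.
        String[] right = row.split("\\|");

        System.out.println(wrong.length); // many single-character fields
        System.out.println(right.length); // 3, as intended
    }
}
```

If the loader splits headers or rows this way, escaping the delimiter (or using `Pattern.quote`) restores the expected field count.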
[jira] [Commented] (CARBONDATA-626) [Dataload] Dataloading is not working with delimiter set as "|"
[ https://issues.apache.org/jira/browse/CARBONDATA-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819937#comment-15819937 ] QiangCai commented on CARBONDATA-626: PR 518 has fixed this issue. https://github.com/apache/incubator-carbondata/pull/518

> [Dataload] Dataloading is not working with delimiter set as "|"
>
> Key: CARBONDATA-626
> URL: https://issues.apache.org/jira/browse/CARBONDATA-626
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 1.0.0-incubating
> Environment: 3 node cluster
> Reporter: SOURYAKANTA DWIVEDY
[GitHub] incubator-carbondata pull request #524: [CARBONDATA-627]fix union test case ...
Github user QiangCai commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/524#discussion_r95714873 --- Diff: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/AllDataTypesTestCaseAggregate.scala --- @@ -59,21 +59,4 @@ class AllDataTypesTestCaseAggregate extends QueryTest with BeforeAndAfterAll { Seq(Row(15.8))) }) - test("CARBONDATA-60-union-defect")({ --- End diff -- Because the previous build (559) added one test case, build 560 reports two deleted test cases.
[GitHub] incubator-carbondata issue #522: Update carbondata description and clean .pd...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/522 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/562/
[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...
Github user jackylk commented on the issue: https://github.com/apache/incubator-carbondata/pull/523 I verified with `mvn clean verify -Pno-kettle -Pspark-1.6` but it failed in test case `insert from hive-sum expression`
[GitHub] incubator-carbondata pull request #522: Update carbondata description and cl...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/522#discussion_r95709902 --- Diff: README.md --- @@ -19,10 +19,7 @@ -Apache CarbonData(incubating) is a new big data file format for faster -interactive query using advanced columnar storage, index, compression -and encoding techniques to improve computing efficiency, in turn it will -help speedup queries an order of magnitude faster over PetaBytes of data. +Apache CarbonData(incubating) is an indexed columnar data format for fast analytics on big data platform, e.g.Apache Hadoop, Apache Spark etc. --- End diff -- a `,` is missing before `etc`
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/523#discussion_r95709745 --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala --- @@ -719,16 +720,51 @@ object CarbonDataRDDFactory { loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS) val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString + if (useKettle) { --- End diff -- how about in carbon-spark2 module, can you check the same in that module also?
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/523#discussion_r95704439 --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala --- @@ -719,16 +720,51 @@ object CarbonDataRDDFactory { loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS) val rddIteratorKey = CarbonCommonConstants.RDDUTIL_UPDATE_KEY + UUID.randomUUID().toString + if (useKettle) { +try { + RddInpututilsForUpdate.put(rddIteratorKey, +new RddIteratorForUpdate(iter, carbonLoadModel)) + carbonLoadModel.setRddIteratorKey(rddIteratorKey) + CarbonDataLoadForUpdate +.run(carbonLoadModel, index, storePath, kettleHomePath, + segId, loadMetadataDetails, executionErrors) +} finally { + RddInpututilsForUpdate.remove(rddIteratorKey) +} + } + else { --- End diff -- move to previous line
[GitHub] incubator-carbondata pull request #524: [CARBONDATA-627]fix union test case ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/524#discussion_r95704043 --- Diff: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/AllDataTypesTestCaseAggregate.scala --- @@ -59,21 +59,4 @@ class AllDataTypesTestCaseAggregate extends QueryTest with BeforeAndAfterAll { Seq(Row(15.8))) }) - test("CARBONDATA-60-union-defect")({ --- End diff -- here, only one test case is removed from carbon-spark module, but in test report, it says two are deleted, can you check why? http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/560/testReport/
[GitHub] incubator-carbondata pull request #525: [CARBONDATA-628] Fixed measure selec...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/525#discussion_r95702463 --- Diff: core/src/main/java/org/apache/carbondata/scan/processor/AbstractDataBlockIterator.java --- @@ -85,12 +90,15 @@ public AbstractDataBlockIterator(BlockExecutionInfo blockExecutionInfo, blockletScanner = new NonFilterScanner(blockExecutionInfo, queryStatisticsModel); } if (blockExecutionInfo.isRawRecordDetailQuery()) { + LOGGER.audit("Row based raw collector is used to scan and collect the data"); --- End diff -- Should it be audit or info? Audit is used for keeping operational log, right?
[GitHub] incubator-carbondata pull request #511: [CARBONDATA-584]added validation for...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/511#discussion_r95702215 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSource.scala --- @@ -108,6 +111,9 @@ class CarbonSource extends CreatableRelationProvider val dbName: String = parameters.getOrElse("dbName", CarbonCommonConstants.DATABASE_DEFAULT_NAME) val tableName: String = parameters.getOrElse("tableName", "default_table") +if(tableName.isEmpty || tableName.contains("")) { --- End diff -- I think you can use `StringUtils.isBlank` utility function
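A note on why the reviewer's suggestion matters: the quoted condition `tableName.contains("")` is always true, because every string contains the empty string, so the validation as written would reject all table names. The sketch below demonstrates the pitfall and a blank check in the spirit of commons-lang's `StringUtils.isBlank`; the local `isBlank` helper is a dependency-free stand-in, not the actual commons-lang implementation:

```java
public class TableNameCheck {
    // Stand-in for org.apache.commons.lang.StringUtils.isBlank:
    // true for null, empty, or whitespace-only strings.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Every string contains the empty string, so this check always fires:
        System.out.println("my_table".contains("")); // true

        System.out.println(isBlank("my_table")); // false
        System.out.println(isBlank("   "));      // true
        System.out.println(isBlank(null));       // true
    }
}
```

With `StringUtils.isBlank(tableName)` the guard covers null, empty, and whitespace-only names in one call.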
[GitHub] incubator-carbondata issue #525: [CARBONDATA-628] Fixed measure selection wi...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/525 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/561/
[jira] [Updated] (CARBONDATA-628) Issue when measure selection without table order gives wrong result with vectorized reader enabled
[ https://issues.apache.org/jira/browse/CARBONDATA-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala updated CARBONDATA-628: Affects Version/s: 1.0.0-incubating

> Issue when measure selection without table order gives wrong result with vectorized reader enabled
>
> Key: CARBONDATA-628
> URL: https://issues.apache.org/jira/browse/CARBONDATA-628
> Project: CarbonData
> Issue Type: Bug
> Affects Versions: 1.0.0-incubating
> Reporter: Ravindra Pesala
> Assignee: Ravindra Pesala
> Priority: Minor
>
> If the table is created with measure order m1, m2 and the user selects the measures as m2, m1, it returns a wrong result with the vectorized reader enabled.
[jira] [Created] (CARBONDATA-628) Issue when measure selection without table order gives wrong result with vectorized reader enabled
Ravindra Pesala created CARBONDATA-628: Summary: Issue when measure selection without table order gives wrong result with vectorized reader enabled Key: CARBONDATA-628 URL: https://issues.apache.org/jira/browse/CARBONDATA-628 Project: CarbonData Issue Type: Bug Reporter: Ravindra Pesala Assignee: Ravindra Pesala Priority: Minor

If the table is created with measure order m1, m2 and the user selects the measures as m2, m1, it returns a wrong result with the vectorized reader enabled.
[GitHub] incubator-carbondata pull request #525: Fixed measure selection without tab...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/525 Fixed measure selection without table order gives wrong result with vectorized reader enabled. If the table is created with measure order m1, m2 and the user selects the measures as m2, m1, it returns a wrong result with the vectorized reader enabled. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata spark1.6-compilationissue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/525.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #525 commit 50ee3ecf1f2a6e496d052ee95b9334522574a824 Author: ravipesala Date: 2017-01-11T17:17:42Z Fixed measure selection without table order gives wrong result
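The class of bug described in this PR can be illustrated with a toy sketch. This is not CarbonData's actual vector-fill code; the row layout and the `projectionToStorage` mapping are invented for illustration. If output columns are filled in storage order instead of being routed through the projection mapping, the column the user asked for as m2 silently receives m1's values:

```java
import java.util.Arrays;

public class MeasureOrderDemo {
    public static void main(String[] args) {
        // Storage order of measures: m1, m2. One stored row: m1=10, m2=20.
        double[] storedRow = {10.0, 20.0};

        // Query projection order: m2, m1.
        // Output column 0 should come from stored index 1 (m2), column 1 from index 0 (m1).
        int[] projectionToStorage = {1, 0};

        // Buggy fill: copies values in storage order, ignoring the projection mapping.
        double[] buggy = Arrays.copyOf(storedRow, 2);

        // Correct fill: route each output column through the projection mapping.
        double[] correct = new double[2];
        for (int out = 0; out < correct.length; out++) {
            correct[out] = storedRow[projectionToStorage[out]];
        }

        System.out.println(Arrays.toString(buggy));   // [10.0, 20.0] -- the "m2" column holds m1's value
        System.out.println(Arrays.toString(correct)); // [20.0, 10.0]
    }
}
```

The same query without the vectorized reader would take the row-by-row path, which is presumably why the wrong result appears only with the vectorized reader enabled.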
[GitHub] incubator-carbondata issue #524: [CARBONDATA-627]fix union test case for spa...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/524 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/560/
[GitHub] incubator-carbondata pull request #524: [CARBONDATA-627]fix union test case ...
GitHub user QiangCai opened a pull request: https://github.com/apache/incubator-carbondata/pull/524 [CARBONDATA-627] fix union test case for spark2. Analysis: the union test case fails in spark2; the result of the union query is twice the result of the left query. Root cause: CarbonLateDecodeRule uses only the union.children.head plan to build all CarbonDictionaryTempDecoders. Change: use each child plan to build its own CarbonDictionaryTempDecoder. You can merge this pull request into a Git repository by running: $ git pull https://github.com/QiangCai/incubator-carbondata fixUnionTestCase Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/524.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #524 commit 0abc4f8f1fe6cfe0e8fe8842f7b7ba40f1e191a7 Author: QiangCai Date: 2017-01-11T15:47:25Z fixUnionTestCase
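The root cause described in this PR can be mocked in a few lines. This is a hypothetical simplification: `buildDecoders`, the plan strings, and the `Decoder(...)` labels are invented for illustration and are not CarbonData APIs. Deriving every decoder from the first child duplicates the left child's plan, which is consistent with the reported "result is twice the left query" symptom:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionDecoderDemo {
    // Builds one decoder per union child. The buggy variant mimics always
    // using children.head; the fixed variant uses each child's own plan.
    static List<String> buildDecoders(List<String> unionChildren, boolean buggy) {
        List<String> decoders = new ArrayList<>();
        for (String child : unionChildren) {
            String source = buggy ? unionChildren.get(0) : child;
            decoders.add("Decoder(" + source + ")");
        }
        return decoders;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("leftScan", "rightScan");
        // Buggy: both decoders wrap the left child's plan.
        System.out.println(buildDecoders(children, true));  // [Decoder(leftScan), Decoder(leftScan)]
        // Fixed: each child gets its own decoder.
        System.out.println(buildDecoders(children, false)); // [Decoder(leftScan), Decoder(rightScan)]
    }
}
```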
[jira] [Created] (CARBONDATA-627) Fix Union unit test case for spark2
QiangCai created CARBONDATA-627: Summary: Fix Union unit test case for spark2 Key: CARBONDATA-627 URL: https://issues.apache.org/jira/browse/CARBONDATA-627 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 1.0.0-incubating Reporter: QiangCai Assignee: QiangCai Priority: Minor Fix For: 1.0.0-incubating

UnionTestCase fails in spark2; we should fix it.
[jira] [Commented] (CARBONDATA-623) If we drop table after this condition ---(Firstly we load data in table with single pass true and use kettle false and then in same table load data 2nd time with si
[ https://issues.apache.org/jira/browse/CARBONDATA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818650#comment-15818650 ] Babulal commented on CARBONDATA-623: Hi, can you please refer to CARBONDATA-595 ("Drop Table for carbon throws NPE")? It seems to be the same issue. Thanks, Babu

> If we drop a table after this sequence (first load data into the table with single pass true and use kettle false, then load data into the same table a second time with single pass true and use kettle false), it throws Error: java.lang.NullPointerException
>
> Key: CARBONDATA-623
> URL: https://issues.apache.org/jira/browse/CARBONDATA-623
> Project: CarbonData
> Issue Type: Bug
> Components: data-load
> Affects Versions: 1.0.0-incubating
> Reporter: Payal
> Priority: Minor
> Attachments: 7000_UniqData.csv
>
> 1. First we load data into the table with single pass true and use kettle false; the load succeeds and the result set is correct.
> 2. Then we load data into the same table with single pass true and use kettle false; the load succeeds and the result set is correct.
> 3. But after that, if we drop the table, it throws a NullPointerException.
>
> Queries:
> 0: jdbc:hive2://hadoop-master:1> CREATE TABLE uniqdata_INCLUDEDICTIONARY (CUST_ID int, CUST_NAME String, ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint, BIGINT_COLUMN2 bigint, DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10), Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> No rows selected (1.13 seconds)
> 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'BAD_RECORDS_LOGGER_ENABLE'='TRUE', 'BAD_RECORDS_ACTION'='FORCE', 'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1', 'SINGLE_PASS'='false', 'USE_KETTLE'='false');
> No rows selected (22.814 seconds)
> 0: jdbc:hive2://hadoop-master:1> select count(distinct CUST_NAME) from uniqdata_INCLUDEDICTIONARY;
> 7002 (1 row selected, 3.055 seconds)
> 0: jdbc:hive2://hadoop-master:1> select count(CUST_NAME) from uniqdata_INCLUDEDICTIONARY;
> 7013 (1 row selected, 0.366 seconds)
> 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'BAD_RECORDS_LOGGER_ENABLE'='TRUE', 'BAD_RECORDS_ACTION'='FORCE', 'FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1', 'SINGLE_PASS'='true', 'USE_KETTLE'='false');
> No rows selected (4.837 seconds)
> 0: jdbc:hive2://hadoop-master:1> select count(CUST_NAME) from uniqdata_INCLUDEDICTIONARY;
> 14026 (1 row selected, 0.458 seconds)
> 0: jdbc:hive2://hadoop-master:1> select count(distinct CUST_NAME) from uniqdata_INCLUDEDICTIONARY;
> 7002 (1 row selected, 3.173 seconds)
> 0: jdbc:hive2://hadoop-master:1> drop table uniqdata_includedictionary;
> Error: java.lang.NullPointerException (state=,code=0)
>
> Logs:
> WARN 11-01 12:56:52,722 - Lost task 0.0 in stage 61.0 (TID 1740, hadoop-slave-2): FetchFailed(BlockManagerId(0, hadoop-slave-3, 45331), shuffleId=22, mapId=0, reduceId=0, message= org.apache.spark.shuffle.FetchFailedException: Failed to connect to hadoop-slave-3:45331 at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:323) at org.apache.spark.storage.ShuffleBlockFetche
[GitHub] incubator-carbondata pull request #520: fix dependency issue for IntelliJ ID...
Github user QiangCai closed the pull request at: https://github.com/apache/incubator-carbondata/pull/520
[GitHub] incubator-carbondata issue #520: fix dependency issue for IntelliJ IDEA
Github user QiangCai commented on the issue: https://github.com/apache/incubator-carbondata/pull/520 Closing this PR; I could not reproduce the issue.
[jira] [Created] (CARBONDATA-626) [Dataload] Dataloading is not working with delimiter set as "|"
SOURYAKANTA DWIVEDY created CARBONDATA-626: -- Summary: [Dataload] Dataloading is not working with delimiter set as "|" Key: CARBONDATA-626 URL: https://issues.apache.org/jira/browse/CARBONDATA-626 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 1.0.0-incubating Environment: 3 node cluster Reporter: SOURYAKANTA DWIVEDY Description : Data loading fail with delimiter as "|" . Steps: > 1. Create table > 2. Load data into table Log :- - - create table DIM_TERMINAL ( ID int, TAC String, TER_BRAND_NAME String, TER_MODEL_NAME String, TER_MODENAME String, TER_TYPE_ID String, TER_TYPE_NAME_EN String, TER_TYPE_NAME_CHN String, TER_OSTYPE String, TER_OS_TYPE_NAME String, HSPASPEED String, LTESPEED String, VOLTE_FLAG String, flag String ) stored by 'org.apache.carbondata.format' TBLPROPERTIES ('DICTIONARY_INCLUDE'='TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag'); - jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL.csv' INTO table DIM_TERMINAL1 OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'= 'ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag'); Error: java.lang.RuntimeException: Data loading failed. 
table not found: default.dim_terminal1 (state=,code=0) 0: jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL1.csv' INTO table DIM_TERMINAL OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'= 'ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag'); Error: org.apache.spark.sql.AnalysisException: Reference 'D' is ambiguous, could be: D#4893, D#4907, D#4920, D#4935, D#4952, D#5025, D#5034.; (state=,code=0) - csv raw details : 103880|99000537|MI|2S H1SC 3C|2G/3G|0|SmartPhone|SmartPhone|4|Android|||1| -- This message was sent by Atlassian JIRA (v6.3.4#6332)
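One plausible reading of the "Reference 'D' is ambiguous" error above is that the "|" delimiter reached a regex-based split unescaped: in a regular expression, "|" is the alternation operator, so splitting on it breaks the line between every pair of characters, and each character of the header becomes its own column (hence several columns named 'D'). This is only a hypothesis about the root cause, not confirmed by the ticket; a minimal Python sketch of the pitfall:

```python
import re

row = "103880|99000537|MI|2S H1SC 3C"

# Correct: treat "|" as a literal delimiter by escaping it in the regex.
fields = re.split(r"\|", row)
print(fields)  # ['103880', '99000537', 'MI', '2S H1SC 3C']

# Pitfall: an unescaped "|" is the alternation of two empty patterns,
# so it matches between every pair of characters and the row explodes
# into single-character "columns".
broken = re.split("|", row)  # Python 3.7+ allows empty-match splits
print(len(broken) > len(fields))  # True
```

The same hazard exists in Java/Scala, where `String.split` also takes a regex, which is why delimiters taken from user options generally need quoting before being handed to a regex engine.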
[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/523 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/559/
[GitHub] incubator-carbondata issue #522: Update carbondata description and clean .pd...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/522 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/558/
[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...
GitHub user ravikiran23 opened a pull request: https://github.com/apache/incubator-carbondata/pull/523 [CARBONDATA-440] fixing no kettle issue for IUD. For IUD, the data load flow will be used, so in the NO-KETTLE case the data load needs to be handled. The load count / segment count should be a string, because in the compaction case it will be 2.1. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravikiran23/incubator-carbondata IUD-NO-KETTLE Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/523.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #523 commit 5dd98b38e332b08f11daeaa683950b90172e02a9 Author: ravikiran Date: 2017-01-09T13:28:13Z fixing no kettle issue for IUD. load count/ segment count should be string because in compaction case it will be 2.1
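The PR's note that the load count / segment count must be a string follows from compaction: a compacted segment gets a fractional id such as "2.1", which no integer type can represent. A small Python sketch of that reasoning (the helper name is hypothetical, not CarbonData code):

```python
segment_ids = ["0", "1", "2.1", "3"]  # "2.1": a segment produced by compaction

def is_compacted(segment_id: str) -> bool:
    # Hypothetical helper: compacted segment ids carry a "." part, e.g. "2.1".
    return "." in segment_id

print([s for s in segment_ids if is_compacted(s)])  # ['2.1']

# Modelling segment ids as integers breaks as soon as compaction runs:
try:
    int("2.1")
except ValueError:
    print("'2.1' is not a valid int; segment ids must stay strings")
```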
[jira] [Commented] (CARBONDATA-624) Complete CarbonData document to be present in git and the same needs to sync with the carbondata.apace.org and for further updates.
[ https://issues.apache.org/jira/browse/CARBONDATA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818459#comment-15818459 ] Liang Chen commented on CARBONDATA-624: --- OK, thank you for starting this work. One thing to note: please only put .md files in GitHub; adding other kinds of files, like pdf or text, is not recommended. > Complete CarbonData document to be present in git and the same needs to sync > with the carbondata.apace.org and for further updates. > --- > > Key: CARBONDATA-624 > URL: https://issues.apache.org/jira/browse/CARBONDATA-624 > Project: CarbonData > Issue Type: Improvement >Reporter: Gururaj Shetty >Assignee: Gururaj Shetty > > The information about CarbonData is in git and cwiki, so we have to merge all the information and create markdown files for each topic about CarbonData. > These markdown files will have the complete information about CarbonData, like Overview, Installation, Configuration, DDL, DML, Use case and so on. > This markdown information will also be synced to the website documentation - carbondata.apace.org
[GitHub] incubator-carbondata pull request #522: Update carbondata description and cl...
GitHub user chenliang613 opened a pull request: https://github.com/apache/incubator-carbondata/pull/522 Update carbondata description and clean .pdf files 1. Update the CarbonData description to keep it consistent with apache.org. 2. Clean up the .pdf files in GitHub; the meetup material will be put in cwiki. You can merge this pull request into a Git repository by running: $ git pull https://github.com/chenliang613/incubator-carbondata carbon_desc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/522.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #522 commit 363000abe1b8383586e0f2bf02f6df0f6c8bbb51 Author: chenliang613 Date: 2017-01-11T14:12:13Z update carbon description and clean .pdf files
[GitHub] incubator-carbondata issue #519: Update description,keep consistent with apa...
Github user chenliang613 commented on the issue: https://github.com/apache/incubator-carbondata/pull/519 Close this PR, create a new PR.
[GitHub] incubator-carbondata pull request #519: Update description,keep consistent w...
Github user chenliang613 closed the pull request at: https://github.com/apache/incubator-carbondata/pull/519
[jira] [Created] (CARBONDATA-625) Abnormal behaviour of Int datatype
Geetika Gupta created CARBONDATA-625: Summary: Abnormal behaviour of Int datatype Key: CARBONDATA-625 URL: https://issues.apache.org/jira/browse/CARBONDATA-625 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 1.0.0-incubating Environment: Spark: 1.6 and hadoop: 2.6.5 Reporter: Geetika Gupta Priority: Minor Attachments: Screenshot from 2017-01-11 18-36-24.png, testMaxValueForBigInt.csv I created a table having an int column and loaded data into it. Data loading completed successfully, but when I viewed the table's data, some of it was wrong. I was loading BigInt data into the int column, and the whole int column was loaded with the first value of the CSV. Below are the details of the queries: create table xyz(a int, b string) stored by 'carbondata'; Data load query: LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/testMaxValueForBigInt.csv' into table xyz OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','FILEHEADER'='a,b'); select query: select * from xyz; PFA the screenshot of the output and the csv file.
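The report above concerns 64-bit (bigint) values loaded into a 32-bit int column. Whatever the root cause of the repeated-first-value symptom, a loader can at least guard against silent corruption by range-checking incoming values against the int32 bounds and routing out-of-range rows to bad-record handling. A hedged Python sketch of such a guard (the function name is hypothetical, not CarbonData's API):

```python
# int32 bounds: values outside this range cannot be stored in an int column.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def fits_int32(value: int) -> bool:
    # Hypothetical guard: True if the value is storable losslessly in int32.
    return INT32_MIN <= value <= INT32_MAX

print(fits_int32(2147483647))   # True: the largest int32
print(fits_int32(9999999999))   # False: bigint-sized, should become a bad record
```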
[GitHub] incubator-carbondata issue #521: [CARBONDATA-390] Support for float datatype
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/521 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/557/
[GitHub] incubator-carbondata issue #521: [CARBONDATA-390] Support for float datatype
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/521 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/556/
[GitHub] incubator-carbondata issue #521: [CARBONDATA-390] Support for float datatype
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/521 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/555/
[GitHub] incubator-carbondata pull request #521: [CARBONDATA-390] Support for float d...
GitHub user phalodi opened a pull request: https://github.com/apache/incubator-carbondata/pull/521 [CARBONDATA-390] Support for float datatype - Support the float datatype in the carbon file format - Ran all unit test cases with a successful build on 1.6 and 2.1 - Ran style checks to remove check errors - Updated the examples for the float datatype in Spark 1.6 and 2.1 You can merge this pull request into a Git repository by running: $ git pull https://github.com/phalodi/incubator-carbondata CARBONDATA-390 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/521.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #521 commit c44399b390b6bb79e7933319101e51812a4b7817 Author: sandy Date: 2017-01-11T09:37:38Z support for float datatype
[jira] [Assigned] (CARBONDATA-624) Complete CarbonData document to be present in git and the same needs to sync with the carbondata.apace.org and for further updates.
[ https://issues.apache.org/jira/browse/CARBONDATA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gururaj Shetty reassigned CARBONDATA-624: - Assignee: Gururaj Shetty > Complete CarbonData document to be present in git and the same needs to sync > with the carbondata.apace.org and for further updates. > --- > > Key: CARBONDATA-624 > URL: https://issues.apache.org/jira/browse/CARBONDATA-624 > Project: CarbonData > Issue Type: Improvement >Reporter: Gururaj Shetty >Assignee: Gururaj Shetty > > The information about CarbonData is in git and cwiki, so we have to merge all the information and create markdown files for each topic about CarbonData. > These markdown files will have the complete information about CarbonData, like Overview, Installation, Configuration, DDL, DML, Use case and so on. > This markdown information will also be synced to the website documentation - carbondata.apace.org
[jira] [Created] (CARBONDATA-624) Complete CarbonData document to be present in git and the same needs to sync with the carbondata.apace.org and for further updates.
Gururaj Shetty created CARBONDATA-624: - Summary: Complete CarbonData document to be present in git and the same needs to sync with the carbondata.apace.org and for further updates. Key: CARBONDATA-624 URL: https://issues.apache.org/jira/browse/CARBONDATA-624 Project: CarbonData Issue Type: Improvement Reporter: Gururaj Shetty The information about CarbonData is in git and cwiki, so we have to merge all the information and create markdown files for each topic about CarbonData. These markdown files will have the complete information about CarbonData, like Overview, Installation, Configuration, DDL, DML, Use case and so on. This markdown information will also be synced to the website documentation - carbondata.apace.org
[GitHub] incubator-carbondata pull request #519: Update description,keep consistent w...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/519#discussion_r95543961 --- Diff: README.md --- @@ -19,10 +19,7 @@ -Apache CarbonData(incubating) is a new big data file format for faster -interactive query using advanced columnar storage, index, compression -and encoding techniques to improve computing efficiency, in turn it will -help speedup queries an order of magnitude faster over PetaBytes of data. +Apache CarbonData(incubating) is an indexed columnar data format for faster analytics on big data platform like Apache Hadoop, Apache Spark and so on. --- End diff -- please modify the description in pom.xml also
[GitHub] incubator-carbondata pull request #518: [CARBONDATA-622]unify file header re...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-carbondata/pull/518
[jira] [Resolved] (CARBONDATA-622) Should use the same fileheader reader for dict generation and data loading
[ https://issues.apache.org/jira/browse/CARBONDATA-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-622. - Resolution: Fixed > Should use the same fileheader reader for dict generation and data loading > -- > > Key: CARBONDATA-622 > URL: https://issues.apache.org/jira/browse/CARBONDATA-622 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.0.0-incubating >Reporter: QiangCai >Assignee: QiangCai >Priority: Minor > Fix For: 1.0.0-incubating > > Time Spent: 3h > Remaining Estimate: 0h > > We can get the file header from the DDL command or from the CSV file. > 1. If the file header comes from the DDL command, separate it by comma ",". > 2. If the file header comes from the CSV file, separate it by the delimiter specified in the DDL command.
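The rule described in CARBONDATA-622 can be sketched as a single header reader with two separator choices: a DDL-supplied FILEHEADER is always comma-separated, while a header taken from the CSV file's first line uses the load's configured delimiter. A hedged Python illustration (the function name is hypothetical, not CarbonData's actual API):

```python
def read_file_header(header_line: str, from_ddl: bool, delimiter: str) -> list[str]:
    # One reader for both sources: DDL headers split on ",",
    # CSV first-line headers split on the load's delimiter.
    sep = "," if from_ddl else delimiter
    return [col.strip() for col in header_line.split(sep)]

# FILEHEADER option in the LOAD DATA command is comma-separated:
print(read_file_header("ID,TAC,TER_BRAND_NAME", from_ddl=True, delimiter="|"))
# Header taken from the CSV file itself uses the load's delimiter:
print(read_file_header("ID|TAC|TER_BRAND_NAME", from_ddl=False, delimiter="|"))
# Both print: ['ID', 'TAC', 'TER_BRAND_NAME']
```

Unifying the two paths means dictionary generation and data loading always agree on the column list, which is the point of the fix.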
[GitHub] incubator-carbondata issue #518: [CARBONDATA-622]unify file header reader
Github user jackylk commented on the issue: https://github.com/apache/incubator-carbondata/pull/518 LGTM
[GitHub] incubator-carbondata issue #519: Update description,keep consistent with apa...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/519 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/554/
[GitHub] incubator-carbondata issue #519: Update description,keep consistent with apa...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/519 Build Success with Spark 1.5.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/553/
[GitHub] incubator-carbondata pull request #520: fix dependency issue for IntelliJ ID...
GitHub user QiangCai opened a pull request: https://github.com/apache/incubator-carbondata/pull/520 fix dependency issue for IntelliJ IDEA When using profile spark-2.1, the test cases of spark-common-test cannot be run in IntelliJ IDEA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/QiangCai/incubator-carbondata fixIdeaMavenIssue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/520.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #520 commit 74d4bf8933540348525a16fdba361e780fe0f494 Author: QiangCai Date: 2017-01-11T08:37:24Z fix dependency issue for IntelliJ IDEA
[jira] [Updated] (CARBONDATA-623) If we drop table after this condition ---(Firstly we load data in table with single pass true and use kettle false and then in same table load data 2nd time with sing
[ https://issues.apache.org/jira/browse/CARBONDATA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Payal updated CARBONDATA-623: - Priority: Minor (was: Major) Affects Version/s: 1.0.0-incubating Attachment: 7000_UniqData.csv > If we drop table after this condition ---(Firstly we load data in table with > single pass true and use kettle false and then in same table load data 2nd > time with single pass true and use kettle false ), it is throwing Error: > java.lang.NullPointerException > --- > > Key: CARBONDATA-623 > URL: https://issues.apache.org/jira/browse/CARBONDATA-623 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.0.0-incubating >Reporter: Payal >Priority: Minor > Attachments: 7000_UniqData.csv > > > 1. First we load data into the table with single pass true and use kettle false; the load succeeds and the result set is correct. > 2. Then we load data into the same table a second time with single pass true and use kettle false; the load succeeds and the result set is correct. > 3. But after that, if we drop the table, it throws a null pointer exception. 
> Queries > 0: jdbc:hive2://hadoop-master:1> CREATE TABLE uniqdata_INCLUDEDICTIONARY > (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ > timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 > decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, > Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY > 'org.apache.carbondata.format' > TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (1.13 seconds) > 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH > 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table > uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , > 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', > 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='false','USE_KETTLE' > ='false'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (22.814 seconds) > 0: jdbc:hive2://hadoop-master:1> > 0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from > uniqdata_INCLUDEDICTIONARY ; > +---+--+ > | _c0 | > +---+--+ > | 7002 | > +---+--+ > 1 row selected (3.055 seconds) > 0: jdbc:hive2://hadoop-master:1> select count(CUST_NAME) from > uniqdata_INCLUDEDICTIONARY ; > +---+--+ > | _c0 | > +---+--+ > | 7013 | > +---+--+ > 1 row selected (0.366 seconds) > 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH > 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table > uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , > 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', > 
'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true','USE_KETTLE' > ='false'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (4.837 seconds) > 0: jdbc:hive2://hadoop-master:1> select count(CUST_NAME) from > uniqdata_INCLUDEDICTIONARY ; > ++--+ > | _c0 | > ++--+ > | 14026 | > ++--+ > 1 row selected (0.458 seconds) > 0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from > uniqdata_INCLUDEDICTIONARY ; > +---+--+ > | _c0 | > +---+--+ > | 7002 | > +---+--+ > 1 row selected (3.173 seconds) > 0: jdbc:hive2://hadoop-master:1> drop table uniqdata_includedictionary; > Error: java.lang.NullPointerException (state=,code=0) > Logs > WARN 11-01 12:56:52,722 - Lost task 0.0 in stage 61.0 (TID 1740, > hadoop-slave-2): FetchFailed(BlockManagerId(0, hadoop-slave-3, 45331), > shuffleId=22, mapId=0, reduceId=0, message= > org.apache.spark.shuffle.FetchFailedException: Failed to connect to > hadoop-slave-3:45331 > at > org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:323) > at > org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:300)
[jira] [Created] (CARBONDATA-623) If we drop table after this condition ---(Firstly we load data in table with single pass true and use kettle false and then in same table load data 2nd time with sing
Payal created CARBONDATA-623: Summary: If we drop table after this condition ---(Firstly we load data in table with single pass true and use kettle false and then in same table load data 2nd time with single pass true and use kettle false ), it is throwing Error: java.lang.NullPointerException Key: CARBONDATA-623 URL: https://issues.apache.org/jira/browse/CARBONDATA-623 Project: CarbonData Issue Type: Bug Components: data-load Reporter: Payal 1. First we load data into the table with single pass true and use kettle false; the load succeeds and the result set is correct. 2. Then we load data into the same table a second time with single pass true and use kettle false; the load succeeds and the result set is correct. 3. But after that, if we drop the table, it throws a null pointer exception. Queries 0: jdbc:hive2://hadoop-master:1> CREATE TABLE uniqdata_INCLUDEDICTIONARY (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); +-+--+ | Result | +-+--+ +-+--+ No rows selected (1.13 seconds) 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='false','USE_KETTLE' ='false'); +-+--+ | Result | +-+--+ +-+--+ No rows selected (22.814 seconds) 0: 
jdbc:hive2://hadoop-master:1> 0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from uniqdata_INCLUDEDICTIONARY ; +---+--+ | _c0 | +---+--+ | 7002 | +---+--+ 1 row selected (3.055 seconds) 0: jdbc:hive2://hadoop-master:1> select count(CUST_NAME) from uniqdata_INCLUDEDICTIONARY ; +---+--+ | _c0 | +---+--+ | 7013 | +---+--+ 1 row selected (0.366 seconds) 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true','USE_KETTLE' ='false'); +-+--+ | Result | +-+--+ +-+--+ No rows selected (4.837 seconds) 0: jdbc:hive2://hadoop-master:1> select count(CUST_NAME) from uniqdata_INCLUDEDICTIONARY ; ++--+ | _c0 | ++--+ | 14026 | ++--+ 1 row selected (0.458 seconds) 0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from uniqdata_INCLUDEDICTIONARY ; +---+--+ | _c0 | +---+--+ | 7002 | +---+--+ 1 row selected (3.173 seconds) 0: jdbc:hive2://hadoop-master:1> drop table uniqdata_includedictionary; Error: java.lang.NullPointerException (state=,code=0) Logs WARN 11-01 12:56:52,722 - Lost task 0.0 in stage 61.0 (TID 1740, hadoop-slave-2): FetchFailed(BlockManagerId(0, hadoop-slave-3, 45331), shuffleId=22, mapId=0, reduceId=0, message= org.apache.spark.shuffle.FetchFailedException: Failed to connect to hadoop-slave-3:45331 at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:323) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:300) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:51) at 
scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32) at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:504) at org.apache.spark.sql.execution.aggregate.Tungs