[GitHub] incubator-carbondata issue #525: [CARBONDATA-628] Fixed measure selection wi...

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/525
  
Build Success with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/563/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (CARBONDATA-617) Insert query not working with UNION

2017-01-11 Thread QiangCai (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15820052#comment-15820052
 ] 

QiangCai commented on CARBONDATA-617:
-

I am working on this issue.

> Insert query not working with UNION
> ---
>
> Key: CARBONDATA-617
> URL: https://issues.apache.org/jira/browse/CARBONDATA-617
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
> Hadoop 2.6
>Reporter: Deepti Bhardwaj
>Assignee: QiangCai
>Priority: Minor
> Attachments: 2000_UniqData.csv, 
> thrift-error-log-during-insert-with-union
>
>
> I created 3 tables, all having the same schema.
> Create table commands:
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> CREATE TABLE student (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> CREATE TABLE department (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> and I loaded the uniqdata and department table with the attached 
> CSV(2000_UniqData.csv)
> and the insert query used to load data in student table was:
> insert into student select 
> CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1
>  from uniqdata UNION select 
> CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1
>  from department;
> When I try to insert data into student with the union operation, it gives 
> java.lang.Exception: DataLoad failure (attached below).
> The UNION query works well when used alone, but when insert is used with UNION 
> it fails.
> Also, if I use hive tables instead of carbon tables, the insert does not work.
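For anyone reproducing this, note that SQL UNION deduplicates the combined rows while UNION ALL keeps them all, so the row set fed into the insert differs between the two. A minimal Python sketch of the two semantics (plain tuples standing in for rows; illustrative only, not CarbonData code):

```python
def union_all(left, right):
    """UNION ALL: concatenate the rows, keeping duplicates."""
    return left + right

def union(left, right):
    """UNION: concatenate the rows, then drop duplicate rows."""
    seen, out = set(), []
    for row in left + right:
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out

uniqdata = [(1, "a"), (2, "b")]
department = [(2, "b"), (3, "c")]

assert len(union_all(uniqdata, department)) == 4   # duplicates kept
assert len(union(uniqdata, department)) == 3       # (2, "b") deduplicated
```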



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CARBONDATA-617) Insert query not working with UNION

2017-01-11 Thread QiangCai (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

QiangCai reassigned CARBONDATA-617:
---

Assignee: QiangCai






[jira] [Assigned] (CARBONDATA-626) [Dataload] Dataloading is not working with delimiter set as "|"

2017-01-11 Thread QiangCai (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

QiangCai reassigned CARBONDATA-626:
---

Assignee: QiangCai

> [Dataload] Dataloading is not working with delimiter set as "|"
> ---
>
> Key: CARBONDATA-626
> URL: https://issues.apache.org/jira/browse/CARBONDATA-626
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
> Environment: 3 node cluster
>Reporter: SOURYAKANTA DWIVEDY
>Assignee: QiangCai
>
> Description: Data loading fails with the delimiter set as "|".
> Steps:
> > 1. Create table
> > 2. Load data into table
> Log :-
> -
> - create table DIM_TERMINAL 
> (
> ID int,
> TAC String,
> TER_BRAND_NAME String,
> TER_MODEL_NAME String,
> TER_MODENAME String,
> TER_TYPE_ID String,
> TER_TYPE_NAME_EN String,
> TER_TYPE_NAME_CHN String,
> TER_OSTYPE String,
> TER_OS_TYPE_NAME String,
> HSPASPEED String,
> LTESPEED String,
> VOLTE_FLAG String,
> flag String
> ) stored by 'org.apache.carbondata.format' TBLPROPERTIES 
> ('DICTIONARY_INCLUDE'='TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
> - jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 
> 'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL.csv' INTO table DIM_TERMINAL1 
> OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'= 
> 'ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
> Error: java.lang.RuntimeException: Data loading failed. table not found: 
> default.dim_terminal1 (state=,code=0)
> 0: jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 
> 'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL1.csv' INTO table DIM_TERMINAL 
> OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'= 
> 'ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
> Error: org.apache.spark.sql.AnalysisException: Reference 'D' is ambiguous, 
> could be: D#4893, D#4907, D#4920, D#4935, D#4952, D#5025, D#5034.; 
> (state=,code=0)
> - csv raw details :  
> 103880|99000537|MI|2S H1SC 3C|2G/3G|0|SmartPhone|SmartPhone|4|Android|||1| 
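One plausible mechanism for the ambiguous single-letter reference 'D', assuming some code path compiles the delimiter string as a regular expression (as Java's `String.split` does): `|` is regex alternation, so an unescaped pipe matches the empty string and splits the line between every character, turning each header character into its own column. A Python illustration of the pitfall (illustrative only, not the actual CarbonData parsing code):

```python
import re

line = "103880|99000537|MI"

# Plain string split treats "|" literally and yields the expected fields.
assert line.split("|") == ["103880", "99000537", "MI"]

# As a regex, "|" means "empty OR empty", which matches between every
# character, so the line degenerates into single-character tokens.
tokens = [t for t in re.split("|", line) if t]
print(tokens[:4])  # ['1', '0', '3', '8']

# Escaping the delimiter restores its literal meaning.
assert re.split(re.escape("|"), line) == ["103880", "99000537", "MI"]
```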





[jira] [Commented] (CARBONDATA-626) [Dataload] Dataloading is not working with delimiter set as "|"

2017-01-11 Thread QiangCai (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819937#comment-15819937
 ] 

QiangCai commented on CARBONDATA-626:
-

PR 518 has fixed this issue:
https://github.com/apache/incubator-carbondata/pull/518






[GitHub] incubator-carbondata pull request #524: [CARBONDATA-627]fix union test case ...

2017-01-11 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/524#discussion_r95714873
  
--- Diff: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/AllDataTypesTestCaseAggregate.scala
 ---
@@ -59,21 +59,4 @@ class AllDataTypesTestCaseAggregate extends QueryTest 
with BeforeAndAfterAll {
   Seq(Row(15.8)))
   })
 
-  test("CARBONDATA-60-union-defect")({
--- End diff --

Because the previous build 559 added one test case, build 560 reports two 
deleted test cases.




[GitHub] incubator-carbondata issue #522: Update carbondata description and clean .pd...

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/522
  
Build Success with Spark 1.6.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/562/





[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

2017-01-11 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/523
  
I verified with `mvn clean verify -Pno-kettle -Pspark-1.6` but it failed in 
test case `insert from hive-sum expression`




[GitHub] incubator-carbondata pull request #522: Update carbondata description and cl...

2017-01-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/522#discussion_r95709902
  
--- Diff: README.md ---
@@ -19,10 +19,7 @@
 
 
 
-Apache CarbonData(incubating) is a new big data file format for faster
-interactive query using advanced columnar storage, index, compression
-and encoding techniques to improve computing efficiency, in turn it will 
-help speedup queries an order of magnitude faster over PetaBytes of data. 
+Apache CarbonData(incubating) is an indexed columnar data format for fast 
analytics on big data platform, e.g.Apache Hadoop, Apache Spark etc.
--- End diff --

a `,` is missing before `etc`




[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

2017-01-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/523#discussion_r95709745
  
--- Diff: 
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
 ---
@@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
   
loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
   val rddIteratorKey = 
CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
UUID.randomUUID().toString
+  if (useKettle) {
--- End diff --

How about the carbon-spark2 module? Can you check the same in that module 
also?




[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

2017-01-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/523#discussion_r95704439
  
--- Diff: 
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
 ---
@@ -719,16 +720,51 @@ object CarbonDataRDDFactory {
   
loadMetadataDetails.setLoadStatus(CarbonCommonConstants.STORE_LOADSTATUS_SUCCESS)
   val rddIteratorKey = 
CarbonCommonConstants.RDDUTIL_UPDATE_KEY +
UUID.randomUUID().toString
+  if (useKettle) {
+try {
+  RddInpututilsForUpdate.put(rddIteratorKey,
+new RddIteratorForUpdate(iter, carbonLoadModel))
+  carbonLoadModel.setRddIteratorKey(rddIteratorKey)
+  CarbonDataLoadForUpdate
+.run(carbonLoadModel, index, storePath, kettleHomePath,
+  segId, loadMetadataDetails, executionErrors)
+} finally {
+  RddInpututilsForUpdate.remove(rddIteratorKey)
+}
+  }
+  else {
--- End diff --

move to previous line




[GitHub] incubator-carbondata pull request #524: [CARBONDATA-627]fix union test case ...

2017-01-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/524#discussion_r95704043
  
--- Diff: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/AllDataTypesTestCaseAggregate.scala
 ---
@@ -59,21 +59,4 @@ class AllDataTypesTestCaseAggregate extends QueryTest 
with BeforeAndAfterAll {
   Seq(Row(15.8)))
   })
 
-  test("CARBONDATA-60-union-defect")({
--- End diff --

Here, only one test case is removed from the carbon-spark module, but the test 
report says two were deleted. Can you check why?
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/560/testReport/




[GitHub] incubator-carbondata pull request #525: [CARBONDATA-628] Fixed measure selec...

2017-01-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/525#discussion_r95702463
  
--- Diff: 
core/src/main/java/org/apache/carbondata/scan/processor/AbstractDataBlockIterator.java
 ---
@@ -85,12 +90,15 @@ public AbstractDataBlockIterator(BlockExecutionInfo 
blockExecutionInfo,
   blockletScanner = new NonFilterScanner(blockExecutionInfo, 
queryStatisticsModel);
 }
 if (blockExecutionInfo.isRawRecordDetailQuery()) {
+  LOGGER.audit("Row based raw collector is used to scan and collect 
the data");
--- End diff --

Should it be audit or info? Audit is meant for keeping the operational log, 
right?
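The distinction the question hinges on can be sketched with log levels: audit for operational, user-visible events kept for bookkeeping, info for internal diagnostic detail. A minimal Python sketch of that split (the AUDIT level and helper are hypothetical, not CarbonData's LogService API):

```python
import logging

# Hypothetical AUDIT level for operational events (loads, drops, compactions),
# placed above ERROR so audit records survive aggressive level filtering.
AUDIT = 45
logging.addLevelName(AUDIT, "AUDIT")

def audit(logger: logging.Logger, msg: str, *args) -> None:
    """Record an operational (bookkeeping) event at the AUDIT level."""
    logger.log(AUDIT, msg, *args)

logger = logging.getLogger("carbon.scan")
logger.setLevel(logging.INFO)

# Diagnostic detail about an internal implementation choice fits INFO ...
logger.info("Row based raw collector is used to scan and collect the data")
# ... while AUDIT is reserved for user-visible operations.
audit(logger, "Data load finished for table %s", "default.student")
```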




[GitHub] incubator-carbondata pull request #511: [CARBONDATA-584]added validation for...

2017-01-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/511#discussion_r95702215
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSource.scala ---
@@ -108,6 +111,9 @@ class CarbonSource extends CreatableRelationProvider
 
 val dbName: String = parameters.getOrElse("dbName", 
CarbonCommonConstants.DATABASE_DEFAULT_NAME)
 val tableName: String = parameters.getOrElse("tableName", 
"default_table")
+if(tableName.isEmpty || tableName.contains("")) {
--- End diff --

I think you can use `StringUtils.isBlank` utility function
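The condition under review has a subtle bug beyond style: every string contains the empty string, so `tableName.contains("")` is always true and the branch is taken for every table name. A blank check should test for empty-or-whitespace-only input, which is exactly what `StringUtils.isBlank` does. The same pitfall, illustrated in Python:

```python
def contains_empty(s: str) -> bool:
    # Mirrors tableName.contains(""): the empty string is a substring of
    # every string, so this is always True.
    return "" in s

def is_blank(s) -> bool:
    # Mirrors StringUtils.isBlank: None, empty, or whitespace-only.
    return s is None or s.strip() == ""

assert contains_empty("students")  # always True -- the bug
assert contains_empty("")
assert is_blank("   ") and is_blank("") and is_blank(None)
assert not is_blank("students")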




[GitHub] incubator-carbondata issue #525: [CARBONDATA-628] Fixed measure selection wi...

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/525
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/561/





[jira] [Updated] (CARBONDATA-628) Issue when measure selection without table order gives wrong result with vectorized reader enabled

2017-01-11 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-628:
---
Affects Version/s: 1.0.0-incubating

> Issue when measure selection without table order gives wrong result with 
> vectorized reader enabled
> ---
>
> Key: CARBONDATA-628
> URL: https://issues.apache.org/jira/browse/CARBONDATA-628
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.0.0-incubating
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Minor
>
> If the table is created with the measure order m1, m2 and the user selects the 
> measures as m2, m1, then it returns a wrong result with the vectorized reader 
> enabled.





[jira] [Created] (CARBONDATA-628) Issue when measure selection without table order gives wrong result with vectorized reader enabled

2017-01-11 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-628:
--

 Summary: Issue when measure selection without table order gives 
wrong result with vectorized reader enabled
 Key: CARBONDATA-628
 URL: https://issues.apache.org/jira/browse/CARBONDATA-628
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala
Assignee: Ravindra Pesala
Priority: Minor


If the table is created with the measure order m1, m2 and the user selects the 
measures as m2, m1, then it returns a wrong result with the vectorized reader 
enabled.
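The class of bug described can be sketched as follows (hypothetical Python stand-ins, not the actual vectorized reader): if the reader fills output columns in the table's storage order instead of mapping each projected column to its storage position, selecting m2, m1 returns m1's values under m2's header.

```python
# Columnar batch stored in table (creation) order: m1 first, then m2.
storage_order = ["m1", "m2"]
batch = {"m1": [1, 2, 3], "m2": [10, 20, 30]}

def read_buggy(projection):
    # Bug: ignores the requested order and emits columns in storage order.
    return [batch[name] for name in storage_order[:len(projection)]]

def read_fixed(projection):
    # Fix: look up each projected column by name.
    return [batch[name] for name in projection]

print(read_buggy(["m2", "m1"]))  # [[1, 2, 3], [10, 20, 30]] -- wrong
print(read_fixed(["m2", "m1"]))  # [[10, 20, 30], [1, 2, 3]] -- correct
```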





[GitHub] incubator-carbondata pull request #525: Fixed measure selection with out tab...

2017-01-11 Thread ravipesala
GitHub user ravipesala opened a pull request:

https://github.com/apache/incubator-carbondata/pull/525

Fixed measure selection without table order gives wrong result with 
vectorized reader enabled

If the table is created with the measure order m1, m2 and the user selects the 
measures as m2, m1, then it returns a wrong result with the vectorized reader 
enabled.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravipesala/incubator-carbondata 
spark1.6-compilationissue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/525.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #525


commit 50ee3ecf1f2a6e496d052ee95b9334522574a824
Author: ravipesala 
Date:   2017-01-11T17:17:42Z

Fixed measure selection with out table order gives wrong result






[GitHub] incubator-carbondata issue #524: [CARBONDATA-627]fix union test case for spa...

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/524
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/560/





[GitHub] incubator-carbondata pull request #524: [CARBONDATA-627]fix union test case ...

2017-01-11 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/524

 [CARBONDATA-627]fix union test case for spark2

Analysis:
The union test case fails in spark2: the result of the union query is twice the 
result of the left query.

Root cause:
CarbonLateDecodeRule uses only the union.children.head plan to build every 
CarbonDictionaryTempDecoder.

Changes:
Use each child's own plan to build its CarbonDictionaryTempDecoder.
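The change can be sketched abstractly (hypothetical Python stand-ins; the real rule operates on Spark logical plans): build one decoder per union child from that child's own plan instead of reusing union.children.head for every branch.

```python
class Union:
    def __init__(self, children):
        self.children = children

def build_decoder(child_plan):
    # Stand-in for constructing a CarbonDictionaryTempDecoder for one branch.
    return f"decoder({child_plan})"

def decoders_buggy(union):
    # Root cause: every branch gets a decoder built from the first child,
    # so the left query's output is effectively duplicated.
    head = union.children[0]
    return [build_decoder(head) for _ in union.children]

def decoders_fixed(union):
    # Change: use each child's own plan.
    return [build_decoder(c) for c in union.children]

u = Union(["left_plan", "right_plan"])
print(decoders_buggy(u))  # ['decoder(left_plan)', 'decoder(left_plan)']
print(decoders_fixed(u))  # ['decoder(left_plan)', 'decoder(right_plan)']
```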

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata fixUnionTestCase

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/524.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #524


commit 0abc4f8f1fe6cfe0e8fe8842f7b7ba40f1e191a7
Author: QiangCai 
Date:   2017-01-11T15:47:25Z

fixUnionTestCase






[jira] [Created] (CARBONDATA-627) Fix Union unit test case for spark2

2017-01-11 Thread QiangCai (JIRA)
QiangCai created CARBONDATA-627:
---

 Summary: Fix Union unit test case for spark2
 Key: CARBONDATA-627
 URL: https://issues.apache.org/jira/browse/CARBONDATA-627
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 1.0.0-incubating
Reporter: QiangCai
Assignee: QiangCai
Priority: Minor
 Fix For: 1.0.0-incubating


UnionTestCase fails in spark2; we should fix it.





[jira] [Commented] (CARBONDATA-623) If we drop table after this condition ---(Firstly we load data in table with single pass true and use kettle false and then in same table load data 2nd time with si

2017-01-11 Thread Babulal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818650#comment-15818650
 ] 

Babulal commented on CARBONDATA-623:


Hi,
Can you please refer to CARBONDATA-595 (Drop Table for carbon throws NPE)?
It seems to be the same issue.

Thanks
Babu

> If we drop table after this condition ---(Firstly we load data in table with 
> single pass true and use kettle false and then in same table load data 2nd 
> time with single pass true and use kettle false ), it is throwing Error: 
> java.lang.NullPointerException
> ---
>
> Key: CARBONDATA-623
> URL: https://issues.apache.org/jira/browse/CARBONDATA-623
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: Payal
>Priority: Minor
> Attachments: 7000_UniqData.csv
>
>
> 1. First we load data into the table with single pass true and use kettle false; 
> the data loads successfully and we get the result set properly.
> 2. Then we load data into the same table again with single pass true and use 
> kettle false; the data loads successfully and we get the result set properly.
> 3. But after that, if we drop the table, it throws a null pointer exception.
> Queries
> 0: jdbc:hive2://hadoop-master:1> CREATE TABLE uniqdata_INCLUDEDICTIONARY 
> (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
> timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
> decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, 
> Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 
> 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (1.13 seconds)
> 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 
> 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table 
> uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='false','USE_KETTLE'
>  ='false');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (22.814 seconds)
> 0: jdbc:hive2://hadoop-master:1> 
> 0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from 
> uniqdata_INCLUDEDICTIONARY ;
> +---+--+
> |  _c0  |
> +---+--+
> | 7002  |
> +---+--+
> 1 row selected (3.055 seconds)
> 0: jdbc:hive2://hadoop-master:1> select  count(CUST_NAME) from 
> uniqdata_INCLUDEDICTIONARY ;
> +---+--+
> |  _c0  |
> +---+--+
> | 7013  |
> +---+--+
> 1 row selected (0.366 seconds)
> 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 
> 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table 
> uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true','USE_KETTLE'
>  ='false');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (4.837 seconds)
> 0: jdbc:hive2://hadoop-master:1> select  count(CUST_NAME) from 
> uniqdata_INCLUDEDICTIONARY ;
> ++--+
> |  _c0   |
> ++--+
> | 14026  |
> ++--+
> 1 row selected (0.458 seconds)
> 0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from 
> uniqdata_INCLUDEDICTIONARY ;
> +---+--+
> |  _c0  |
> +---+--+
> | 7002  |
> +---+--+
> 1 row selected (3.173 seconds)
> 0: jdbc:hive2://hadoop-master:1> drop table uniqdata_includedictionary;
> Error: java.lang.NullPointerException (state=,code=0)
> Logs 
> WARN  11-01 12:56:52,722 - Lost task 0.0 in stage 61.0 (TID 1740, 
> hadoop-slave-2): FetchFailed(BlockManagerId(0, hadoop-slave-3, 45331), 
> shuffleId=22, mapId=0, reduceId=0, message=
> org.apache.spark.shuffle.FetchFailedException: Failed to connect to 
> hadoop-slave-3:45331
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:323)
>   at 
> org.apache.spark.storage.ShuffleBlockFetche

[GitHub] incubator-carbondata pull request #520: fix dependency issue for IntelliJ ID...

2017-01-11 Thread QiangCai
Github user QiangCai closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/520




[GitHub] incubator-carbondata issue #520: fix dependency issue for IntelliJ IDEA

2017-01-11 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/520
  
Closing this PR; I could not reproduce this issue.




[jira] [Created] (CARBONDATA-626) [Dataload] Dataloading is not working with delimiter set as "|"

2017-01-11 Thread SOURYAKANTA DWIVEDY (JIRA)
SOURYAKANTA DWIVEDY created CARBONDATA-626:
--

 Summary: [Dataload] Dataloading is not working with delimiter set 
as "|"
 Key: CARBONDATA-626
 URL: https://issues.apache.org/jira/browse/CARBONDATA-626
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.0.0-incubating
 Environment: 3 node cluster
Reporter: SOURYAKANTA DWIVEDY


Description: Data loading fails with the delimiter set as "|".
Steps:
> 1. Create table
> 2. Load data into table

Log :-
-
- create table DIM_TERMINAL 
(
ID int,
TAC String,
TER_BRAND_NAME String,
TER_MODEL_NAME String,
TER_MODENAME String,
TER_TYPE_ID String,
TER_TYPE_NAME_EN String,
TER_TYPE_NAME_CHN String,
TER_OSTYPE String,
TER_OS_TYPE_NAME String,
HSPASPEED String,
LTESPEED String,
VOLTE_FLAG String,
flag String
) stored by 'org.apache.carbondata.format' TBLPROPERTIES 
('DICTIONARY_INCLUDE'='TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');

- jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 
'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL.csv' INTO table DIM_TERMINAL1 
OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'= 
'ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
Error: java.lang.RuntimeException: Data loading failed. table not found: 
default.dim_terminal1 (state=,code=0)
0: jdbc:hive2://172.168.100.212:23040> LOAD DATA inpath 
'hdfs://hacluster/SEQIQ/IQ_DIM_TERMINAL1.csv' INTO table DIM_TERMINAL 
OPTIONS('DELIMITER'='|','USE_KETTLE'='false','QUOTECHAR'='','FILEHEADER'= 
'ID,TAC,TER_BRAND_NAME,TER_MODEL_NAME,TER_MODENAME,TER_TYPE_ID,TER_TYPE_NAME_EN,TER_TYPE_NAME_CHN,TER_OSTYPE,TER_OS_TYPE_NAME,HSPASPEED,LTESPEED,VOLTE_FLAG,flag');
Error: org.apache.spark.sql.AnalysisException: Reference 'D' is ambiguous, 
could be: D#4893, D#4907, D#4920, D#4935, D#4952, D#5025, D#5034.; 
(state=,code=0)

- csv raw details :  
103880|99000537|MI|2S H1SC 3C|2G/3G|0|SmartPhone|SmartPhone|4|Android|||1| 
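One plausible explanation for the ambiguous 'D' error above (a hypothetical illustration, not taken from the CarbonData source): Java's `String.split` takes a regular expression, and "|" is the regex alternation operator, so an unescaped pipe matches between every character and explodes the row into single-character tokens such as "D". Quoting the delimiter restores the real columns.

```java
import java.util.regex.Pattern;

public class PipeDelimiterDemo {
    public static void main(String[] args) {
        String row = "103880|99000537|MI";

        // Unescaped "|" is regex alternation: it matches the empty string
        // at every position, so each character becomes its own token.
        String[] broken = row.split("|");

        // Pattern.quote treats "|" as a literal, giving the real columns.
        String[] columns = row.split(Pattern.quote("|"));

        System.out.println(broken.length);   // one token per character
        System.out.println(columns.length);  // 3 columns
    }
}
```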




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-carbondata issue #523: [CARBONDATA-440] fixing no kettle issue for...

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/523
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/559/





[GitHub] incubator-carbondata issue #522: Update carbondata description and clean .pd...

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/522
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/558/





[GitHub] incubator-carbondata pull request #523: [CARBONDATA-440] fixing no kettle is...

2017-01-11 Thread ravikiran23
GitHub user ravikiran23 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/523

[CARBONDATA-440] fixing no kettle issue for IUD.

For IUD, the data load flow will be used, so in the NO-KETTLE case the data 
load needs to be handled.

The load count / segment count should be a String because in the compaction 
case it will be a value like 2.1.
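The segment-count note can be illustrated with a small sketch (hypothetical helper, not CarbonData's actual API): after compaction a segment id looks like "2.1", which no longer parses as an integer, so the field has to stay a String.

```java
public class SegmentIdDemo {
    // Returns true only when the id is a plain integer segment number.
    static boolean isPlainInteger(String segmentId) {
        try {
            Integer.parseInt(segmentId);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isPlainInteger("2"));   // normal load segment: parses
        System.out.println(isPlainInteger("2.1")); // compacted segment id: does not parse
    }
}
```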



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravikiran23/incubator-carbondata IUD-NO-KETTLE

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/523.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #523


commit 5dd98b38e332b08f11daeaa683950b90172e02a9
Author: ravikiran 
Date:   2017-01-09T13:28:13Z

fixing no kettle issue for IUD.
load count/ segment count should be string because in compaction case it 
will be 2.1






[jira] [Commented] (CARBONDATA-624) Complete CarbonData document to be present in git and the same needs to sync with the carbondata.apace.org and for further updates.

2017-01-11 Thread Liang Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818459#comment-15818459
 ] 

Liang Chen commented on CARBONDATA-624:
---

OK, thank you for starting this work.
One thing to note: please only put .md files in GitHub; adding other kinds of 
files, like pdf or text, to GitHub is not suggested.

> Complete CarbonData document to be present in git and the same needs to sync 
> with the carbondata.apace.org and for further updates.
> ---
>
> Key: CARBONDATA-624
> URL: https://issues.apache.org/jira/browse/CARBONDATA-624
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Gururaj Shetty
>Assignee: Gururaj Shetty
>
> The information about CarbonData is spread across git and cwiki, so we have 
> to merge all of it and create markdown files for each topic about 
> CarbonData.
> These markdown files will contain the complete information about CarbonData, 
> like Overview, Installation, Configuration, DDL, DML, Use cases and so on.
> This markdown information will also be synced to the website documentation - 
> carbondata.apace.org





[GitHub] incubator-carbondata pull request #522: Update carbondata description and cl...

2017-01-11 Thread chenliang613
GitHub user chenliang613 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/522

Update carbondata description and clean .pdf files

1. Update the CarbonData description to keep it consistent with apache.org
2. Clean .pdf files from GitHub; the meetup material will be put in cwiki.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenliang613/incubator-carbondata carbon_desc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/522.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #522


commit 363000abe1b8383586e0f2bf02f6df0f6c8bbb51
Author: chenliang613 
Date:   2017-01-11T14:12:13Z

update carbon description and clean .pdf files






[GitHub] incubator-carbondata issue #519: Update description,keep consistent with apa...

2017-01-11 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/incubator-carbondata/pull/519
  
Closing this PR and creating a new one.




[GitHub] incubator-carbondata pull request #519: Update description,keep consistent w...

2017-01-11 Thread chenliang613
Github user chenliang613 closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/519




[jira] [Created] (CARBONDATA-625) Abnormal behaviour of Int datatype

2017-01-11 Thread Geetika Gupta (JIRA)
Geetika Gupta created CARBONDATA-625:


 Summary: Abnormal behaviour of Int datatype
 Key: CARBONDATA-625
 URL: https://issues.apache.org/jira/browse/CARBONDATA-625
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 1.0.0-incubating
 Environment: Spark: 1.6  and hadoop: 2.6.5 
Reporter: Geetika Gupta
Priority: Minor
 Attachments: Screenshot from 2017-01-11 18-36-24.png, 
testMaxValueForBigInt.csv

I was trying to create a table with an int column and loaded data into it. 
Data loading completed successfully, but when I viewed the table's data, some 
of it was wrong. I was loading BigInt data into an int column, and every row 
of the int column was loaded with the first value of the CSV. Below are the 
details of the queries:

create table xyz(a int, b string)stored by 'carbondata';

Data load query:
LOAD DATA INPATH 'hdfs://localhost:54311/testFiles/testMaxValueForBigInt.csv' 
into table xyz OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','FILEHEADER'='a,b');

select query:
select * from xyz;

PFA the screenshot of the output and the csv file.
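For context, a generic Java illustration (not CarbonData's actual code path) of why out-of-range bigint values can silently turn into wrong int values: a narrowing cast keeps only the low 32 bits instead of failing.

```java
public class NarrowingDemo {
    public static void main(String[] args) {
        long bigintValue = 9223372036854775807L; // Long.MAX_VALUE: fits bigint, not int
        int narrowed = (int) bigintValue;        // narrowing keeps only the low 32 bits

        System.out.println(narrowed);            // -1, silently corrupted rather than an error
    }
}
```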










[GitHub] incubator-carbondata issue #521: [CARBONDATA-390] Support for float datatype

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/521
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/557/





[GitHub] incubator-carbondata issue #521: [CARBONDATA-390] Support for float datatype

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/521
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/556/





[GitHub] incubator-carbondata issue #521: [CARBONDATA-390] Support for float datatype

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/521
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/555/





[GitHub] incubator-carbondata pull request #521: [CARBONDATA-390] Support for float d...

2017-01-11 Thread phalodi
GitHub user phalodi opened a pull request:

https://github.com/apache/incubator-carbondata/pull/521

[CARBONDATA-390] Support for float datatype

- Support the float datatype in the carbon file format
- Ran all unit test cases with successful builds on Spark 1.6 and 2.1
- Ran style checks to remove check errors
- Updated the examples for the float datatype in Spark 1.6 and 2.1

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/phalodi/incubator-carbondata CARBONDATA-390

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/521.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #521


commit c44399b390b6bb79e7933319101e51812a4b7817
Author: sandy 
Date:   2017-01-11T09:37:38Z

support for float datatype






[jira] [Assigned] (CARBONDATA-624) Complete CarbonData document to be present in git and the same needs to sync with the carbondata.apace.org and for further updates.

2017-01-11 Thread Gururaj Shetty (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gururaj Shetty reassigned CARBONDATA-624:
-

Assignee: Gururaj Shetty

> Complete CarbonData document to be present in git and the same needs to sync 
> with the carbondata.apace.org and for further updates.
> ---
>
> Key: CARBONDATA-624
> URL: https://issues.apache.org/jira/browse/CARBONDATA-624
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Gururaj Shetty
>Assignee: Gururaj Shetty
>
> The information about CarbonData is spread across git and cwiki, so we have 
> to merge all of it and create markdown files for each topic about 
> CarbonData.
> These markdown files will contain the complete information about CarbonData, 
> like Overview, Installation, Configuration, DDL, DML, Use cases and so on.
> This markdown information will also be synced to the website documentation - 
> carbondata.apace.org





[jira] [Created] (CARBONDATA-624) Complete CarbonData document to be present in git and the same needs to sync with the carbondata.apace.org and for further updates.

2017-01-11 Thread Gururaj Shetty (JIRA)
Gururaj Shetty created CARBONDATA-624:
-

 Summary: Complete CarbonData document to be present in git and the 
same needs to sync with the carbondata.apace.org and for further updates.
 Key: CARBONDATA-624
 URL: https://issues.apache.org/jira/browse/CARBONDATA-624
 Project: CarbonData
  Issue Type: Improvement
Reporter: Gururaj Shetty


The information about CarbonData is spread across git and cwiki, so we have to 
merge all of it and create markdown files for each topic about CarbonData.
These markdown files will contain the complete information about CarbonData, 
like Overview, Installation, Configuration, DDL, DML, Use cases and so on.
This markdown information will also be synced to the website documentation - 
carbondata.apace.org





[GitHub] incubator-carbondata pull request #519: Update description,keep consistent w...

2017-01-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/519#discussion_r95543961
  
--- Diff: README.md ---
@@ -19,10 +19,7 @@
 
 
 
-Apache CarbonData(incubating) is a new big data file format for faster
-interactive query using advanced columnar storage, index, compression
-and encoding techniques to improve computing efficiency, in turn it will 
-help speedup queries an order of magnitude faster over PetaBytes of data. 
+Apache CarbonData(incubating) is an indexed columnar data format for 
faster analytics on big data platform like Apache Hadoop, Apache Spark and so 
on.
--- End diff --

please modify the description in pom.xml also




[GitHub] incubator-carbondata pull request #518: [CARBONDATA-622]unify file header re...

2017-01-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/518




[jira] [Resolved] (CARBONDATA-622) Should use the same fileheader reader for dict generation and data loading

2017-01-11 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-622.
-
Resolution: Fixed

> Should use the same fileheader reader for dict generation and data loading
> --
>
> Key: CARBONDATA-622
> URL: https://issues.apache.org/jira/browse/CARBONDATA-622
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> We can get the file header from the DDL command or from the CSV file.
> 1. If the file header comes from the DDL command, separate it by comma ",".
> 2. If the file header comes from the CSV file, separate it by the delimiter 
> specified in the DDL command.
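The rule described in the issue can be sketched as follows (hypothetical helper names, not the actual CarbonData class):

```java
import java.util.regex.Pattern;

public class HeaderSplitSketch {
    // A header given in the DDL command is always comma-separated; a header
    // read from the CSV file uses the DELIMITER option of the LOAD command.
    static String[] splitHeader(String header, boolean fromDdl, String csvDelimiter) {
        String delimiter = fromDdl ? "," : csvDelimiter;
        // Pattern.quote: treat delimiters like "|" literally, not as regex
        return header.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(splitHeader("a,b,c", true, "|").length);  // DDL header: 3 columns
        System.out.println(splitHeader("a|b|c", false, "|").length); // CSV header: 3 columns
    }
}
```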





[GitHub] incubator-carbondata issue #518: [CARBONDATA-622]unify file header reader

2017-01-11 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/518
  
LGTM




[GitHub] incubator-carbondata issue #519: Update description,keep consistent with apa...

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/519
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/554/





[GitHub] incubator-carbondata issue #519: Update description,keep consistent with apa...

2017-01-11 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/incubator-carbondata/pull/519
  
Build Success with Spark 1.5.2, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/553/





[GitHub] incubator-carbondata pull request #520: fix dependency issue for IntelliJ ID...

2017-01-11 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/520

fix dependency issue for IntelliJ IDEA

When using the spark-2.1 profile, the test cases of spark-common-test cannot 
be run in IntelliJ IDEA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixIdeaMavenIssue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/520.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #520


commit 74d4bf8933540348525a16fdba361e780fe0f494
Author: QiangCai 
Date:   2017-01-11T08:37:24Z

fix dependency issue for IntelliJ IDEA






[jira] [Updated] (CARBONDATA-623) If we drop table after this condition ---(Firstly we load data in table with single pass true and use kettle false and then in same table load data 2nd time with sing

2017-01-11 Thread Payal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Payal updated CARBONDATA-623:
-
 Priority: Minor  (was: Major)
Affects Version/s: 1.0.0-incubating
   Attachment: 7000_UniqData.csv

> If we drop table after this condition ---(Firstly we load data in table with 
> single pass true and use kettle false and then in same table load data 2nd 
> time with single pass true and use kettle false ), it is throwing Error: 
> java.lang.NullPointerException
> ---
>
> Key: CARBONDATA-623
> URL: https://issues.apache.org/jira/browse/CARBONDATA-623
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: Payal
>Priority: Minor
> Attachments: 7000_UniqData.csv
>
>
> 1. First we load data into the table with single pass true and use kettle 
> false; the data loads successfully and we get the result set properly.
> 2. Then we load data into the same table again with single pass true and use 
> kettle false; the data loads successfully and we get the result set properly.
> 3. But after that, if we drop the table, it throws a NullPointerException.
> Queries
> 0: jdbc:hive2://hadoop-master:1> CREATE TABLE uniqdata_INCLUDEDICTIONARY 
> (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
> timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
> decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, 
> Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 
> 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (1.13 seconds)
> 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 
> 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table 
> uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='false','USE_KETTLE'
>  ='false');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (22.814 seconds)
> 0: jdbc:hive2://hadoop-master:1> 
> 0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from 
> uniqdata_INCLUDEDICTIONARY ;
> +---+--+
> |  _c0  |
> +---+--+
> | 7002  |
> +---+--+
> 1 row selected (3.055 seconds)
> 0: jdbc:hive2://hadoop-master:1> select  count(CUST_NAME) from 
> uniqdata_INCLUDEDICTIONARY ;
> +---+--+
> |  _c0  |
> +---+--+
> | 7013  |
> +---+--+
> 1 row selected (0.366 seconds)
> 0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 
> 'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table 
> uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true','USE_KETTLE'
>  ='false');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (4.837 seconds)
> 0: jdbc:hive2://hadoop-master:1> select  count(CUST_NAME) from 
> uniqdata_INCLUDEDICTIONARY ;
> ++--+
> |  _c0   |
> ++--+
> | 14026  |
> ++--+
> 1 row selected (0.458 seconds)
> 0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from 
> uniqdata_INCLUDEDICTIONARY ;
> +---+--+
> |  _c0  |
> +---+--+
> | 7002  |
> +---+--+
> 1 row selected (3.173 seconds)
> 0: jdbc:hive2://hadoop-master:1> drop table uniqdata_includedictionary;
> Error: java.lang.NullPointerException (state=,code=0)
> Logs 
> WARN  11-01 12:56:52,722 - Lost task 0.0 in stage 61.0 (TID 1740, 
> hadoop-slave-2): FetchFailed(BlockManagerId(0, hadoop-slave-3, 45331), 
> shuffleId=22, mapId=0, reduceId=0, message=
> org.apache.spark.shuffle.FetchFailedException: Failed to connect to 
> hadoop-slave-3:45331
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:323)
>   at 
> org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:300)

[jira] [Created] (CARBONDATA-623) If we drop table after this condition ---(Firstly we load data in table with single pass true and use kettle false and then in same table load data 2nd time with sing

2017-01-11 Thread Payal (JIRA)
Payal created CARBONDATA-623:


 Summary: If we drop table after this condition ---(Firstly we load 
data in table with single pass true and use kettle false and then in same table 
load data 2nd time with single pass true and use kettle false ), it is throwing 
Error: java.lang.NullPointerException 
 Key: CARBONDATA-623
 URL: https://issues.apache.org/jira/browse/CARBONDATA-623
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Reporter: Payal


1. First we load data into the table with single pass true and use kettle 
false; the data loads successfully and we get the result set properly.
2. Then we load data into the same table again with single pass true and use 
kettle false; the data loads successfully and we get the result set properly.
3. But after that, if we drop the table, it throws a NullPointerException.

Queries

0: jdbc:hive2://hadoop-master:1> CREATE TABLE uniqdata_INCLUDEDICTIONARY 
(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, 
Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 
'org.apache.carbondata.format' 
TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (1.13 seconds)
0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 
'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table 
uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='false','USE_KETTLE'
 ='false');
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (22.814 seconds)
0: jdbc:hive2://hadoop-master:1> 
0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from 
uniqdata_INCLUDEDICTIONARY ;
+---+--+
|  _c0  |
+---+--+
| 7002  |
+---+--+
1 row selected (3.055 seconds)
0: jdbc:hive2://hadoop-master:1> select  count(CUST_NAME) from 
uniqdata_INCLUDEDICTIONARY ;
+---+--+
|  _c0  |
+---+--+
| 7013  |
+---+--+
1 row selected (0.366 seconds)
0: jdbc:hive2://hadoop-master:1> LOAD DATA INPATH 
'hdfs://hadoop-master:54311/data/uniqdata/7000_UniqData.csv' into table 
uniqdata_INCLUDEDICTIONARY OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','SINGLE_PASS'='true','USE_KETTLE'
 ='false');
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (4.837 seconds)
0: jdbc:hive2://hadoop-master:1> select  count(CUST_NAME) from 
uniqdata_INCLUDEDICTIONARY ;
++--+
|  _c0   |
++--+
| 14026  |
++--+
1 row selected (0.458 seconds)
0: jdbc:hive2://hadoop-master:1> select count (distinct CUST_NAME) from 
uniqdata_INCLUDEDICTIONARY ;
+---+--+
|  _c0  |
+---+--+
| 7002  |
+---+--+
1 row selected (3.173 seconds)
0: jdbc:hive2://hadoop-master:1> drop table uniqdata_includedictionary;
Error: java.lang.NullPointerException (state=,code=0)




Logs 

WARN  11-01 12:56:52,722 - Lost task 0.0 in stage 61.0 (TID 1740, 
hadoop-slave-2): FetchFailed(BlockManagerId(0, hadoop-slave-3, 45331), 
shuffleId=22, mapId=0, reduceId=0, message=
org.apache.spark.shuffle.FetchFailedException: Failed to connect to 
hadoop-slave-3:45331
at 
org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:323)
at 
org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:300)
at 
org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:51)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at 
org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
at 
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at 
org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:504)
at 
org.apache.spark.sql.execution.aggregate.Tungs