[jira] [Created] (CARBONDATA-293) Add scan_blocklet_num for query statistics

2016-10-09 Thread zhangshunyu (JIRA)
zhangshunyu created CARBONDATA-293:
--

 Summary: Add scan_blocklet_num for query statistics
 Key: CARBONDATA-293
 URL: https://issues.apache.org/jira/browse/CARBONDATA-293
 Project: CarbonData
  Issue Type: Improvement
  Components: data-query
Affects Versions: 0.1.1-incubating
Reporter: zhangshunyu
Assignee: zhangshunyu
 Fix For: 0.2.0-incubating


Add scan_blocklet_num for query statistics



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-carbondata pull request #204: [CARBONDATA-280]Fix the bug that whe...

2016-10-09 Thread Zhangshunyu
GitHub user Zhangshunyu reopened a pull request:

https://github.com/apache/incubator-carbondata/pull/204

[CARBONDATA-280]Fix the bug that when table properties is repeated it only 
set the last one

## Why raise this pr?
When a table property is repeated, only the last occurrence takes effect. For example:
```
CREATE TABLE IF NOT EXISTS carbontable
(ID Int, date Timestamp, country String,
name String, phonetype String, serialname String, salary Int)
STORED BY 'carbondata'
TBLPROPERTIES('DICTIONARY_EXCLUDE'='country','DICTIONARY_INCLUDE'='ID',
'DICTIONARY_EXCLUDE'='phonetype', 'DICTIONARY_INCLUDE'='salary')
```
Because the properties are stored in a map, a later entry overwrites an 
earlier entry with the same key: only salary is set for DICTIONARY_INCLUDE 
and only phonetype is set for DICTIONARY_EXCLUDE.

## How to solve?
**We should enforce a strict syntax check so that each property appears once, 
i.e. 'DICTIONARY_EXCLUDE'='country,phonetype', 'DICTIONARY_INCLUDE'='ID,salary'.** 
If a table property is repeated, throw a MalformedCarbonCommandException 
telling the user that the table property is repeated, so that the user does 
not run a statement that behaves differently from what was intended.
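
A minimal sketch of the duplicate check described above, assuming the parser hands over the properties as a list of (key, value) pairs before they are collected into a map. The class and method names are hypothetical, the case-insensitive key comparison is an assumption, and `IllegalArgumentException` stands in for MalformedCarbonCommandException so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the CarbonData code base. */
final class TablePropertyCheck {

    /**
     * Collects parsed (key, value) pairs into a map and rejects any key that
     * occurs more than once, instead of silently keeping the last value.
     */
    static Map<String, String> toUniqueMap(List<String[]> pairs) {
        Map<String, String> result = new LinkedHashMap<>();
        List<String> repeated = new ArrayList<>();
        for (String[] pair : pairs) {
            // Assumption: property names are compared case-insensitively.
            String key = pair[0].trim().toLowerCase(Locale.ROOT);
            if (result.containsKey(key)) {
                if (!repeated.contains(key)) {
                    repeated.add(key);
                }
            } else {
                result.put(key, pair[1]);
            }
        }
        if (!repeated.isEmpty()) {
            // Stand-in for MalformedCarbonCommandException.
            throw new IllegalArgumentException(
                "Table properties are repeated: " + String.join(",", repeated));
        }
        return result;
    }
}
```

With the CREATE TABLE statement above, this check would report both DICTIONARY_EXCLUDE and DICTIONARY_INCLUDE as repeated rather than keeping only phonetype and salary.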

## How to test?
Pass the existing test cases and the new test case for this bug.
## Test Result
CI has passed:
http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/354/testReport/

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Zhangshunyu/incubator-carbondata tbprop

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/204.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #204


commit 3e0030e04bff9d11f87471684b4b7b7a8d8b6209
Author: Zhangshunyu 
Date:   2016-09-28T04:50:01Z

Fix the bug that when table properties is repeated it only set the last one

commit 1828b2b78b3de9f7fa127cfcc17bf24d6c138640
Author: Zhangshunyu 
Date:   2016-09-28T05:11:12Z

Fix the test case

commit 876828400f02bb68190222e38898ccec29bb2f04
Author: Zhangshunyu 
Date:   2016-09-28T08:34:36Z

Simply

commit a7a03508b494701ec641b66449a2f0df81e2fde0
Author: Zhangshunyu 
Date:   2016-09-28T08:38:57Z

Simply




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #215: [WIP][CARBONDATA-2] Remove kettle fr...

2016-10-09 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/215#discussion_r82504432
  
--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/csvinput/CustomArrayWritable.java ---
@@ -0,0 +1,51 @@
+package org.apache.carbondata.hadoop.csvinput;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.nio.charset.Charset;
+import java.util.Arrays;
+
+import org.apache.hadoop.io.Writable;
+
+/**
+ * Created by root1 on 16/4/16.
--- End diff --

please remove




[GitHub] incubator-carbondata pull request #216: [CARBONDATA-289]Support MB/M for tab...

2016-10-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/216




[GitHub] incubator-carbondata pull request #216: [CARBONDATA-289]Support MB/M for tab...

2016-10-09 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/216#discussion_r82523320
  
--- Diff: docs/DDL-Operations-on-Carbon.md ---
@@ -67,6 +67,14 @@ Here, DICTIONARY_EXCLUDE will exclude dictionary 
creation. This is applicable fo
   ```ruby
   TBLPROPERTIES 
("COLUMN_GROUPS"="(column1,column3),(Column4,Column5,Column6)") 
   ```
+ - **Table Block Size Configuration**
+
+   The block size of one table's files on hdfs can be defined using an int 
value whose size is in MB, the range is form 1MB to 2048MB and the default 
value is 1024MB, if user didn't define this values in ddl, it would use default 
value to set.
+
+  ```ruby
+  TBLPROPERTIES ("TABLE_BLOCKSIZE"="512 MB")
--- End diff --

ok




[GitHub] incubator-carbondata pull request #216: [CARBONDATA-289]Support MB/M for tab...

2016-10-09 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/216#discussion_r82523040
  
--- Diff: docs/DDL-Operations-on-Carbon.md ---
@@ -67,6 +67,14 @@ Here, DICTIONARY_EXCLUDE will exclude dictionary 
creation. This is applicable fo
   ```ruby
   TBLPROPERTIES 
("COLUMN_GROUPS"="(column1,column3),(Column4,Column5,Column6)") 
   ```
+ - **Table Block Size Configuration**
+
+   The block size of one table's files on hdfs can be defined using an int 
value whose size is in MB, the range is form 1MB to 2048MB and the default 
value is 1024MB, if user didn't define this values in ddl, it would use default 
value to set.
+
+  ```ruby
+  TBLPROPERTIES ("TABLE_BLOCKSIZE"="512 MB")
--- End diff --

please remove the space, I think `512MB` is ok
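
To make the documented rule concrete, here is a minimal sketch of parsing a TABLE_BLOCKSIZE value, using the range (1MB to 2048MB) and default (1024MB) stated in the diff above. The class and method names are hypothetical, and accepting the `MB`, `M`, and bare-number spellings is an assumption based on the PR title. Rejecting a space between the number and the unit is a choice made for this sketch, not confirmed behavior.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; accepted spellings are assumptions. */
final class BlockSizeParser {
    static final int MIN_MB = 1;
    static final int MAX_MB = 2048;
    static final int DEFAULT_MB = 1024;

    /** Returns the block size in MB, or the default when the property is not set. */
    static int parseMb(String raw) {
        if (raw == null) {
            return DEFAULT_MB;
        }
        String value = raw.trim().toUpperCase(Locale.ROOT);
        // Strip an optional unit suffix, so "512MB", "512M" and "512" are equivalent.
        if (value.endsWith("MB")) {
            value = value.substring(0, value.length() - 2);
        } else if (value.endsWith("M")) {
            value = value.substring(0, value.length() - 1);
        }
        int mb;
        try {
            // A remaining space (as in "512 MB") makes parseInt fail, so it is rejected.
            mb = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid TABLE_BLOCKSIZE value: " + raw, e);
        }
        if (mb < MIN_MB || mb > MAX_MB) {
            throw new IllegalArgumentException(
                "TABLE_BLOCKSIZE must be between " + MIN_MB + "MB and " + MAX_MB + "MB: " + raw);
        }
        return mb;
    }
}
```

In this sketch `parseMb("512MB")` and `parseMb("512M")` both return 512, `parseMb(null)` returns 1024, and `parseMb("512 MB")` throws.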




[GitHub] incubator-carbondata pull request #221: [CARBONDATA-291] All STATISTIC log s...

2016-10-09 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/221




[jira] [Created] (CARBONDATA-292) add COLUMNDICT operation info in DML operation guide

2016-10-09 Thread Jay (JIRA)
Jay created CARBONDATA-292:
--

 Summary: add COLUMNDICT operation info in DML operation guide
 Key: CARBONDATA-292
 URL: https://issues.apache.org/jira/browse/CARBONDATA-292
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jay
Priority: Minor


DML-Operations-on-Carbon.md has no guide for the COLUMNDICT operation, so one 
needs to be added.





[GitHub] incubator-carbondata pull request #223: [WIP] add infomation for COLUMNDICT ...

2016-10-09 Thread Jay357089
GitHub user Jay357089 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/223

[WIP] add infomation for COLUMNDICT operation

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[CARBONDATA-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).
 - [ ] Testing done
 
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- What manual testing you have done?
- Any additional information to help reviewers in testing this 
change.
 
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
 
---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Jay357089/incubator-carbondata addColDict

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #223


commit edecf440da8f1359bea2e3b62251839403f149d8
Author: Jay357089 
Date:   2016-10-09T09:22:22Z

add COLUMNDICT info






[jira] [Created] (CARBONDATA-291) Some STATISTIC log still present even though disable STATISTIC

2016-10-09 Thread Gin-zhj (JIRA)
Gin-zhj created CARBONDATA-291:
--

 Summary: Some STATISTIC log still present even though disable 
STATISTIC
 Key: CARBONDATA-291
 URL: https://issues.apache.org/jira/browse/CARBONDATA-291
 Project: CarbonData
  Issue Type: Bug
Reporter: Gin-zhj
Assignee: Gin-zhj
Priority: Minor


The following STATISTIC log is still printed even though STATISTIC is disabled:

" STATISTIC Time taken for Carbon Optimizer to optimize:  26 "

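One way to avoid this kind of leak is to route every statistic message through a single recorder that checks the enable flag, which is what the title of pull request #221 ("use recoder for all statistic log") suggests. The sketch below only illustrates that idea. The class, its constructor flag, and the in-memory sink are hypothetical and are not CarbonData's recorder API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical recorder for illustration; not CarbonData's actual class. */
final class StatisticRecorderSketch {
    private final boolean enabled;
    // An in-memory sink stands in for the real logger.
    private final List<String> sink = new ArrayList<>();

    StatisticRecorderSketch(boolean enabled) {
        this.enabled = enabled;
    }

    /** Records one statistic line, or does nothing when statistics are disabled. */
    void record(String message, long value) {
        if (!enabled) {
            return;
        }
        sink.add("STATISTIC " + message + ": " + value);
    }

    /** Returns a copy of everything recorded so far. */
    List<String> emitted() {
        return new ArrayList<>(sink);
    }
}
```

Because the flag is checked in one place, a caller such as the optimizer timing code cannot bypass it by writing to the logger directly.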





[GitHub] incubator-carbondata pull request #222: [CARBONDATA-221] Fix the bug of inve...

2016-10-09 Thread Zhangshunyu
GitHub user Zhangshunyu opened a pull request:

https://github.com/apache/incubator-carbondata/pull/222

[CARBONDATA-221] Fix the bug of inverted index that store inverted index in 
metadata.

## Why raise this pr?
1. Problem: In the current code, the inverted index setting from the DDL is 
not persisted to the store, so after the cluster restarts, queries might mismatch.
2. To work around problem 1, the current code always sets the inverted index 
flag to true, so the inverted index can no longer be configured. This is not 
reasonable; we should fix the problem at its root cause.

## How to solve?
Use the Encoding as the identifier to check whether the inverted index is 
used. This Encoding is already part of the thrift format, so the thrift 
format does not need to be modified.
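
The idea can be sketched as follows: the flag is not stored as a separate field but derived from the encoding list that is already persisted with each column. The enum and helper below are hypothetical stand-ins written for this example. They are not the generated thrift classes, and only two encoding values are modelled.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical stand-ins for the column schema types; not CarbonData's classes. */
final class InvertedIndexCheck {
    enum Encoding { DICTIONARY, INVERTED_INDEX }

    /** Builds the encoding set that would be persisted for a column. */
    static Set<Encoding> encodingsFor(boolean useDictionary, boolean useInvertedIndex) {
        Set<Encoding> encodings = EnumSet.noneOf(Encoding.class);
        if (useDictionary) {
            encodings.add(Encoding.DICTIONARY);
        }
        if (useInvertedIndex) {
            encodings.add(Encoding.INVERTED_INDEX);
        }
        return encodings;
    }

    /** Derives the flag from the persisted encodings, e.g. after a restart. */
    static boolean usesInvertedIndex(Set<Encoding> persisted) {
        return persisted.contains(Encoding.INVERTED_INDEX);
    }
}
```

Since the setting travels with data that is already written to the store, it survives a restart without adding a new field to the schema.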

## How to test?
Pass all the test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Zhangshunyu/incubator-carbondata fix_index

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/222.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #222


commit c27a8a9e33529e53020c477c70d0c079724070d2
Author: Zhangshunyu 
Date:   2016-09-08T07:48:03Z

Save useInvertedIndex info into thrift store

commit 3c8da81869e1a8eca8bdde3d82bc0a9d185bdc3d
Author: Zhangshunyu 
Date:   2016-09-08T07:48:15Z

Save useInvertedIndex info into thrift store

commit b834e4889f5c5eadcee1c232c1a6070df0c1bf60
Author: Zhangshunyu 
Date:   2016-09-08T09:46:12Z

Fix the judge of no_dic_col

commit e8b338c2a7a9e3e28a591bdfe57a5f704f1496d6
Author: Zhangshunyu 
Date:   2016-09-08T10:04:20Z

add commont






[GitHub] incubator-carbondata pull request #221: use recoder for all statistic log

2016-10-09 Thread foryou2030
GitHub user foryou2030 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/221

use recoder for all statistic log




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/foryou2030/incubator-carbondata off_stat

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/221.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #221


commit 4e1154d538a96d8862a2b66815cb5db1dc7e3ed5
Author: foryou2030 
Date:   2016-10-09T08:41:20Z

use recoder for all statistic log






[jira] [Created] (CARBONDATA-290) When part of table name has database name, then query will show segment path not found

2016-10-09 Thread Jay (JIRA)
Jay created CARBONDATA-290:
--

 Summary: When part of table name has database name, then query 
will show  segment path not found
 Key: CARBONDATA-290
 URL: https://issues.apache.org/jira/browse/CARBONDATA-290
 Project: CarbonData
  Issue Type: Bug
Reporter: Jay
Assignee: Jay
Priority: Minor


When the table name contains the database name as part of it, for example,
 in the default database,  CREATE TABLE IF NOT EXISTS t3default 

then after loading and querying, we get an exception that the segment is not found.





[GitHub] incubator-carbondata pull request #220: [WIP]Fix path not found when part of...

2016-10-09 Thread Jay357089
GitHub user Jay357089 opened a pull request:

https://github.com/apache/incubator-carbondata/pull/220

[WIP]Fix path not found when part of table name  has database Name




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Jay357089/incubator-carbondata fixpath

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/220.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #220


commit 0743dda5653a00122e6e36fcec74f7aa0ad765c0
Author: Jay357089 
Date:   2016-10-09T06:40:02Z

fix Path Not Found when Table has ]DbName



