date:20170726

[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-26 Thread xuchuanyin

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1198#discussion_r129765977
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
 ---
@@ -145,21 +146,31 @@ public static void 
renameBadRecordsFromInProgressToNormal(
   /**
* This method will be used to delete sort temp location is it is exites
*/
-  public static void deleteSortLocationIfExists(String tempFileLocation) {
-// create new temp file location where this class
-//will write all the temp files
-File file = new File(tempFileLocation);
-
-if (file.exists()) {
-  try {
-CarbonUtil.deleteFoldersAndFiles(file);
-  } catch (IOException | InterruptedException e) {
-LOGGER.error(e);
+  public static void deleteSortLocationIfExists(String[] locations) {
+for (String loc : locations) {
+  File file = new File(loc);
+  if (file.exists()) {
+try {
+  CarbonUtil.deleteFoldersAndFiles(file);
+} catch (IOException | InterruptedException e) {
+  LOGGER.error(e, "Failed to delete " + loc);
+}
   }
 }
   }
 
   /**
+   * This method will be used to create dirs
+   * @param locations locations to create
+   */
+  public static void createLocations(String[] locations) {
+for (String loc : locations) {
+  if (new File(loc).mkdirs()) {
--- End diff --

:+1: nice


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-26 Thread xuchuanyin

Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1198#discussion_r129765796
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1296,6 +1296,18 @@
   public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
   "carbon.lease.recovery.retry.interval";
 
+  /**
+   * whether to use multi directories when loading data,
+   * the main purpose is to avoid single-disk-hot-spot
+   */
+  @CarbonProperty
+  public static final String CARBON_USE_MULTI_TEMP_DIR = 
"carbon.use.multiple.temp.dir";
+
+  /**
+   * default value for multi temp dir
+   */
+  public static final String CARBON_USING_MULTI_TEMP_DIR_DEFAULT = "false";
--- End diff --

:+1: fixed



[jira] [Created] (CARBONDATA-1335) Duplicated & time-consuming method call found in query

2017-07-26 Thread xuchuanyin (JIRA)

xuchuanyin created CARBONDATA-1335:
--

 Summary: Duplicated & time-consuming method call found in query
 Key: CARBONDATA-1335
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1335
 Project: CarbonData
  Issue Type: Improvement
  Components: data-query
Affects Versions: 1.1.1
Reporter: xuchuanyin
Priority: Minor


# Scenario

We ran 14 concurrent queries on CarbonData. The queries were identical but
targeted different tables. We observed the following:

+ A single query took about 5s;
+ In the concurrent scenario, each query took about 15s;

By adding checkpoints to the log, we found a large latency before the query
jobs started in Spark.

# Analysis

When a query is fired, CarbonData first does some work on the client side,
including parsing and analyzing plans and preparing the filtered blocks and
inputSplits. CarbonData then submits the query job to Spark.

In the concurrent scenario this first step took about 7s, but it took less
than 1s in the single-query scenario.
Studying the related code, we found that the most time-consuming method call
was `CarbonSessionCatalog.lookupRelation`. Inside this method,
`super.lookupRelation` was called twice, and each call took about 3s.

# Solution

CarbonData needs to call `super.lookupRelation` only once, so the duplicated
call should be removed.

I've tested this in my environment and it works well. In the concurrent
scenario, each query now takes about 12s (3s saved by the improvement).
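
The duplicated call described above can be illustrated with a minimal sketch. `SlowCatalog`, `resolveTwice` and `resolveOnce` are hypothetical names invented for this example; they are not CarbonData or Spark classes, and the real `CarbonSessionCatalog` code is not shown in this message. The sketch only demonstrates the general fix: keep the result of an expensive lookup in a local variable instead of calling it again.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LookupOnce {
  /** Hypothetical stand-in for an expensive catalog lookup; it counts its calls. */
  static final class SlowCatalog {
    final AtomicInteger calls = new AtomicInteger();

    String lookupRelation(String table) {
      calls.incrementAndGet();
      return "relation:" + table;
    }
  }

  /** Before: the expensive lookup runs once for the check and again for the result. */
  static String resolveTwice(SlowCatalog catalog, String table) {
    if (catalog.lookupRelation(table).startsWith("relation:")) {
      return catalog.lookupRelation(table);
    }
    return null;
  }

  /** After: the lookup runs once and its result is reused. */
  static String resolveOnce(SlowCatalog catalog, String table) {
    String relation = catalog.lookupRelation(table);
    return relation.startsWith("relation:") ? relation : null;
  }

  public static void main(String[] args) {
    SlowCatalog before = new SlowCatalog();
    resolveTwice(before, "t1");
    SlowCatalog after = new SlowCatalog();
    resolveOnce(after, "t1");
    System.out.println(before.calls.get() + " call(s) before, " + after.calls.get() + " after");
  }
}
```

If each lookup costs about 3s, as reported above, dropping one call accounts for the 15s to 12s difference.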



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3209/




[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/614/




[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-26 Thread sraghunandan

Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1198#discussion_r129753971
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
 ---
@@ -145,21 +146,31 @@ public static void 
renameBadRecordsFromInProgressToNormal(
   /**
* This method will be used to delete sort temp location is it is exites
*/
-  public static void deleteSortLocationIfExists(String tempFileLocation) {
-// create new temp file location where this class
-//will write all the temp files
-File file = new File(tempFileLocation);
-
-if (file.exists()) {
-  try {
-CarbonUtil.deleteFoldersAndFiles(file);
-  } catch (IOException | InterruptedException e) {
-LOGGER.error(e);
+  public static void deleteSortLocationIfExists(String[] locations) {
+for (String loc : locations) {
+  File file = new File(loc);
+  if (file.exists()) {
+try {
+  CarbonUtil.deleteFoldersAndFiles(file);
+} catch (IOException | InterruptedException e) {
+  LOGGER.error(e, "Failed to delete " + loc);
+}
   }
 }
   }
 
   /**
+   * This method will be used to create dirs
+   * @param locations locations to create
+   */
+  public static void createLocations(String[] locations) {
+for (String loc : locations) {
+  if (new File(loc).mkdirs()) {
--- End diff --

Shouldn't it be `!new File(loc).mkdirs()`?
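
Whether the condition should be negated depends on the branch body, which is cut off in the diff above. As background, `File.mkdirs()` returns `true` only when this call created the directory; it returns `false` both on failure and when the directory already exists. A minimal sketch of a check that separates those two cases follows. `TempDirs` and `ensureDir` are hypothetical names for illustration, not the code in the PR.

```java
import java.io.File;
import java.io.IOException;

final class TempDirs {
  private TempDirs() { }

  /**
   * Returns true if the directory exists when the call returns.
   * mkdirs() alone cannot tell "already there" from "could not create",
   * so isDirectory() is checked as a fallback.
   */
  static boolean ensureDir(File dir) {
    return dir.mkdirs() || dir.isDirectory();
  }

  /** Creates every location and fails on the first one that cannot be created. */
  static void createLocations(String[] locations) throws IOException {
    for (String loc : locations) {
      if (!ensureDir(new File(loc))) {
        throw new IOException("Failed to create " + loc);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    File base = new File(System.getProperty("java.io.tmpdir"), "sort-temp-" + System.nanoTime());
    String[] locations = {
        new File(base, "disk1/tmp").getPath(),
        new File(base, "disk2/tmp").getPath()
    };
    createLocations(locations);
    createLocations(locations); // second call also succeeds: the directories already exist
    System.out.println(new File(locations[0]).isDirectory());
  }
}
```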



[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-26 Thread sraghunandan

Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1198#discussion_r129753676
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1296,6 +1296,18 @@
   public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
   "carbon.lease.recovery.retry.interval";
 
+  /**
+   * whether to use multi directories when loading data,
+   * the main purpose is to avoid single-disk-hot-spot
+   */
+  @CarbonProperty
+  public static final String CARBON_USE_MULTI_TEMP_DIR = 
"carbon.use.multiple.temp.dir";
+
+  /**
+   * default value for multi temp dir
+   */
+  public static final String CARBON_USING_MULTI_TEMP_DIR_DEFAULT = "false";
--- End diff --

change to match the above configuration
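
The diff above pairs a property key with a default-value constant. A minimal sketch of how such a pair might be read follows. It assumes plain `java.util.Properties` rather than CarbonData's own property class, and the name `CARBON_USE_MULTI_TEMP_DIR_DEFAULT` is only a guess at the rename the comment asks for; `MultiTempDirConfig` and `useMultiTempDir` are hypothetical.

```java
import java.util.Properties;

final class MultiTempDirConfig {
  /** Key copied from the diff above. */
  static final String CARBON_USE_MULTI_TEMP_DIR = "carbon.use.multiple.temp.dir";

  /** Assumed name: the default constant named after its key. */
  static final String CARBON_USE_MULTI_TEMP_DIR_DEFAULT = "false";

  private MultiTempDirConfig() { }

  /** Reads the flag, falling back to the default when the key is absent. */
  static boolean useMultiTempDir(Properties props) {
    String value = props.getProperty(CARBON_USE_MULTI_TEMP_DIR, CARBON_USE_MULTI_TEMP_DIR_DEFAULT);
    return Boolean.parseBoolean(value.trim());
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(useMultiTempDir(props));
    props.setProperty(CARBON_USE_MULTI_TEMP_DIR, "true");
    System.out.println(useMultiTempDir(props));
  }
}
```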



[jira] [Created] (CARBONDATA-1334) Delete Operation Hung in large dataset

2017-07-26 Thread sounak chakraborty (JIRA)

sounak chakraborty created CARBONDATA-1334:
--

 Summary: Delete Operation Hung in large dataset
 Key: CARBONDATA-1334
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1334
 Project: CarbonData
  Issue Type: Bug
Reporter: sounak chakraborty


The delete operation hangs on large datasets. Because of a wrong equals check
in DeleteDeltaBlockletDetails.java, multiple DeleteDeltaBlockDetails objects
are created (almost one object per delete offset). With this many objects, the
search cost becomes very high, which causes the hang.
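
The effect described above can be reproduced in a minimal sketch. The classes `BrokenKey` and `FixedKey` and their fields are hypothetical; the actual bug in DeleteDeltaBlockletDetails.java is not shown in this message, and an identity-only `equals` is just one way such a check can go wrong. The point is that when `equals` never matches, a "merge into existing entry" loop keeps every object and each lookup scans an ever-growing list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class DeleteDeltaEqualsDemo {
  /** Hypothetical key whose equals() is identity only, so two instances never match. */
  static final class BrokenKey {
    final String blockletId;
    final int pageId;

    BrokenKey(String blockletId, int pageId) {
      this.blockletId = blockletId;
      this.pageId = pageId;
    }

    @Override public boolean equals(Object o) {
      return this == o;
    }

    @Override public int hashCode() {
      return System.identityHashCode(this);
    }
  }

  /** Same fields, compared by value. */
  static final class FixedKey {
    final String blockletId;
    final int pageId;

    FixedKey(String blockletId, int pageId) {
      this.blockletId = blockletId;
      this.pageId = pageId;
    }

    @Override public boolean equals(Object o) {
      if (this == o) {
        return true;
      }
      if (!(o instanceof FixedKey)) {
        return false;
      }
      FixedKey other = (FixedKey) o;
      return pageId == other.pageId && blockletId.equals(other.blockletId);
    }

    @Override public int hashCode() {
      return Objects.hash(blockletId, pageId);
    }
  }

  /** Keeps one entry per distinct key; contains() relies on equals(). */
  static <K> List<K> mergeDistinct(List<K> keys) {
    List<K> distinct = new ArrayList<>();
    for (K key : keys) {
      if (!distinct.contains(key)) {
        distinct.add(key);
      }
    }
    return distinct;
  }

  public static void main(String[] args) {
    List<BrokenKey> broken = new ArrayList<>();
    List<FixedKey> fixed = new ArrayList<>();
    for (int offset = 0; offset < 1000; offset++) {
      broken.add(new BrokenKey("blocklet-0", 0));
      fixed.add(new FixedKey("blocklet-0", 0));
    }
    System.out.println(mergeDistinct(broken).size()); // 1000: one object per offset
    System.out.println(mergeDistinct(fixed).size());  // 1: all offsets share one entry
  }
}
```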




[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Failed with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3208/




[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/613/




[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3207/




[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1192
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/612/




[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread lionelcao

Github user lionelcao commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129744678
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -308,6 +308,10 @@
   @CarbonProperty
   public static final String NUM_CORES_COMPACTING = 
"carbon.number.of.cores.while.compacting";
   /**
+   * Number of cores to be used while alter partition
+   */
+  public static final String NUM_CORES_ALT_PARTITION = 
"carbon.number.of.cores.while.altPartition";
+  /**
--- End diff --

The other variables here have no blank line between them, so I kept the same style.



[GitHub] carbondata issue #1079: [CARBONDATA-1257] Measure Filter implementation

2017-07-26 Thread zzcclp

Github user zzcclp commented on the issue:

https://github.com/apache/carbondata/pull/1079
  
@sounakr @ravipesala any progress on this PR? It was merged onto
branch-1.1.



[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread lionelcao

Github user lionelcao commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129740890
  
--- Diff: 
examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonPartitionExample.scala
 ---
@@ -101,17 +126,40 @@ object CarbonPartitionExample {
 spark.sql("""
| CREATE TABLE IF NOT EXISTS t5
| (
+   | id Int,
| vin String,
| logdate Timestamp,
| phonenumber Long,
-   | area String
+   | area String,
+   | salary Int
|)
| PARTITIONED BY (country String)
| STORED BY 'carbondata'
| TBLPROPERTIES('PARTITION_TYPE'='LIST',
-   | 'LIST_INFO'='(China,United States),UK ,japan,(Canada,Russia), 
South Korea ')
+   | 'LIST_INFO'='(China, US),UK ,Japan,(Canada,Russia, Good, 
NotGood), Korea ')
--- End diff --

Hi @chenerlu, the inconsistent spacing in this DDL statement is deliberate: it
mocks a real situation that could occur in what customers write.
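
The `LIST_INFO` value in the diff above mixes parenthesized groups, single values and stray whitespace. A minimal sketch of a parser that tolerates such input follows. `ListInfoParser` is a hypothetical class written for this example; it is not CarbonData's parser, and the grouping semantics (one list partition per top-level item) are an assumption based on the example string.

```java
import java.util.ArrayList;
import java.util.List;

final class ListInfoParser {
  private ListInfoParser() { }

  /** Splits a LIST_INFO string into groups, trimming whitespace around every value. */
  static List<List<String>> parse(String listInfo) {
    List<List<String>> groups = new ArrayList<>();
    List<String> open = null; // non-null while inside parentheses
    StringBuilder token = new StringBuilder();
    for (char ch : listInfo.toCharArray()) {
      if (ch == '(') {
        if (open != null) {
          throw new IllegalArgumentException("Nested '(' in: " + listInfo);
        }
        open = new ArrayList<>();
      } else if (ch == ')') {
        if (open == null) {
          throw new IllegalArgumentException("Unmatched ')' in: " + listInfo);
        }
        flush(token, open);
        groups.add(open);
        open = null;
      } else if (ch == ',') {
        if (open != null) {
          flush(token, open);
        } else {
          flushSingle(token, groups);
        }
      } else {
        token.append(ch);
      }
    }
    if (open != null) {
      throw new IllegalArgumentException("Unclosed '(' in: " + listInfo);
    }
    flushSingle(token, groups);
    return groups;
  }

  /** Adds the trimmed token to target if it is not empty, then clears the buffer. */
  private static void flush(StringBuilder token, List<String> target) {
    String value = token.toString().trim();
    token.setLength(0);
    if (!value.isEmpty()) {
      target.add(value);
    }
  }

  /** Adds the trimmed token as its own single-value group if it is not empty. */
  private static void flushSingle(StringBuilder token, List<List<String>> groups) {
    List<String> single = new ArrayList<>();
    flush(token, single);
    if (!single.isEmpty()) {
      groups.add(single);
    }
  }

  public static void main(String[] args) {
    System.out.println(parse("(China, US),UK ,Japan,(Canada,Russia, Good, NotGood), Korea "));
    // prints [[China, US], [UK], [Japan], [Canada, Russia, Good, NotGood], [Korea]]
  }
}
```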



[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3206/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/611/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
retest this please



[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/610/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3205/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
retest this please



[GitHub] carbondata pull request #1197: [CARBONDATA-1238] Decouple the datatype conve...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1197#discussion_r129736254
  
--- Diff: 
integration/spark-common/src/main/java/org/apache/carbondata/spark/util/SparkDataTypeConverterImp.java
 ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.util;
+
+import java.io.Serializable;
+
+import org.apache.carbondata.core.util.DataTypeConverter;
+
+import org.apache.spark.unsafe.types.UTF8String;
+
+/**
+ * Convert java data type to spark data type
+ */
+public final class SparkDataTypeConverterImp implements DataTypeConverter, 
Serializable {
--- End diff --

ok



[GitHub] carbondata pull request #1197: [CARBONDATA-1238] Decouple the datatype conve...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1197#discussion_r129736197
  
--- Diff: 
hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
@@ -768,29 +818,6 @@ public QueryModel getQueryModel(InputSplit inputSplit, 
TaskAttemptContext taskAt
 return queryModel;
   }
 
-  public CarbonReadSupport getReadSupportClass(Configuration 
configuration) {
--- End diff --

No code change here; I just moved the "set" and "get" methods next to each
other. For example, setFilterPredicates and getFilterPredicates now sit together.



[GitHub] carbondata issue #1195: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread xuchuanyin

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1195
  
@chenliang613 This PR picked up commits from other contributors through an
incorrect rebase, so I closed it and opened a new one, #1198.



[GitHub] carbondata pull request #943: [CARBONDATA-1086]Added documentation for BATCH...

2017-07-26 Thread vandana7

Github user vandana7 closed the pull request at:

https://github.com/apache/carbondata/pull/943



[GitHub] carbondata issue #943: [CARBONDATA-1086]Added documentation for BATCH SORT S...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/943
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/609/




[GitHub] carbondata issue #943: [CARBONDATA-1086]Added documentation for BATCH SORT S...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/943
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3204/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3203/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/608/




[GitHub] carbondata issue #1196: Rebase datamap onto master

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1196
  
Build Failed with Spark 2.1.0, Please check CI
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3202/




[GitHub] carbondata issue #1196: Rebase datamap onto master

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1196
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/607/




[jira] [Commented] (CARBONDATA-1333) Fix Coverity_Fortify issue

2017-07-26 Thread Kushal Sah (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101782#comment-16101782
 ] 

Kushal Sah commented on CARBONDATA-1333:


Can this issue be assigned to me?

> Fix Coverity_Fortify issue
> --
>
> Key: CARBONDATA-1333
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1333
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Kushal Sah
>Priority: Minor
>
> Fixed coverity and fortify issue detected by the codex tool




[jira] [Created] (CARBONDATA-1333) Fix Coverity_Fortify issue

2017-07-26 Thread Kushal Sah (JIRA)

Kushal Sah created CARBONDATA-1333:
--

 Summary: Fix Coverity_Fortify issue
 Key: CARBONDATA-1333
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1333
 Project: CarbonData
  Issue Type: Improvement
Reporter: Kushal Sah
Priority: Minor


Fixed coverity and fortify issue detected by the codex tool




[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129598686
  
--- Diff: 
hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
@@ -440,9 +510,16 @@ protected Expression getFilterPredicates(Configuration 
configuration) {
 for (Map.Entry entry :
 segmentIndexMap.entrySet()) {
   SegmentTaskIndexStore.TaskBucketHolder taskHolder = 
entry.getKey();
-  int taskId = 
CarbonTablePath.DataFileUtil.getTaskIdFromTaskNo(taskHolder.taskNo);
+  int partitionId = 
CarbonTablePath.DataFileUtil.getTaskIdFromTaskNo(taskHolder.taskNo);
+  //oldPartitionIdList is only used in alter table partition 
command because it change
+  //partition info first and then read data.
+  //for other normal query should use newest partitionIdList
--- End diff --

Use /** */ instead for multi-line comments.



[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129598531
  
--- Diff: 
hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
@@ -440,9 +510,16 @@ protected Expression getFilterPredicates(Configuration 
configuration) {
 for (Map.Entry entry :
 segmentIndexMap.entrySet()) {
   SegmentTaskIndexStore.TaskBucketHolder taskHolder = 
entry.getKey();
-  int taskId = 
CarbonTablePath.DataFileUtil.getTaskIdFromTaskNo(taskHolder.taskNo);
+  int partitionId = 
CarbonTablePath.DataFileUtil.getTaskIdFromTaskNo(taskHolder.taskNo);
+  //oldPartitionIdList is only used in alter table partition 
command because it change
+  //partition info first and then read data.
+  //for other normal query should use newest partitionIdList
--- End diff --

Use /** */ instead.



[GitHub] carbondata pull request #1197: [CARBONDATA-1238] Decouple the datatype conve...

2017-07-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1197#discussion_r129597130
  
--- Diff: 
integration/spark-common/src/main/java/org/apache/carbondata/spark/util/SparkDataTypeConverterImp.java
 ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.util;
+
+import java.io.Serializable;
+
+import org.apache.carbondata.core.util.DataTypeConverter;
+
+import org.apache.spark.unsafe.types.UTF8String;
+
+/**
+ * Convert java data type to spark data type
+ */
+public final class SparkDataTypeConverterImp implements DataTypeConverter, 
Serializable {
--- End diff --

typo: this should be `SparkDataTypeConverterImpl`


---

[GitHub] carbondata pull request #1197: [CARBONDATA-1238] Decouple the datatype conve...

2017-07-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1197#discussion_r129596893
  
--- Diff: 
hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
@@ -768,29 +818,6 @@ public QueryModel getQueryModel(InputSplit inputSplit, 
TaskAttemptContext taskAt
 return queryModel;
   }
 
-  public CarbonReadSupport getReadSupportClass(Configuration 
configuration) {
--- End diff --

Is there any change to this method?


---

[GitHub] carbondata issue #1196: Rebase datamap onto master

2017-07-26 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1196
  
retest this please


---

[GitHub] carbondata pull request #1193: [CARBONDATA-1327] Add carbon sort column exam...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1193#discussion_r129592396
  
--- Diff: 
examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSortColumnsExample.scala
 ---
@@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.examples
+
+import java.io.File
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+object CarbonSortColumnsExample {
+
+  def main(args: Array[String]) {
+val rootPath = new File(this.getClass.getResource("/").getPath
++ "../../../..").getCanonicalPath
+val storeLocation = s"$rootPath/examples/spark2/target/store"
+val warehouse = s"$rootPath/examples/spark2/target/warehouse"
+val metastoredb = s"$rootPath/examples/spark2/target"
+
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, 
"/MM/dd HH:mm:ss")
+  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd")
+
+import org.apache.spark.sql.CarbonSession._
+val spark = SparkSession
+  .builder()
+  .master("local")
+  .appName("CarbonSortColumnsExample")
+  .config("spark.sql.warehouse.dir", warehouse)
+  .config("spark.driver.host", "localhost")
+  .getOrCreateCarbonSession(storeLocation, metastoredb)
+
+spark.sparkContext.setLogLevel("WARN")
+
+spark.sql("DROP TABLE IF EXISTS sort_columns_table")
+
+// Create table with no sort columns
+spark.sql(
+  s"""
+ | CREATE TABLE no_sort_columns_table(
+ | shortField SHORT,
+ | intField INT,
+ | bigintField LONG,
+ | doubleField DOUBLE,
+ | stringField STRING,
+ | timestampField TIMESTAMP,
+ | decimalField DECIMAL(18,2),
+ | dateField DATE,
+ | charField CHAR(5),
+ | floatField FLOAT,
+ | complexData ARRAY
+ | )
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='')
+   """.stripMargin)
+
+// Create table with sort columns
+// Currently sort_column don't support "FLOAD, DOUBLE, DECIMAL"
+// but can support other numeric type(like: INT, LONG)
--- End diff --

How about changing the comments below
from
// Currently sort_column don't support "FLOAD, DOUBLE, DECIMAL"
// but can support other numeric type(like: INT, LONG)
to
// You can specify any columns as sort columns for building the MDX index.
// Remark: currently sort columns don't support "FLOAT, DOUBLE, DECIMAL"


---

[jira] [Resolved] (CARBONDATA-1323) Presto Performace Improvement at Integration Layer

2017-07-26 Thread Liang Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang Chen resolved CARBONDATA-1323.

   Resolution: Fixed
 Assignee: Bhavya Aggarwal
Fix Version/s: 1.2.0

> Presto Performace Improvement at Integration Layer
> --
>
> Key: CARBONDATA-1323
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1323
> Project: CarbonData
>  Issue Type: Improvement
>  Components: presto-integration
>Affects Versions: 1.2.0
>Reporter: Bhavya Aggarwal
>Assignee: Bhavya Aggarwal
> Fix For: 1.2.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Presto Performace Improvement 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[GitHub] carbondata pull request #1190: [CARBONDATA-1323] Presto Optimization for Int...

2017-07-26 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1190


---

[GitHub] carbondata pull request #1190: [CARBONDATA-1323] Presto Optimization for Int...

2017-07-26 Thread bhavya411

Github user bhavya411 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1190#discussion_r129588108
  
--- Diff: integration/presto/pom.xml ---
@@ -228,6 +228,33 @@
 true
   
 
+
+  org.scala-tools
+  maven-scala-plugin
--- End diff --

I have written the dictionary decoding in Scala because it is better optimized
and easier to understand, so we need to add this plugin to compile the Scala
code.


---

[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/606/



---

[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3201/



---

[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/605/



---

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129584623
  
--- Diff: 
examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonPartitionExample.scala
 ---
@@ -101,17 +126,40 @@ object CarbonPartitionExample {
 spark.sql("""
| CREATE TABLE IF NOT EXISTS t5
| (
+   | id Int,
| vin String,
| logdate Timestamp,
| phonenumber Long,
-   | area String
+   | area String,
+   | salary Int
|)
| PARTITIONED BY (country String)
| STORED BY 'carbondata'
| TBLPROPERTIES('PARTITION_TYPE'='LIST',
-   | 'LIST_INFO'='(China,United States),UK ,japan,(Canada,Russia), 
South Korea ')
+   | 'LIST_INFO'='(China, US),UK ,Japan,(Canada,Russia, Good, 
NotGood), Korea ')
--- End diff --

add space before ,


---

[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3200/



---

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129583765
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.result.iterator;
+
+import org.apache.carbondata.common.CarbonIterator;
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.scan.result.BatchResult;
+
+
+public class PartitionSpliterRawResultIterator extends 
CarbonIterator {
+
+  private CarbonIterator iterator;
+  private BatchResult batch;
+  private int counter;
+
+  /**
+   * LOGGER
+   */
+  private static final LogService LOGGER =
+  
LogServiceFactory.getLogService(PartitionSpliterRawResultIterator.class.getName());
+
+  public PartitionSpliterRawResultIterator(CarbonIterator 
iterator) {
+this.iterator = iterator;
+  }
+
+
+  @Override public boolean hasNext() {
+if (null == batch || checkBatchEnd(batch)) {
+  if (iterator.hasNext()) {
+batch = iterator.next();
+counter = 0;
+  } else {
+return false;
+  }
+}
+
+if (!checkBatchEnd(batch)) {
+  return true;
+} else {
+  return false;
+}
+  }
+
+  @Override public Object[] next() {
+if (batch == null) {
+  batch = iterator.next();
+}
+if (!checkBatchEnd(batch)) {
+  try {
+return batch.getRawRow(counter++);
+  } catch (Exception e) {
+LOGGER.error(e.getMessage());
+return null;
+  }
+} else {
+  batch = iterator.next();
+  counter = 0;
+}
+try {
+  return batch.getRawRow(counter++);
+} catch (Exception e) {
+  LOGGER.error(e.getMessage());
+  return null;
--- End diff --

This logic can be optimized.
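
One possible shape for that cleanup is sketched below. It is a self-contained stand-in, not the PR's code: `BatchRowIterator` and its `List<T>` batches are hypothetical substitutes for `PartitionSpliterRawResultIterator` and `BatchResult`. The idea is to let `hasNext()` do all of the batch advancing, so that `next()` has a single return path and no duplicated try/catch.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in: flattens an iterator of batches into an iterator of rows.
final class BatchRowIterator<T> implements Iterator<T> {
  private final Iterator<List<T>> batches;
  private List<T> batch;   // current batch, null before the first call
  private int counter;     // index of the next row inside the current batch

  BatchRowIterator(Iterator<List<T>> batches) {
    this.batches = batches;
  }

  // True when there is no current batch or all of its rows were consumed.
  private boolean batchEnded() {
    return batch == null || counter >= batch.size();
  }

  @Override public boolean hasNext() {
    // Advance until a batch with unread rows is found; this also skips empty batches.
    while (batchEnded()) {
      if (!batches.hasNext()) {
        return false;
      }
      batch = batches.next();
      counter = 0;
    }
    return true;
  }

  @Override public T next() {
    // Reuse hasNext() so the batch-advancing logic lives in one place.
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return batch.get(counter++);
  }
}
```

Throwing `NoSuchElementException` at the end follows the `java.util.Iterator` contract; whether the real iterator should throw or keep returning `null` on a failed row read is a decision for the PR author.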


---

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129583246
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.result.iterator;
+
+import org.apache.carbondata.common.CarbonIterator;
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.scan.result.BatchResult;
+
+
+public class PartitionSpliterRawResultIterator extends 
CarbonIterator {
+
+  private CarbonIterator iterator;
+  private BatchResult batch;
+  private int counter;
+
+  /**
+   * LOGGER
+   */
+  private static final LogService LOGGER =
+  
LogServiceFactory.getLogService(PartitionSpliterRawResultIterator.class.getName());
+
+  public PartitionSpliterRawResultIterator(CarbonIterator 
iterator) {
+this.iterator = iterator;
+  }
+
+
+  @Override public boolean hasNext() {
+if (null == batch || checkBatchEnd(batch)) {
+  if (iterator.hasNext()) {
+batch = iterator.next();
+counter = 0;
+  } else {
+return false;
+  }
+}
+
+if (!checkBatchEnd(batch)) {
+  return true;
+} else {
+  return false;
+}
--- End diff --

Use `return !checkBatchEnd(batch);` instead.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread lionelcao

Github user lionelcao commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129583064
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/metadata/schema/PartitionInfo.java
 ---
@@ -65,6 +65,31 @@ public PartitionInfo(List 
columnSchemaList, PartitionType partitio
 this.partitionIds = new ArrayList<>();
   }
 
+  /**
+   * add partition means split default partition, add in last directly
--- End diff --

Because the default partition may already contain data that needs to be moved
into the new partition.


---

[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread xuchuanyin

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
There is no useful information in the compilation message.

retest this please


---

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129582365
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.result.iterator;
+
+import org.apache.carbondata.common.CarbonIterator;
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.scan.result.BatchResult;
+
+
--- End diff --

Delete the extra blank line.


---

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129582258
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.result.iterator;
+
+import org.apache.carbondata.common.CarbonIterator;
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.scan.result.BatchResult;
+
+
+public class PartitionSpliterRawResultIterator extends 
CarbonIterator {
+
+  private CarbonIterator iterator;
+  private BatchResult batch;
+  private int counter;
+
+  /**
+   * LOGGER
+   */
--- End diff --

I think this is not necessary.


---

[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129582111
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/result/iterator/PartitionSpliterRawResultIterator.java
 ---
@@ -0,0 +1,95 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.result.iterator;
+
+import org.apache.carbondata.common.CarbonIterator;
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.scan.result.BatchResult;
+
+
+public class PartitionSpliterRawResultIterator extends 
CarbonIterator {
+
+  private CarbonIterator iterator;
+  private BatchResult batch;
+  private int counter;
+
+  /**
+   * LOGGER
+   */
+  private static final LogService LOGGER =
+  
LogServiceFactory.getLogService(PartitionSpliterRawResultIterator.class.getName());
+
+  public PartitionSpliterRawResultIterator(CarbonIterator 
iterator) {
+this.iterator = iterator;
+  }
+
+
--- End diff --

Delete the unnecessary blank line.


---

[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-26 Thread xuchuanyin

GitHub user xuchuanyin reopened a pull request:

https://github.com/apache/carbondata/pull/1198

[CARBONDATA-1281] Support multiple temp dirs for writing files while 
loading data

# Modifications
This feature mainly focuses on avoiding disk hot-spots during a single massive
data load. Changes are made in two parts:

1. randomly choose a yarn local folder each time a sort temp file is written
in the sort-process;

2. randomly choose a yarn local folder each time a carbondata file is written
in the write-process.
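
The random choice described in the two steps above can be sketched roughly as follows. This is only an illustration under assumptions, not the code in this PR: `TempDirChooser`, `chooseRandom`, and `newTempFile` are hypothetical names, and the candidate directories are passed in as a plain `String[]`.

```java
import java.io.File;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: spreads temp files over several local directories.
final class TempDirChooser {
  private TempDirChooser() {}

  /** Picks one of the candidate directories uniformly at random. */
  static String chooseRandom(String[] dirs) {
    if (dirs == null || dirs.length == 0) {
      throw new IllegalArgumentException("no temp dirs configured");
    }
    return dirs[ThreadLocalRandom.current().nextInt(dirs.length)];
  }

  /** Builds the path of a new temp file under a randomly chosen directory. */
  static File newTempFile(String[] dirs, String fileName) {
    return new File(chooseRandom(dirs), fileName);
  }
}
```

Calling `newTempFile` once per file, rather than fixing one directory per task, is what spreads the writes across disks.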

# Usage

To enable this feature, set `carbon.use.multi.temp.dir=true` and
`carbon.use.local.dir=true`.

# Performance
In my case, this feature improved the loading performance from 35M/s/node
to 70+M/s/node.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuchuanyin/carbondata new_feature_mtd4l

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1198.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1198


commit 46da65a1a0579c62a7f4196ae622f83dd5197e3a
Author: xuchuanyin 
Date:   2017-07-25T11:17:53Z

Support multiple temp dirs for writing files while loading data

randomly choose a dir to write sort temp files

randomly choose a dir to write carbondata files

Fix errors in spelling

optimize default value for using multiple temp dir

update document for multiple temp dirs feature

update property name

(cherry picked from commit 71ab293ef8d2ff24a122bb074b7b95bca8c1b77e)

commit 6e35dec70196a12aaac24a69c795d3597f946386
Author: xuchuanyin 
Date:   2017-07-25T11:20:32Z

Add tests for multiple temp dirs during data loading

Fix bugs in tests

remove header in test data

remove useless comment

remove added useless testdata

update data source for tests

(cherry picked from commit ee355b78c0d703d5bc2d2767837c32b6cc422361)

commit 3e633070c3f793867c03ba350048994ced0e5527
Author: xuchuanyin 
Date:   2017-07-25T12:28:17Z

resolve review comments

+ update documents
+ update parameter name
+ optimize code to avoid duplicate lines

commit 9f746178600d7c16267bd0276b8a492f69871802
Author: xuchuanyin 
Date:   2017-07-25T12:42:35Z

fix checkstyle error




---

[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Can one of the admins verify this patch?


---

[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-26 Thread xuchuanyin

Github user xuchuanyin closed the pull request at:

https://github.com/apache/carbondata/pull/1198


---

[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3199/



---

[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/604/



---

[GitHub] carbondata issue #1195: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1195
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3198/



---

[GitHub] carbondata issue #1195: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1195
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/603/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread xuchuanyin

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
I created this PR and closed #1195



[GitHub] carbondata issue #1195: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread xuchuanyin

Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1195
  
@chenliang613 
Sorry for adding irrelevant commits to this PR through an incorrect rebase. 
:disappointed: 

I've created a new PR #1198 for this.



[jira] [Created] (CARBONDATA-1332) Dictionary generation time in spark 2.1 is more than spark 1.5

2017-07-26 Thread Venkata Ramana G (JIRA)

Venkata Ramana G created CARBONDATA-1332:


 Summary: Dictionary generation time in spark 2.1 is more than spark 1.5
 Key: CARBONDATA-1332
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1332
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 1.1.1
Reporter: Venkata Ramana G
 Fix For: 1.2.0






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[GitHub] carbondata issue #1193: [CARBONDATA-1327] Add carbon sort column examples

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1193
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/602/




[GitHub] carbondata issue #1193: [CARBONDATA-1327] Add carbon sort column examples

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1193
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3197/




[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Can one of the admins verify this patch?



[GitHub] carbondata pull request #1198: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-26 Thread xuchuanyin

GitHub user xuchuanyin opened a pull request:

https://github.com/apache/carbondata/pull/1198

[CARBONDATA-1281] Support multiple temp dirs for writing files while 
loading data

# Modifications
This feature mainly focuses on avoiding disk hot spots during a single massive 
data load. The changes are in two parts: 

1. randomly choose a YARN local folder each time a sort temp file is written 
in the sort process;

2. randomly choose a YARN local folder each time a carbondata file is written 
in the write process.
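
The two steps above can be sketched roughly as follows. This is only an illustration of random directory selection, not the code in this PR: the class `TempDirChooser` and its method `next()` are hypothetical names, and the directory list is assumed to come from the configured YARN local folders.

```java
import java.util.Random;

/** Hypothetical helper (not from the PR): picks one of several temp dirs at random. */
final class TempDirChooser {
  private final String[] dirs;
  private final Random random;

  TempDirChooser(String[] dirs, Random random) {
    if (dirs == null || dirs.length == 0) {
      throw new IllegalArgumentException("at least one temp dir is required");
    }
    // Defensive copy so later changes to the caller's array do not affect us.
    this.dirs = dirs.clone();
    this.random = random;
  }

  /** Returns the directory to use for the next temp file or carbondata file. */
  String next() {
    return dirs[random.nextInt(dirs.length)];
  }
}
```

Calling `next()` once per file spreads writes across all configured disks over time, which is the effect the description attributes to the feature.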

# Usage

To enable this feature, set `carbon.use.multi.temp.dir=true` 
and `carbon.use.local.dir=true`.

# Performance
In my case, this feature improves the loading performance from 35M/s/node 
to 70+M/s/node


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuchuanyin/carbondata new_feature_mtd4l

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1198.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1198


commit 46da65a1a0579c62a7f4196ae622f83dd5197e3a
Author: xuchuanyin 
Date:   2017-07-25T11:17:53Z

Support multiple temp dirs for writing files while loading data

randomly choose a dir to write sort temp files

randomly choose a dir to write carbondata files

Fix errors in spelling

optimize default value for using multiple temp dir

update document for multiple temp dirs feature

update property name

(cherry picked from commit 71ab293ef8d2ff24a122bb074b7b95bca8c1b77e)

commit 6e35dec70196a12aaac24a69c795d3597f946386
Author: xuchuanyin 
Date:   2017-07-25T11:20:32Z

Add tests for multiple temp dirs during data loading

Fix bugs in tests

remove header in test data

remove useless comment

remove added useless testdata

update data source for tests

(cherry picked from commit ee355b78c0d703d5bc2d2767837c32b6cc422361)

commit 3e633070c3f793867c03ba350048994ced0e5527
Author: xuchuanyin 
Date:   2017-07-25T12:28:17Z

resolve review comments

+ update documents
+ update parameter name
+ optimize code to avoid duplicate lines

commit 9f746178600d7c16267bd0276b8a492f69871802
Author: xuchuanyin 
Date:   2017-07-25T12:42:35Z

fix checkstyle error





[GitHub] carbondata issue #1198: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1198
  
Can one of the admins verify this patch?



[jira] [Created] (CARBONDATA-1331) Fixed failing test cases

2017-07-26 Thread Mohammad Shahid Khan (JIRA)

Mohammad Shahid Khan created CARBONDATA-1331:


 Summary: Fixed failing test cases
 Key: CARBONDATA-1331
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1331
 Project: CarbonData
  Issue Type: Bug
Reporter: Mohammad Shahid Khan
Priority: Minor







[GitHub] carbondata pull request #1195: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-26 Thread xuchuanyin

Github user xuchuanyin closed the pull request at:

https://github.com/apache/carbondata/pull/1195



[GitHub] carbondata issue #1193: [CARBONDATA-1327] Add carbon sort column examples

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1193
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/601/




[GitHub] carbondata issue #1193: [CARBONDATA-1327] Add carbon sort column examples

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1193
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3196/




[GitHub] carbondata pull request #1193: [CARBONDATA-1327] Add carbon sort column exam...

2017-07-26 Thread mayunSaicmotor

Github user mayunSaicmotor commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1193#discussion_r129566524
  
--- Diff: 
examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSortColumnsExample.scala
 ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.examples
+
+import java.io.File
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+object CarbonSortColumnsExample {
+
+  def main(args: Array[String]) {
+val rootPath = new File(this.getClass.getResource("/").getPath
++ "../../../..").getCanonicalPath
+val storeLocation = s"$rootPath/examples/spark2/target/store"
+val warehouse = s"$rootPath/examples/spark2/target/warehouse"
+val metastoredb = s"$rootPath/examples/spark2/target"
+
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd HH:mm:ss")
+  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
+  .addProperty(CarbonCommonConstants.ENABLE_UNSAFE_COLUMN_PAGE_LOADING, "true")
+
+import org.apache.spark.sql.CarbonSession._
+val spark = SparkSession
+  .builder()
+  .master("local")
+  .appName("CarbonSortColumnsExample")
+  .config("spark.sql.warehouse.dir", warehouse)
+  .config("spark.driver.host", "localhost")
+  .getOrCreateCarbonSession(storeLocation, metastoredb)
+
+spark.sparkContext.setLogLevel("WARN")
+
+spark.sql("DROP TABLE IF EXISTS sort_columns_table")
+
+// Create table with no sort columns
+spark.sql(
+  s"""
+ | CREATE TABLE no_sort_columns_table(
+ | shortField SHORT,
+ | intField INT,
+ | bigintField LONG,
+ | doubleField DOUBLE,
+ | stringField STRING,
+ | timestampField TIMESTAMP,
+ | decimalField DECIMAL(18,2),
+ | dateField DATE,
+ | charField CHAR(5),
+ | floatField FLOAT,
+ | complexData ARRAY<STRING>
+ | )
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='')
+   """.stripMargin)
+
+// Create table with sort columns
+spark.sql(
+  s"""
+ | CREATE TABLE sort_columns_table(
+ | shortField SHORT,
+ | intField INT,
+ | bigintField LONG,
+ | doubleField DOUBLE,
+ | stringField STRING,
+ | timestampField TIMESTAMP,
+ | decimalField DECIMAL(18,2),
+ | dateField DATE,
+ | charField CHAR(5),
+ | floatField FLOAT,
+ | complexData ARRAY<STRING>
+ | )
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='intField, stringField, charField')
+   """.stripMargin)
+
+val path = s"$rootPath/examples/spark2/src/main/resources/data.csv"
+
+// scalastyle:off
+spark.sql(
+  s"""
+ | LOAD DATA LOCAL INPATH '$path'
+ | INTO TABLE no_sort_columns_table
+ | OPTIONS('FILEHEADER'='shortField,intField,bigintField,doubleField,stringField,timestampField,decimalField,dateField,charField,floatField,complexData',
+ | 'COMPLEX_DELIMITER_LEVEL_1'='#')
--- End diff --

added comments in line 74 and line 75



[GitHub] carbondata issue #1193: [CARBONDATA-1327] Add carbon sort column examples

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1193
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3195/




[GitHub] carbondata issue #1193: [CARBONDATA-1327] Add carbon sort column examples

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1193
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/600/




[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/599/




[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3194/




[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129534273
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/metadata/schema/PartitionInfo.java
 ---
@@ -65,6 +65,31 @@ public PartitionInfo(List<ColumnSchema> columnSchemaList, PartitionType partitio
 this.partitionIds = new ArrayList<>();
   }
 
+  /**
+   * add partition means split default partition, add in last directly
--- End diff --

The default partition is 0, so why split the partition?



[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenerlu

Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129533186
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -308,6 +308,10 @@
   @CarbonProperty
   public static final String NUM_CORES_COMPACTING = "carbon.number.of.cores.while.compacting";
   /**
+   * Number of cores to be used while alter partition
+   */
+  public static final String NUM_CORES_ALT_PARTITION = "carbon.number.of.cores.while.altPartition";
+  /**
--- End diff --

Add a blank line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Failed with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/598/




[GitHub] carbondata issue #1110: [CARBONDATA-1238] Decouple the datatype convert in c...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1110
  
Raised a new PR, #1197; closing the old one.



[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3193/




[GitHub] carbondata pull request #1110: [CARBONDATA-1238] Decouple the datatype conve...

2017-07-26 Thread chenliang613

Github user chenliang613 closed the pull request at:

https://github.com/apache/carbondata/pull/1110



[GitHub] carbondata issue #1197: [CARBONDATA-1238] Decouple the datatype convert from...

2017-07-26 Thread asfgit

Github user asfgit commented on the issue:

https://github.com/apache/carbondata/pull/1197
  
Can one of the admins verify this patch?



[GitHub] carbondata pull request #1197: [CARBONDATA-1238] Decouple the datatype conve...

2017-07-26 Thread chenliang613

GitHub user chenliang613 opened a pull request:

https://github.com/apache/carbondata/pull/1197

[CARBONDATA-1238] Decouple the datatype convert from Spark code in core 
module

Decouple the datatype convert from Spark code in core module.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chenliang613/carbondata decouple_sparkcode

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1197.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1197


commit 45e1d4c6cdf131743d0558e302ecd77a1aa9ef32
Author: chenliang613 
Date:   2017-06-28T15:45:50Z

[CARBONDATA-1238] Decouple the datatype convert from Spark code in core 
module

[CARBONDATA-1238] Decouple the datatype convert from Spark code in core 
module





[GitHub] carbondata pull request #1190: [CARBONDATA-1323] Presto Optimization for Int...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1190#discussion_r129528008
  
--- Diff: integration/presto/pom.xml ---
@@ -228,6 +228,33 @@
 true
   
 
+
+  <groupId>org.scala-tools</groupId>
+  <artifactId>maven-scala-plugin</artifactId>
--- End diff --

Can you explain why this plugin needs to be added to the pom file?



[GitHub] carbondata issue #1195: [CARBONDATA-1281] Support multiple temp dirs for wri...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1195
  
@xuchuanyin Everything looks OK; please rebase.



[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread lionelcao

Github user lionelcao commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129516143
  
--- Diff: 
integration/spark-common-test/src/test/resources/partition_data.csv ---
@@ -0,0 +1,27 @@
+id,vin,logdate,phonenumber,country,area,salary
--- End diff --

Oh, this file is copied from the example package. Maybe I can remove the 
duplicate and keep only one copy.



[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread lionelcao

Github user lionelcao commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129515426
  
--- Diff: conf/carbon.properties.template ---
@@ -42,6 +42,9 @@ carbon.enableXXHash=true
 #carbon.max.level.cache.size=-1
 #enable prefetch of data during merge sort while reading data from sort temp files in data loading
 #carbon.merge.sort.prefetch=true
+ Alter Partition Configuration 
+#Number of cores to be used while alter partition
+carbon.number.of.cores.while.altPartition=2
--- End diff --

Yes, it is used when operating on multiple segments in parallel. 
This configuration lets users set the number of threads according to their 
hardware.
Sure, I will make the change.



[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread lionelcao

Github user lionelcao commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129514953
  
--- Diff: 
integration/spark-common-test/src/test/resources/partition_data.csv ---
@@ -0,0 +1,27 @@
+id,vin,logdate,phonenumber,country,area,salary
--- End diff --

Hi @chenliang613, this CSV data already exists for the partition example and 
test case. It is simple and makes the partition concept easy to understand. 
This PR just adds two columns.



[GitHub] carbondata issue #1192: [CARBONDATA-940] alter table add/split partition for...

2017-07-26 Thread lionelcao

Github user lionelcao commented on the issue:

https://github.com/apache/carbondata/pull/1192
  
# Feature Description
This feature adds support for the ADD and SPLIT partition operations in CarbonData.
# Scope
Supports range-partitioned and list-partitioned tables
# Syntax Example
Suppose one carbon table is list partitioned on COUNTRY column.
Current partition definition is ('China', 'US', 'UK', 'India', 'Canada, 
Japan, South Korea, North Korea')
### add a partition
ALTER TABLE t1 ADD PARTITION('Russia')
### split a partition
ALTER TABLE t1 SPLIT PARTITION(5) INTO ('Canada', 'Japan', '(South Korea, 
North Korea)')

# Modification
### parser
added new parser to support alter table add/split partition statement
### validate new RangeInfo and ListInfo
ensure the new rangeInfo after adding/splitting is in the correct order
ensure the newly added listInfo does not already exist
ensure the target listInfo can be split
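
The three checks above can be sketched roughly as follows. This is an illustration under assumptions, not the validation code in this PR: the class `PartitionChangeValidator` and its methods are hypothetical, range bounds are assumed to be comparable as `Long` values, and a list partition is modelled as a list of value groups.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical validation helpers; these are not CarbonData's actual classes. */
final class PartitionChangeValidator {

  /** Range bounds must stay strictly ascending after an add or a split. */
  static boolean isAscending(List<Long> rangeBounds) {
    for (int i = 1; i < rangeBounds.size(); i++) {
      if (rangeBounds.get(i - 1) >= rangeBounds.get(i)) {
        return false;
      }
    }
    return true;
  }

  /** A list value being added must not already belong to any partition. */
  static boolean canAdd(List<List<String>> listInfo, String newValue) {
    for (List<String> group : listInfo) {
      if (group.contains(newValue)) {
        return false;
      }
    }
    return true;
  }

  /** A split is valid when the new groups hold exactly the target's values, each once. */
  static boolean canSplit(List<String> target, List<List<String>> newGroups) {
    if (newGroups.size() < 2) {
      return false; // splitting into a single group changes nothing
    }
    List<String> flattened = new ArrayList<>();
    for (List<String> group : newGroups) {
      flattened.addAll(group);
    }
    Set<String> unique = new HashSet<>(flattened);
    return unique.size() == flattened.size()
        && flattened.size() == target.size()
        && unique.equals(new HashSet<>(target));
  }
}
```

With the syntax example above, adding 'Russia' passes `canAdd`, and splitting the group 'Canada, Japan, South Korea, North Korea' into 'Canada', 'Japan' and '(South Korea, North Korea)' passes `canSplit`.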
### read target partition data
add function to read data in one segment and one partition
### use ALTER_PARTITION as the key for temp directories
add isAltPartitionFlow in getTempStoreLocationKey function
### repartition and write data
decode the partition column and repartition
write to new data blocks
### refresh cache
drop old cache
### multi threads operation in different segments
support changing multiple segments in parallel.
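
A rough sketch of how per-segment work could be run in parallel with a configurable thread count. This is an assumption-laden illustration, not the PR's implementation: `ParallelSegmentRunner` and its methods are hypothetical, a plain `java.util.Properties` stands in for the CarbonData configuration, and the key name and default of 2 are taken from the `carbon.properties.template` change in this PR (the name is still under review).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical runner: applies the same work to each segment on a fixed thread pool. */
final class ParallelSegmentRunner {
  // Key and default as they appear in the diff under review; the name may change.
  static final String NUM_CORES_KEY = "carbon.number.of.cores.while.altPartition";
  static final int DEFAULT_CORES = 2;

  /** Reads the thread count, falling back to the default on missing or invalid values. */
  static int configuredCores(Properties props) {
    String raw = props.getProperty(NUM_CORES_KEY, String.valueOf(DEFAULT_CORES));
    try {
      int cores = Integer.parseInt(raw.trim());
      return cores > 0 ? cores : DEFAULT_CORES;
    } catch (NumberFormatException e) {
      return DEFAULT_CORES;
    }
  }

  /** Runs the work once per segment and returns results in segment order. */
  static <R> List<R> runPerSegment(List<String> segmentIds, int cores, Function<String, R> work) {
    ExecutorService pool = Executors.newFixedThreadPool(cores);
    try {
      List<Future<R>> futures = new ArrayList<>();
      for (String segmentId : segmentIds) {
        Callable<R> task = () -> work.apply(segmentId);
        futures.add(pool.submit(task));
      }
      List<R> results = new ArrayList<>();
      for (Future<R> future : futures) {
        results.add(future.get()); // blocks until that segment is done
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while altering partitions", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("segment task failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

A fixed pool keeps the number of concurrent segment rewrites bounded by the configured value, so users can match it to their hardware.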



[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129514137
  
--- Diff: conf/carbon.properties.template ---
@@ -42,6 +42,9 @@ carbon.enableXXHash=true
 #carbon.max.level.cache.size=-1
 #enable prefetch of data during merge sort while reading data from sort temp files in data loading
 #carbon.merge.sort.prefetch=true
+ Alter Partition Configuration 
+#Number of cores to be used while alter partition
+carbon.number.of.cores.while.altPartition=2
--- End diff --

1. Please check whether the parameter 
"carbon.number.of.cores.while.altPartition=2" is necessary.
2. If it is, I suggest naming it 
carbon.number.of.cores.while.alterPartition



[GitHub] carbondata pull request #1192: [CARBONDATA-940] alter table add/split partit...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1192#discussion_r129513477
  
--- Diff: 
integration/spark-common-test/src/test/resources/partition_data.csv ---
@@ -0,0 +1,27 @@
+id,vin,logdate,phonenumber,country,area,salary
--- End diff --

Can you try to reuse the existing CSV files or generate the data? 
I don't suggest adding so many CSV files to the repo.



[GitHub] carbondata pull request #1193: [CARBONDATA-1327] Add carbon sort column exam...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1193#discussion_r129511350
  
--- Diff: 
examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSortColumnsExample.scala
 ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.examples
+
+import java.io.File
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+object CarbonSortColumnsExample {
+
+  def main(args: Array[String]) {
+val rootPath = new File(this.getClass.getResource("/").getPath
++ "../../../..").getCanonicalPath
+val storeLocation = s"$rootPath/examples/spark2/target/store"
+val warehouse = s"$rootPath/examples/spark2/target/warehouse"
+val metastoredb = s"$rootPath/examples/spark2/target"
+
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd HH:mm:ss")
+  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
+  .addProperty(CarbonCommonConstants.ENABLE_UNSAFE_COLUMN_PAGE_LOADING, "true")
+
+import org.apache.spark.sql.CarbonSession._
+val spark = SparkSession
+  .builder()
+  .master("local")
+  .appName("CarbonSortColumnsExample")
+  .config("spark.sql.warehouse.dir", warehouse)
+  .config("spark.driver.host", "localhost")
+  .getOrCreateCarbonSession(storeLocation, metastoredb)
+
+spark.sparkContext.setLogLevel("WARN")
+
+spark.sql("DROP TABLE IF EXISTS sort_columns_table")
+
+// Create table with no sort columns
+spark.sql(
+  s"""
+ | CREATE TABLE no_sort_columns_table(
+ | shortField SHORT,
+ | intField INT,
+ | bigintField LONG,
+ | doubleField DOUBLE,
+ | stringField STRING,
+ | timestampField TIMESTAMP,
+ | decimalField DECIMAL(18,2),
+ | dateField DATE,
+ | charField CHAR(5),
+ | floatField FLOAT,
+ | complexData ARRAY<STRING>
+ | )
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='')
+   """.stripMargin)
+
+// Create table with sort columns
+spark.sql(
+  s"""
+ | CREATE TABLE sort_columns_table(
+ | shortField SHORT,
+ | intField INT,
+ | bigintField LONG,
+ | doubleField DOUBLE,
+ | stringField STRING,
+ | timestampField TIMESTAMP,
+ | decimalField DECIMAL(18,2),
+ | dateField DATE,
+ | charField CHAR(5),
+ | floatField FLOAT,
+ | complexData ARRAY<STRING>
+ | )
+ | STORED BY 'carbondata'
+ | TBLPROPERTIES('SORT_COLUMNS'='intField, stringField, charField')
+   """.stripMargin)
+
+val path = s"$rootPath/examples/spark2/src/main/resources/data.csv"
+
+// scalastyle:off
+spark.sql(
+  s"""
+ | LOAD DATA LOCAL INPATH '$path'
+ | INTO TABLE no_sort_columns_table
+ | OPTIONS('FILEHEADER'='shortField,intField,bigintField,doubleField,stringField,timestampField,decimalField,dateField,charField,floatField,complexData',
+ | 'COMPLEX_DELIMITER_LEVEL_1'='#')
--- End diff --

Currently, SORT_COLUMNS doesn't support float, double, or decimal columns, but
it does support other numeric types (such as int and long). Please add a
comment about this to the example.
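
To make the restriction concrete, here is a minimal sketch of the kind of check the comment describes. It is not CarbonData code: the class, the method names, and the schema map are hypothetical, and the unsupported set is taken only from the remark above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical validator, not part of CarbonData.
class SortColumnTypeCheck {
  // Types the review comment says SORT_COLUMNS does not support.
  static final Set<String> UNSUPPORTED =
      new HashSet<>(Arrays.asList("FLOAT", "DOUBLE", "DECIMAL"));

  // A subset of the columns declared in the example's CREATE TABLE statement.
  static Map<String, String> exampleSchema() {
    Map<String, String> schema = new LinkedHashMap<>();
    schema.put("intField", "INT");
    schema.put("doubleField", "DOUBLE");
    schema.put("stringField", "STRING");
    schema.put("decimalField", "DECIMAL(18,2)");
    schema.put("charField", "CHAR(5)");
    schema.put("floatField", "FLOAT");
    return schema;
  }

  // Returns the names listed in sortColumns whose type is unsupported.
  static List<String> unsupportedSortColumns(Map<String, String> schema, String sortColumns) {
    List<String> rejected = new ArrayList<>();
    for (String raw : sortColumns.split(",")) {
      String name = raw.trim();
      String type = schema.get(name);
      if (name.isEmpty() || type == null) {
        continue; // empty entry or unknown column: out of scope for this sketch
      }
      // Drop precision and scale, e.g. DECIMAL(18,2) becomes DECIMAL.
      String base = type.toUpperCase(Locale.ROOT).replaceAll("\\(.*\\)", "").trim();
      if (UNSUPPORTED.contains(base)) {
        rejected.add(name);
      }
    }
    return rejected;
  }

  public static void main(String[] args) {
    System.out.println(unsupportedSortColumns(exampleSchema(), "intField, stringField, charField"));
    System.out.println(unsupportedSortColumns(exampleSchema(), "intField, decimalField, doubleField"));
  }
}
```

The sort columns used in the example (intField, stringField, charField) pass this check, which is why the example works as written.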



[GitHub] carbondata issue #1196: Rebase datamap onto master

2017-07-26 Thread chenliang613

Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1196
  
Please change the PR name to: Rebase datamap branch onto master



[GitHub] carbondata pull request #1193: [CARBONDATA-1327] Add carbon sort column exam...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1193#discussion_r129509969
  
--- Diff: 
examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSortColumnsExample.scala
 ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.examples
+
+import java.io.File
+
+import org.apache.spark.sql.SparkSession
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+object CarbonSortColumnsExample {
+
+  def main(args: Array[String]) {
+val rootPath = new File(this.getClass.getResource("/").getPath
++ "../../../..").getCanonicalPath
+val storeLocation = s"$rootPath/examples/spark2/target/store"
+val warehouse = s"$rootPath/examples/spark2/target/warehouse"
+val metastoredb = s"$rootPath/examples/spark2/target"
+
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd HH:mm:ss")
+  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
+  .addProperty(CarbonCommonConstants.ENABLE_UNSAFE_COLUMN_PAGE_LOADING, "true")
--- End diff --

Can you explain why you added:
addProperty(CarbonCommonConstants.ENABLE_UNSAFE_COLUMN_PAGE_LOADING, "true")



[GitHub] carbondata pull request #1193: [CARBONDATA-1327] Add carbon sort column exam...

2017-07-26 Thread chenliang613

Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1193#discussion_r129508099
  
--- Diff: docs/ddl-operation-on-carbondata.md ---
@@ -101,6 +101,14 @@ The following DDL operations are supported in CarbonData :
 
- All dimensions except complex datatype columns are part of multi dimensional key(MDK). This behavior can be overridden by using TBLPROPERTIES. If the user wants to keep any column (except columns of complex datatype) in multi dimensional key then he can keep the columns either in DICTIONARY_EXCLUDE or DICTIONARY_INCLUDE.
 
+   - **Sort Columns Configuration**
+
+  It is used to specify the  multi dimensional key(MDK) columns. By default MDK is composed of all dimension columns except complex datatype column. 
+
--- End diff --

A description of the "SORT_COLUMNS" property is needed here:
The "SORT_COLUMNS" property lets users specify which columns belong to the
MDK index.
If the user doesn't specify "SORT_COLUMNS", the MDK index is built by default
from all dimension columns except complex datatype columns.
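
The rule can be sketched as below. This is a simplified model for the docs discussion, not CarbonData's implementation: the Column class, its dimension and complex flags, and the method names are all hypothetical, and which columns count as dimensions is supplied as input rather than derived.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the default described above; not CarbonData code.
class MdkColumnResolver {
  static final class Column {
    final String name;
    final boolean dimension; // assumed to be decided elsewhere
    final boolean complex;   // e.g. ARRAY or STRUCT columns

    Column(String name, boolean dimension, boolean complex) {
      this.name = name;
      this.dimension = dimension;
      this.complex = complex;
    }
  }

  // Illustrative columns; the dimension flags are assumptions for this sketch.
  static List<Column> exampleColumns() {
    return Arrays.asList(
        new Column("intField", false, false),
        new Column("stringField", true, false),
        new Column("charField", true, false),
        new Column("complexData", true, true));
  }

  // sortColumns == null means the property was not specified at all.
  // An empty list models 'SORT_COLUMNS'='' (explicitly no sort columns).
  static List<String> resolveMdkColumns(List<Column> columns, List<String> sortColumns) {
    if (sortColumns != null) {
      return new ArrayList<>(sortColumns);
    }
    List<String> result = new ArrayList<>();
    for (Column c : columns) {
      if (c.dimension && !c.complex) {
        result.add(c.name);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(resolveMdkColumns(exampleColumns(), null));
    System.out.println(resolveMdkColumns(exampleColumns(),
        Arrays.asList("intField", "stringField", "charField")));
  }
}
```

Keeping "not specified" (null) separate from "specified as empty" matters here, because the example in this PR creates a table with 'SORT_COLUMNS'='' to get no sort columns at all.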



[GitHub] carbondata issue #943: [CARBONDATA-1086]Added documentation for BATCH SORT S...

2017-07-26 Thread CarbonDataQA

Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/943
  
Build Success with Spark 1.6, Please check CI 
http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/597/



