Jenkins build is still unstable: carbondata-master-spark-2.2 #1681

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Spark Common Test #1681

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Store SDK #1681

2019-05-16 Thread Apache Jenkins Server
See 




[carbondata] branch master updated: [DOC] Update doc for sort_columns modification

2019-05-16 Thread kunalkapoor
This is an automated email from the ASF dual-hosted git repository.

kunalkapoor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new fa91d02  [DOC] Update doc for sort_columns modification
fa91d02 is described below

commit fa91d02af8d8f4cba26ff5bd7e0581c592b50573
Author: QiangCai 
AuthorDate: Mon May 6 10:39:19 2019 +0800

[DOC] Update doc for sort_columns modification

Update doc for sort_columns modification

This closes #3203
---
 docs/ddl-of-carbondata.md | 21 +
 1 file changed, 21 insertions(+)

diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 88615a2..5bc8f10 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -793,6 +793,27 @@ Users can specify which columns to include and exclude for 
local dictionary gene
ALTER TABLE tablename UNSET TBLPROPERTIES('SORT_SCOPE')
```
 
+ - # SORT COLUMNS
+   Example to SET SORT COLUMNS:
+   ```
+   ALTER TABLE tablename SET TBLPROPERTIES('SORT_COLUMNS'='column1')
+   ```
+   After this operation, the new loading will use the new SORT_COLUMNS. 
The user can adjust 
+   the SORT_COLUMNS according to the query, but it will not impact the old 
data directly. So 
+   it will not impact the query performance of the old data segments which 
are not sorted by 
+   new SORT_COLUMNS.  
+   
+   UNSET is not supported, but it can set SORT_COLUMNS to empty string 
instead of using UNSET.
+   ```
+   ALTER TABLE tablename SET TBLPROPERTIES('SORT_COLUMNS'='')
+   ```
+
+   **NOTE:**
+* The future version will enhance "custom" compaction to sort the old 
segment one by one.
+* The streaming table is not supported for SORT_COLUMNS modification.
+* If the inverted index columns are removed from the new SORT_COLUMNS, 
they will not 
+create the inverted index. But the old configuration of INVERTED_INDEX 
will be kept.
+
 ### DROP TABLE
 
   This command is used to delete an existing table.
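
A minimal usage sketch of the behaviour documented above, written as Spark SQL statements
the way the project's Scala test suites issue them (hypothetical table name `sales`;
assumes a test context where `sql(...)` is available). Loads performed after the ALTER use
the new SORT_COLUMNS, while segments loaded earlier keep their original sort order:

```scala
// Hypothetical illustration of the documented SORT_COLUMNS modification.
sql("ALTER TABLE sales SET TBLPROPERTIES('SORT_COLUMNS'='country,city')")
// New loads are sorted by country,city; previously loaded segments are untouched.
sql("LOAD DATA INPATH '/tmp/sales_2019.csv' INTO TABLE sales")
// UNSET is not supported; setting SORT_COLUMNS to an empty string has the same effect.
sql("ALTER TABLE sales SET TBLPROPERTIES('SORT_COLUMNS'='')")
```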



[carbondata] branch master updated: [CARBONDATA-3362] Document update for pagesize table property scenario

2019-05-16 Thread kunalkapoor
This is an automated email from the ASF dual-hosted git repository.

kunalkapoor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new bf3ce9d  [CARBONDATA-3362] Document update for pagesize table property 
scenario
bf3ce9d is described below

commit bf3ce9d557d2ccf3656e5a9b7152955360cddaae
Author: ajantha-bhat 
AuthorDate: Tue May 7 14:36:05 2019 +0530

[CARBONDATA-3362] Document update for pagesize table property scenario

Document update for pagesize table property scenario.

This closes #3206
---
 docs/carbon-as-spark-datasource-guide.md | 2 +-
 docs/ddl-of-carbondata.md| 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/carbon-as-spark-datasource-guide.md 
b/docs/carbon-as-spark-datasource-guide.md
index 598acb0..fe46b09 100644
--- a/docs/carbon-as-spark-datasource-guide.md
+++ b/docs/carbon-as-spark-datasource-guide.md
@@ -44,7 +44,7 @@ Now you can create Carbon table using Spark's datasource DDL 
syntax.
 |---|--||
 | table_blocksize | 1024 | Size of blocks to write onto hdfs. For  more 
details, see [Table Block Size 
Configuration](./ddl-of-carbondata.md#table-block-size-configuration). |
 | table_blocklet_size | 64 | Size of blocklet to write. |
-| table_page_size_inmb | 0 | Size of each page in carbon table, if page size 
crosses this value before 32000 rows, page will be cut to that may rows. Helps 
in keep page size to fit cache size |
+| table_page_size_inmb | 0 | Size of each page in carbon table, if page size 
crosses this value before 32000 rows, page will be cut to that many rows. Helps 
in keep page size to fit cache size |
 | local_dictionary_threshold | 1 | Cardinality upto which the local 
dictionary can be generated. For  more details, see [Local Dictionary 
Configuration](./ddl-of-carbondata.md#local-dictionary-configuration). |
 | local_dictionary_enable | false | Enable local dictionary generation. For  
more details, see [Local Dictionary 
Configuration](./ddl-of-carbondata.md#local-dictionary-configuration). |
 | sort_columns | all dimensions are sorted | Columns to include in sort and 
its order of sort. For  more details, see [Sort Columns 
Configuration](./ddl-of-carbondata.md#sort-columns-configuration). |
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 5bc8f10..34eca8d 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -291,6 +291,11 @@ CarbonData DDL statements are documented here,which 
includes:
  If page size crosses this value before 32000 rows, page will be cut to 
that many rows. 
  Helps in keeping page size to fit cpu cache size.
 
+ This property can be configured if the table has string, varchar, binary 
or complex datatype columns.
+ Because for these columns 32000 rows in one page may exceed 1755 MB and 
snappy compression will fail in that scenario.
+ Also if page size is huge, page cannot be fit in CPU cache. 
+ So, configuring smaller values of this property (say 1 MB) can result in 
better use of CPU cache for pages.
+
  Example usage:
  ```
  TBLPROPERTIES ('TABLE_PAGE_SIZE_INMB'='5')
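
As a quick, hypothetical illustration of the property documented above (table and column
names are made up; assumes a Spark test context where `sql(...)` is available):

```scala
// Cut pages at roughly 1 MB instead of waiting for the full 32000 rows; useful for
// string/varchar/binary/complex columns and keeps pages small enough for the CPU cache.
sql(
  """
    | CREATE TABLE page_size_demo (id INT, payload STRING)
    | STORED BY 'org.apache.carbondata.format'
    | TBLPROPERTIES ('TABLE_PAGE_SIZE_INMB'='1')
  """.stripMargin)
```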



[carbondata] branch master updated: [CARBONDATA-3374] Optimize documentation and fix some spell errors.

2019-05-16 Thread kunalkapoor
This is an automated email from the ASF dual-hosted git repository.

kunalkapoor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new ed17abb  [CARBONDATA-3374] Optimize documentation and fix some spell 
errors.
ed17abb is described below

commit ed17abbff1fa08d60085979a6c42b7b569dae141
Author: xubo245 
AuthorDate: Tue May 7 20:47:13 2019 +0800

[CARBONDATA-3374] Optimize documentation and fix some spell errors.

Optimize documentation and fix some spell errors.

This closes #3207
---
 .../apache/carbondata/core/datamap/dev/DataMapFactory.java   |  4 ++--
 .../carbondata/core/indexstore/BlockletDetailsFetcher.java   |  4 ++--
 .../indexstore/blockletindex/BlockletDataMapFactory.java |  2 +-
 .../carbondata/core/indexstore/schema/SchemaGenerator.java   |  2 +-
 .../apache/carbondata/core/util/path/CarbonTablePath.java|  2 +-
 .../carbondata/datamap/lucene/LuceneDataMapFactoryBase.java  |  2 +-
 .../datamap/lucene/LuceneFineGrainDataMapFactory.java|  2 +-
 docs/carbon-as-spark-datasource-guide.md |  2 +-
 docs/ddl-of-carbondata.md| 12 +++-
 .../spark/testsuite/dataload/TestLoadDataGeneral.scala   |  4 ++--
 .../spark/testsuite/datamap/CGDataMapTestCase.scala  |  4 ++--
 .../spark/testsuite/datamap/DataMapWriterSuite.scala |  2 +-
 .../spark/testsuite/datamap/FGDataMapTestCase.scala  |  4 ++--
 .../apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala   |  2 +-
 .../org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala   |  6 +++---
 .../execution/datasources/SparkCarbonFileFormat.scala|  3 ++-
 .../scala/org/apache/spark/sql/CarbonCatalystOperators.scala |  4 ++--
 .../execution/command/management/CarbonLoadDataCommand.scala |  2 +-
 .../scala/org/apache/spark/sql/optimizer/CarbonFilters.scala |  2 +-
 19 files changed, 34 insertions(+), 31 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
index ee7914d..b32a482 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
@@ -88,7 +88,7 @@ public abstract class DataMapFactory {
   }
 
   /**
-   * Get the datamap for segmentid
+   * Get the datamap for segmentId
*/
   public abstract List getDataMaps(Segment segment) throws IOException;
 
@@ -99,7 +99,7 @@ public abstract class DataMapFactory {
   throws IOException;
 
   /**
-   * Get all distributable objects of a segmentid
+   * Get all distributable objects of a segmentId
* @return
*/
   public abstract List toDistributable(Segment segment);
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
index 1971f40..ae01e9e 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
@@ -27,7 +27,7 @@ import org.apache.carbondata.core.datamap.Segment;
 public interface BlockletDetailsFetcher {
 
   /**
-   * Get the blocklet detail information based on blockletid, blockid and 
segmentid.
+   * Get the blocklet detail information based on blockletid, blockid and 
segmentId.
*
* @param blocklets
* @param segment
@@ -38,7 +38,7 @@ public interface BlockletDetailsFetcher {
   throws IOException;
 
   /**
-   * Get the blocklet detail information based on blockletid, blockid and 
segmentid.
+   * Get the blocklet detail information based on blockletid, blockid and 
segmentId.
*
* @param blocklet
* @param segment
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
index 2ef7b88..93be06e 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
@@ -185,7 +185,7 @@ public class BlockletDataMapFactory extends 
CoarseGrainDataMapFactory
   }
 
   /**
-   * Get the blocklet detail information based on blockletid, blockid and 
segmentid. This method is
+   * Get the blocklet detail information based on blockletid, blockid and 
segmentId. This method is
* exclusively for BlockletDataMapFactory as detail information is only 
available in this
* default datamap.
*/
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
 
b/core/src/main/java/o

Jenkins build is still unstable: carbondata-master-spark-2.1 #3524

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build became unstable: carbondata-master-spark-2.1 » Apache CarbonData :: Spark2 #3524

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.1 » Apache CarbonData :: Store SDK #3524

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Store SDK #1682

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Spark Common Test #1682

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 #1682

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 #1684

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Store SDK #1684

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Spark Common Test #1684

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Store SDK #1683

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Spark Common Test #1683

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 #1683

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : carbondata-master-spark-2.1 » Apache CarbonData :: Spark2 #3525

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.1 » Apache CarbonData :: Store SDK #3525

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.1 #3525

2019-05-16 Thread Apache Jenkins Server
See 




[carbondata] branch master updated: [CARBONDATA-3377] Fix for Null pointer exception in Range Col compaction

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 9932a6d  [CARBONDATA-3377] Fix for Null pointer exception in Range Col 
compaction
9932a6d is described below

commit 9932a6d768c2f6bff70ac861d20a403cecc640b5
Author: manishnalla1994 
AuthorDate: Fri May 10 15:43:10 2019 +0530

[CARBONDATA-3377] Fix for Null pointer exception in Range Col compaction

Problem: A string-type column with huge strings and null values fails with a
NullPointerException when it is the range column and compaction is performed.

Solution: Added a check for null values in StringOrdering.

This closes #3212
---
 .../core/constants/CarbonCommonConstants.java  |  4 +++
 .../carbondata/core/util/CarbonProperties.java |  6 
 .../dataload/TestRangeColumnDataLoad.scala | 42 +-
 .../spark/load/DataLoadProcessBuilderOnSpark.scala | 16 ++---
 .../carbondata/spark/rdd/CarbonMergerRDD.scala |  7 ++--
 5 files changed, 67 insertions(+), 8 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index ba8e20a..43544cb 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1193,6 +1193,10 @@ public final class CarbonCommonConstants {
 
   public static final String CARBON_RANGE_COLUMN_SCALE_FACTOR_DEFAULT = "3";
 
+  public static final String CARBON_ENABLE_RANGE_COMPACTION = 
"carbon.enable.range.compaction";
+
+  public static final String CARBON_ENABLE_RANGE_COMPACTION_DEFAULT = "false";
+
   
//
   // Query parameter start here
   
//
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java 
b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
index 004a51e..e26f3d8 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
@@ -1507,6 +1507,12 @@ public final class CarbonProperties {
 return Boolean.parseBoolean(pushFilters);
   }
 
+  public boolean isRangeCompactionAllowed() {
+String isRangeCompact = 
getProperty(CarbonCommonConstants.CARBON_ENABLE_RANGE_COMPACTION,
+CarbonCommonConstants.CARBON_ENABLE_RANGE_COMPACTION_DEFAULT);
+return Boolean.parseBoolean(isRangeCompact);
+  }
+
   private void validateSortMemorySpillPercentage() {
 String spillPercentageStr = carbonProperties.getProperty(
 CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE,
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
index 5d6730f..165e4f8 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
@@ -610,6 +610,34 @@ class TestRangeColumnDataLoad extends QueryTest with 
BeforeAndAfterEach with Bef
 sql("DROP TABLE IF EXISTS carbon_range_column1")
   }
 
+  test("Test compaction for range_column - STRING Datatype null values") {
+sql("DROP TABLE IF EXISTS carbon_range_column1")
+deleteFile(filePath2)
+createFile(filePath2, 20, 14)
+sql(
+  """
+| CREATE TABLE carbon_range_column1(id INT, name STRING, city STRING, 
age LONG)
+| STORED BY 'org.apache.carbondata.format'
+| TBLPROPERTIES('SORT_SCOPE'='LOCAL_SORT', 'SORT_COLUMNS'='city',
+| 'range_column'='city')
+  """.stripMargin)
+
+sql(s"LOAD DATA LOCAL INPATH '$filePath2' INTO TABLE carbon_range_column1 
" +
+"OPTIONS('BAD_RECORDS_ACTION'='FORCE','HEADER'='false')")
+
+sql(s"LOAD DATA LOCAL INPATH '$filePath2' INTO TABLE carbon_range_column1 
" +
+"OPTIONS('BAD_RECORDS_ACTION'='FORCE','HEADER'='false')")
+
+var res = sql("select * from carbon_range_column1").collect()
+
+sql("ALTER TABLE carbon_range_column1 COMPACT 'MAJOR'")
+
+checkAnswer(sql("select * from carbon_range_column1"), res)
+
+sql("DROP TABLE IF EXISTS carbon_range_column1")
+deleteFile(filePath2)
+  }
+
   test("Test compaction for range_column - STRING Datatype min/max not 
stored") {
 deleteFile(filePath2)
 createFile(filePath2, 1000, 
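
The commit message above says the fix adds a null check in StringOrdering. A minimal,
self-contained sketch of such a null-safe ordering (an illustrative stand-in, not the
project's actual class):

```scala
// Illustrative null-safe ordering: nulls sort first, so comparing a null range-column
// value during compaction no longer throws a NullPointerException.
object NullSafeStringOrdering extends Ordering[String] {
  override def compare(x: String, y: String): Int = (x, y) match {
    case (null, null) => 0
    case (null, _)    => -1
    case (_, null)    => 1
    case _            => x.compareTo(y)
  }
}

// Example: Seq("b", null, "a").sorted(NullSafeStringOrdering) == Seq(null, "a", "b")
```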

[carbondata] branch master updated: [CARBONDATA-3391] Count star output is wrong when BLOCKLET CACHE is enabled

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 5a4684f  [CARBONDATA-3391] Count star output is wrong when BLOCKLET 
CACHE is enabled
5a4684f is described below

commit 5a4684f7f085fbad08f37b1cb1c6974b886e0e4c
Author: BJangir 
AuthorDate: Thu May 16 14:53:21 2019 +0530

[CARBONDATA-3391] Count star output is wrong when BLOCKLET CACHE is enabled

Wrong Count(*) value when blocklet cache is enabled.
Root cause: blockletToRowCountMap is keyed by segmentNo + carbondata file name, so when
a carbondata file has multiple blocklets, blockletToRowCountMap overwrites the existing
row count.

Solution: add to the existing row count if one is already present.

This closes #3225
---
 .../carbondata/core/indexstore/blockletindex/BlockDataMap.java| 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
index 1fc5831..13e612d 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
@@ -689,7 +689,13 @@ public class BlockDataMap extends CoarseGrainDataMap
   CarbonCommonConstants.DEFAULT_CHARSET_CLASS) + 
CarbonTablePath.getCarbonDataExtension();
   int rowCount = dataMapRow.getInt(ROW_COUNT_INDEX);
   // prepend segment number with the blocklet file path
-  blockletToRowCountMap.put((segment.getSegmentNo() + "," + fileName), 
(long) rowCount);
+  String blockletMapKey = segment.getSegmentNo() + "," + fileName;
+  Long existingCount = blockletToRowCountMap.get(blockletMapKey);
+  if (null != existingCount) {
+blockletToRowCountMap.put(blockletMapKey, (long) rowCount + 
existingCount);
+  } else {
+blockletToRowCountMap.put(blockletMapKey, (long) rowCount);
+  }
 }
 return blockletToRowCountMap;
   }
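
A small sketch of the accumulation pattern the fix introduces (simplified and
hypothetical; the real code lives in BlockDataMap and keys the map by segment number
plus carbondata file name):

```scala
import scala.collection.mutable

// Accumulate row counts per "segmentNo,fileName" key instead of overwriting, so a
// carbondata file with multiple blocklets contributes all of its rows to count(*).
val blockletToRowCountMap = mutable.Map[String, Long]()

def addRowCount(segmentNo: String, fileName: String, rowCount: Long): Unit = {
  val key = s"$segmentNo,$fileName"
  blockletToRowCountMap(key) = blockletToRowCountMap.getOrElse(key, 0L) + rowCount
}
```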



[carbondata] 01/22: [CARBONDATA-3341] fixed invalid NULL result in filter query

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 9a54c88aa7e65ae08fd1fd36c53c8fa0a02db1a3
Author: kunal642 
AuthorDate: Thu Apr 4 11:53:05 2019 +0530

[CARBONDATA-3341] fixed invalid NULL result in filter query

Problem: When vector filter push-down is true and the table contains a null value,
the getNullBitSet method returns an empty byte[] to represent null,
but there is no check for this value of the bitset.

Solution: If the null bitset length is 0, set it as the chunkData.

This closes #3172
---
 .../core/datastore/chunk/store/ColumnPageWrapper.java  |  7 ++-
 .../spark/testsuite/sortcolumns/TestSortColumns.scala  | 14 ++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/ColumnPageWrapper.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/ColumnPageWrapper.java
index a1c4aec..f4d3fe4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/ColumnPageWrapper.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/ColumnPageWrapper.java
@@ -261,7 +261,12 @@ public class ColumnPageWrapper implements 
DimensionColumnPage {
   // if the compare value is null and the data is also null we can 
directly return 0
   return 0;
 } else {
-  byte[] chunkData = this.getChunkDataInBytes(rowId);
+  byte[] chunkData;
+  if (nullBitSet != null && nullBitSet.length == 0) {
+chunkData = nullBitSet;
+  } else {
+chunkData = this.getChunkDataInBytes(rowId);
+  }
   return ByteUtil.UnsafeComparer.INSTANCE.compareTo(chunkData, 
compareValue);
 }
   }
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
index df97d0f..bbd58c0 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
@@ -385,6 +385,17 @@ class TestSortColumns extends QueryTest with 
BeforeAndAfterAll {
 "sort_columns is unsupported for double datatype column: empno"))
   }
 
+  test("test if equal to 0 filter on sort column gives correct result") {
+
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_PUSH_ROW_FILTERS_FOR_VECTOR,
+  "true")
+sql("create table test1(a bigint) stored by 'carbondata' 
TBLPROPERTIES('sort_columns'='a')")
+sql("insert into test1 select 'k'")
+sql("insert into test1 select '1'")
+assert(sql("select * from test1 where a = 1 or a = 0").count() == 1)
+
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_PUSH_ROW_FILTERS_FOR_VECTOR,
+  CarbonCommonConstants.CARBON_PUSH_ROW_FILTERS_FOR_VECTOR_DEFAULT)
+  }
+
   override def afterAll = {
 dropTestTables
 CarbonProperties.getInstance().addProperty(
@@ -392,9 +403,12 @@ class TestSortColumns extends QueryTest with 
BeforeAndAfterAll {
 CarbonProperties.getInstance()
   .addProperty(CarbonCommonConstants.LOAD_SORT_SCOPE,
 CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT)
+
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_PUSH_ROW_FILTERS_FOR_VECTOR,
+  CarbonCommonConstants.CARBON_PUSH_ROW_FILTERS_FOR_VECTOR_DEFAULT)
   }
 
   def dropTestTables = {
+sql("drop table if exists test1")
 sql("drop table if exists sortint")
 sql("drop table if exists sortint1")
 sql("drop table if exists sortlong")
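
The essence of the fix, reduced to a hypothetical helper (the real change is inside
ColumnPageWrapper): when the null bitset returned for a row is an empty byte[], compare
against it directly instead of reading the chunk data.

```scala
// Simplified stand-in for the ColumnPageWrapper change: an empty null bitset marks the
// row as NULL, so it must not be replaced by the decoded chunk bytes before comparison.
def bytesToCompare(nullBitSet: Array[Byte], readChunkData: () => Array[Byte]): Array[Byte] =
  if (nullBitSet != null && nullBitSet.isEmpty) nullBitSet
  else readChunkData()
```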



[carbondata] 03/22: [CARBONDATA-3331] Fix for external table in Show Metacache

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit f0e270667b0be7c7b66d922ddc176ebe4334a695
Author: namanrastogi 
AuthorDate: Tue Mar 26 19:39:26 2019 +0530

[CARBONDATA-3331] Fix for external table in Show Metacache

Problem: When the SHOW METACACHE command is run on an external table, the database size
is greater than the ALL size for the index column. The external table is created on a
store of a table that is also present in the current database.

Bug: The index size for the database was being summed up blindly for all the tables, so
some cache entries were summed multiple times.

Solution: Compute the database index size while iterating over the cache, when the cache
key contains the path of one of the tables.

This closes #3164
---
 .../command/cache/CarbonShowCacheCommand.scala | 91 --
 1 file changed, 52 insertions(+), 39 deletions(-)

diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
index 8461bf3..3b85313 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
@@ -20,7 +20,6 @@ package org.apache.spark.sql.execution.command.cache
 import scala.collection.mutable
 import scala.collection.JavaConverters._
 
-import org.apache.hadoop.mapred.JobConf
 import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
 import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
@@ -28,16 +27,13 @@ import 
org.apache.spark.sql.catalyst.expressions.AttributeReference
 import org.apache.spark.sql.execution.command.{Checker, MetadataCommand}
 import org.apache.spark.sql.types.StringType
 
-import org.apache.carbondata.core.cache.{CacheProvider, CacheType}
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.cache.CacheProvider
 import org.apache.carbondata.core.cache.dictionary.AbstractColumnDictionaryInfo
-import org.apache.carbondata.core.constants.CarbonCommonConstants
-import org.apache.carbondata.core.datamap.Segment
 import org.apache.carbondata.core.indexstore.BlockletDataMapIndexWrapper
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
-import org.apache.carbondata.core.readcommitter.LatestFilesReadCommittedScope
 import org.apache.carbondata.datamap.bloom.BloomCacheKeyValue
 import org.apache.carbondata.events.{OperationContext, OperationListenerBus, 
ShowTableCacheEvent}
-import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
 import org.apache.carbondata.spark.util.CommonUtil.bytesToDisplaySize
 
 
@@ -45,6 +41,8 @@ case class CarbonShowCacheCommand(tableIdentifier: 
Option[TableIdentifier],
 internalCall: Boolean = false)
   extends MetadataCommand {
 
+  val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
   override def output: Seq[AttributeReference] = {
 if (tableIdentifier.isEmpty) {
   Seq(
@@ -71,38 +69,48 @@ case class CarbonShowCacheCommand(tableIdentifier: 
Option[TableIdentifier],
 Row("ALL", "ALL", 0L, 0L, 0L),
 Row(currentDatabase, "ALL", 0L, 0L, 0L))
 } else {
-  val carbonTables = CarbonEnv.getInstance(sparkSession).carbonMetaStore
-.listAllTables(sparkSession).filter {
-carbonTable =>
-  carbonTable.getDatabaseName.equalsIgnoreCase(currentDatabase) &&
-  isValidTable(carbonTable, sparkSession) &&
-  !carbonTable.isChildDataMap
+  var carbonTables = mutable.ArrayBuffer[CarbonTable]()
+  sparkSession.sessionState.catalog.listTables(currentDatabase).foreach {
+tableIdent =>
+  try {
+val carbonTable = 
CarbonEnv.getCarbonTable(tableIdent)(sparkSession)
+if (!carbonTable.isChildDataMap) {
+  carbonTables += carbonTable
+}
+  } catch {
+case ex: NoSuchTableException =>
+  LOGGER.debug("Ignoring non-carbon table " + tableIdent.table)
+  }
   }
 
   // All tables of current database
-  var (dbIndexSize, dbDatamapSize, dbDictSize) = (0L, 0L, 0L)
-  val tableList: Seq[Row] = carbonTables.map {
+  var (dbDatamapSize, dbDictSize) = (0L, 0L)
+  val tableList = carbonTables.flatMap {
 carbonTable =>
-  val tableResult = getTableCache(sparkSession, carbonTable)
-  var (indexSize, datamapSize) = (tableResult(0).getLong(1), 0L)
-  tableResult.drop(2).foreach {
-row =>
-  indexSize += row.getLong(1)
-  datamapSize += row.getLong(

[carbondata] branch branch-1.5 updated (4f95559 -> ea1e86c)

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


from 4f95559  [maven-release-plugin] prepare for next development iteration
 new 9a54c88  [CARBONDATA-3341] fixed invalid NULL result in filter query
 new eeb7e3a  [CARBONDATA-3001] configurable page size in MB
 new f0e2706  [CARBONDATA-3331] Fix for external table in Show Metacache
 new d1bb3a0  [CARBONDATA-3334] fixed multiple segment file issue for 
partition
 new 743b843  [CARBONDATA-3353 ]Fixed MinMax Based Pruning for Measure 
column in case of Legacy store
 new 4f7f17d  [CARBONDATA-3344] Fix Drop column not present in table
 new 69b8873  [CARBONDATA-3351] Support Binary Data Type
 new d1b455f  [CARBONDATA-3348] Support alter SORT_COLUMNS property
 new 9a9c791  [CARBONDATA-3353 ][HOTFIX]Fixed MinMax Based Pruning for 
Measure column in case of Legacy store
 new 9f23d2c  [HOTFIX] support compact segments with different sort_columns
 new f7cdb47  [CARBONDATA-3359]Fix data mismatch issue for decimal column 
after delete operation
 new f80a28d  [CARBONDATA-3345] A growing streaming ROW_V1 carbondata file would have 
some InputSplits ignored
 new 7449c34  [CARBONDATA-3343] Compaction for Range Sort
 new bc80a22  [CARBONDATA-3360]fix NullPointerException in delete and clean 
files operation
 new f46ad43  [CARBONDATA-3369] Fix issues during concurrent execution of 
Create table If not exists
 new d8b0ff4  [CARBONDATA-3371] Fix ArrayIndexOutOfBoundsException of 
compaction after sort_columns modification
 new 7e7792e  [CARBONDATA-3375] [CARBONDATA-3376] Fix GC Overhead limit 
exceeded issue and partition column as range column issue
 new 4d21b6a  [DOC] Update doc for sort_columns modification
 new 251cbdc  [CARBONDATA-3362] Document update for pagesize table property 
scenario
 new b42f1ac  [CARBONDATA-3374] Optimize documentation and fix some spell 
errors.
 new 4abed04  [CARBONDATA-3377] Fix for Null pointer exception in Range Col 
compaction
 new ea1e86c  [CARBONDATA-3391] Count star output is wrong when BLOCKLET 
CACHE is enabled

The 22 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../core/constants/CarbonCommonConstants.java  |   25 +
 .../carbondata/core/datamap/DataMapFilter.java |   89 ++
 .../carbondata/core/datamap/TableDataMap.java  |   91 +-
 .../core/datamap/dev/DataMapFactory.java   |4 +-
 .../datamap/dev/expr/DataMapExprWrapperImpl.java   |3 +-
 .../block/SegmentPropertiesAndSchemaHolder.java|   13 +-
 .../core/datastore/blocklet/EncodedBlocklet.java   |   19 +
 .../datastore/chunk/store/ColumnPageWrapper.java   |7 +-
 .../safe/AbstractNonDictionaryVectorFiller.java|2 +-
 .../SafeVariableLengthDimensionDataChunkStore.java |2 +-
 .../carbondata/core/datastore/page/ColumnPage.java |   11 +-
 .../core/datastore/page/LazyColumnPage.java|2 +
 .../datastore/page/UnsafeVarLengthColumnPage.java  |7 +-
 .../datastore/page/VarLengthColumnPageBase.java|1 +
 .../page/encoding/ColumnPageEncoderMeta.java   |4 +-
 .../page/encoding/DefaultEncodingFactory.java  |   14 +-
 .../carbondata/core/datastore/row/CarbonRow.java   |6 +-
 .../core/indexstore/BlockletDetailsFetcher.java|4 +-
 .../indexstore/blockletindex/BlockDataMap.java |   23 +-
 .../blockletindex/BlockletDataMapFactory.java  |2 +-
 .../blockletindex/BlockletDataRefNode.java |6 +-
 .../core/indexstore/schema/SchemaGenerator.java|2 +-
 .../core/metadata/blocklet/BlockletInfo.java   |   10 +
 .../ThriftWrapperSchemaConverterImpl.java  |4 +
 .../datatype/{StringType.java => BinaryType.java}  |   14 +-
 .../core/metadata/datatype/DataType.java   |2 +-
 .../core/metadata/datatype/DataTypes.java  |4 +
 .../metadata/datatype/DecimalConverterFactory.java |   55 +-
 .../core/metadata/schema/table/CarbonTable.java|   44 +-
 .../core/metadata/schema/table/TableInfo.java  |   23 +
 .../metadata/schema/table/TableSchemaBuilder.java  |   11 +
 .../carbondata/core/mutate/CarbonUpdateUtil.java   |   48 +-
 .../scan/executor/impl/AbstractQueryExecutor.java  |   83 +-
 .../executor/impl/QueryExecutorProperties.java |5 -
 .../core/scan/executor/util/QueryUtil.java |2 +-
 .../core/scan/executor/util/RestructureUtil.java   |   77 +-
 .../core/scan/expression/Expression.java   |   13 +
 .../scan/filter/FilterExpressionProcessor.java |5 +-
 .../carbondata/core/scan/filter/FilterUtil.java|   60 +-
 .../filter/executer/IncludeFilterExecuterImpl.java |9 +-
 .../RowLevelRangeGrtThanFiterExecut

[carbondata] 15/22: [CARBONDATA-3369] Fix issues during concurrent execution of Create table If not exists

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit f46ad430e1f97ee1dc659a2676048671374cf394
Author: KanakaKumar 
AuthorDate: Fri May 3 22:05:16 2019 +0530

[CARBONDATA-3369] Fix issues during concurrent execution of Create table If 
not exists

Create table if not exists has the following problems when run concurrently from
different drivers:
1. Sometimes it fails with the error "Table already exists".
2. When create table fails, the driver still holds the table with a wrong path or
schema, so subsequent operations refer to the wrong path.
3. The stale path created during create table is never deleted [after version 1.5.0 the
table is created in a new folder using a UUID if a folder with the table name already
exists].
This PR fixes the above 3 issues.

This closes #3198
---
 .../createTable/TestCreateTableIfNotExists.scala   | 36 ++
 .../command/table/CarbonCreateTableCommand.scala   | 33 +++-
 2 files changed, 68 insertions(+), 1 deletion(-)

diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableIfNotExists.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableIfNotExists.scala
index 8f7afe4..dc54127 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableIfNotExists.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableIfNotExists.scala
@@ -17,6 +17,8 @@
 
 package org.apache.carbondata.spark.testsuite.createTable
 
+import java.util.concurrent.{Callable, ExecutorService, Executors, Future, 
TimeUnit}
+
 import org.apache.spark.sql.test.util.QueryTest
 import org.scalatest.BeforeAndAfterAll
 
@@ -51,11 +53,45 @@ class TestCreateTableIfNotExists extends QueryTest with 
BeforeAndAfterAll {
 assert(exception.getMessage.contains("Operation not allowed, when source 
table is carbon table"))
   }
 
+  test("test create table if not exist concurrently") {
+
+val executorService: ExecutorService = Executors.newFixedThreadPool(10)
+var futures: List[Future[_]] = List()
+for (i <- 0 until (3)) {
+  futures = futures :+ runAsync()
+}
+
+executorService.shutdown();
+executorService.awaitTermination(30L, TimeUnit.SECONDS)
+
+futures.foreach { future =>
+  assertResult("PASS")(future.get.toString)
+}
+
+def runAsync(): Future[String] = {
+  executorService.submit(new Callable[String] {
+override def call() = {
+  // Create table
+  var result = "PASS"
+  try {
+sql("create table IF NOT EXISTS TestIfExists(name string) stored 
by 'carbondata'")
+  } catch {
+case exception: Exception =>
+  result = exception.getMessage
+  }
+  result
+}
+  })
+}
+  }
+
+
   override def afterAll {
 sql("use default")
 sql("drop table if exists test")
 sql("drop table if exists sourceTable")
 sql("drop table if exists targetTable")
+sql("drop table if exists TestIfExists")
   }
 
 }
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala
index 1e17ffe..debb283 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala
@@ -21,6 +21,7 @@ import scala.collection.JavaConverters._
 
 import org.apache.spark.sql.{CarbonEnv, Row, SparkSession, _}
 import org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
+import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.execution.SQLExecution.EXECUTION_ID_KEY
 import org.apache.spark.sql.execution.command.MetadataCommand
 
@@ -166,7 +167,37 @@ case class CarbonCreateTableCommand(
  """.stripMargin)
   }
 } catch {
-  case e: AnalysisException => throw e
+  case e: AnalysisException =>
+// AnalysisException thrown with table already exists msg incase 
of conurrent drivers
+if (e.getMessage().contains("already exists")) {
+
+  // Clear the cache first
+  CarbonEnv.getInstance(sparkSession).carbonMetaStore
+.removeTableFromMetadata(dbName, tableName)
+
+  // Delete the folders created by this call if the actual path is 
different
+  val actualPath = CarbonEnv
+.getCarbonTable(TableIdentifier(tableName, 
Option(dbName)))(sparkSession)
+.getT

[carbondata] 11/22: [CARBONDATA-3359]Fix data mismatch issue for decimal column after delete operation

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit f7cdb47e7535c7543147bf96f42cfbc14b36b082
Author: akashrn5 
AuthorDate: Thu Apr 25 15:16:35 2019 +0530

[CARBONDATA-3359]Fix data mismatch issue for decimal column after delete 
operation

Problem:
After a delete operation is performed, the decimal column data is wrong. This is
because, while filling the vector for a decimal column, we were not considering the
deleted rows (if any) and were filling all of the row data for the decimal column.

Solution:
In the decimal case, get the vector from ColumnarVectorWrapperDirectFactory and then put
the data, which takes care of the deleted rows.

This closes #3189
---
 .../metadata/datatype/DecimalConverterFactory.java | 55 +-
 .../src/test/resources/decimalData.csv |  4 ++
 .../testsuite/iud/DeleteCarbonTableTestCase.scala  | 17 +++
 3 files changed, 54 insertions(+), 22 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/metadata/datatype/DecimalConverterFactory.java
 
b/core/src/main/java/org/apache/carbondata/core/metadata/datatype/DecimalConverterFactory.java
index 9793c38..2e155f4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/metadata/datatype/DecimalConverterFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/metadata/datatype/DecimalConverterFactory.java
@@ -23,6 +23,7 @@ import java.util.BitSet;
 
 import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
 import org.apache.carbondata.core.scan.result.vector.ColumnVectorInfo;
+import 
org.apache.carbondata.core.scan.result.vector.impl.directread.ColumnarVectorWrapperDirectFactory;
 import org.apache.carbondata.core.util.ByteUtil;
 import org.apache.carbondata.core.util.DataTypeUtil;
 
@@ -102,13 +103,13 @@ public final class DecimalConverterFactory {
   return BigDecimal.valueOf((Long) valueToBeConverted, scale);
 }
 
-@Override public void fillVector(Object valuesToBeConverted, int size, 
ColumnVectorInfo info,
-BitSet nullBitset, DataType pageType) {
+@Override public void fillVector(Object valuesToBeConverted, int size,
+ColumnVectorInfo vectorInfo, BitSet nullBitSet, DataType pageType) {
   // TODO we need to find way to directly set to vector with out 
conversion. This way is very
   // inefficient.
-  CarbonColumnVector vector = info.vector;
-  int precision = info.measure.getMeasure().getPrecision();
-  int newMeasureScale = info.measure.getMeasure().getScale();
+  CarbonColumnVector vector = getCarbonColumnVector(vectorInfo, 
nullBitSet);
+  int precision = vectorInfo.measure.getMeasure().getPrecision();
+  int newMeasureScale = vectorInfo.measure.getMeasure().getScale();
   if (!(valuesToBeConverted instanceof byte[])) {
 throw new UnsupportedOperationException("This object type " + 
valuesToBeConverted.getClass()
 + " is not supported in this method");
@@ -116,7 +117,7 @@ public final class DecimalConverterFactory {
   byte[] data = (byte[]) valuesToBeConverted;
   if (pageType == DataTypes.BYTE) {
 for (int i = 0; i < size; i++) {
-  if (nullBitset.get(i)) {
+  if (nullBitSet.get(i)) {
 vector.putNull(i);
   } else {
 BigDecimal value = BigDecimal.valueOf(data[i], scale);
@@ -128,7 +129,7 @@ public final class DecimalConverterFactory {
 }
   } else if (pageType == DataTypes.SHORT) {
 for (int i = 0; i < size; i++) {
-  if (nullBitset.get(i)) {
+  if (nullBitSet.get(i)) {
 vector.putNull(i);
   } else {
 BigDecimal value = BigDecimal
@@ -142,7 +143,7 @@ public final class DecimalConverterFactory {
 }
   } else if (pageType == DataTypes.SHORT_INT) {
 for (int i = 0; i < size; i++) {
-  if (nullBitset.get(i)) {
+  if (nullBitSet.get(i)) {
 vector.putNull(i);
   } else {
 BigDecimal value = BigDecimal
@@ -156,7 +157,7 @@ public final class DecimalConverterFactory {
 }
   } else if (pageType == DataTypes.INT) {
 for (int i = 0; i < size; i++) {
-  if (nullBitset.get(i)) {
+  if (nullBitSet.get(i)) {
 vector.putNull(i);
   } else {
 BigDecimal value = BigDecimal
@@ -170,7 +171,7 @@ public final class DecimalConverterFactory {
 }
   } else if (pageType == DataTypes.LONG) {
 for (int i = 0; i < size; i++) {
-  if (nullBitset.get(i)) {
+  if (nullBitSet.get(i)) {
 vector.putNull(i);
   } else {
 BigDecimal value = BigDecimal
@@ -261,18 +262,18 @@ public final class DecimalConverterFactory {
   return new BigDecimal(bigInteger, scale);
 }
 
-@Override public void fill

[carbondata] 02/22: [CARBONDATA-3001] configurable page size in MB

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit eeb7e3a9b52f07e8298091252638175328a7aa9c
Author: ajantha-bhat 
AuthorDate: Mon Oct 15 18:49:33 2018 +0530

[CARBONDATA-3001] configurable page size in MB

Changes proposed in this PR:

Supported a table property table_page_size_inmb (1 MB to 1755 MB), making the page size
configurable per table in the following scenarios:

TBLPROPERTIES when creating a table
API in the SDK writer
Options when creating a table using the Spark file format
Options in DataFrameWriter

If this table property is not configured, a default value (1 MB) will be taken
[currently there is no default value; it will be set in the next version]. Based on this
property value, the page will be cut if it crosses the value before 32000 rows. This
helps pages fit into the cache.

This closes #2814
---
 .../core/constants/CarbonCommonConstants.java  |  14 ++
 .../core/datastore/blocklet/EncodedBlocklet.java   |  19 +++
 .../blockletindex/BlockletDataRefNode.java |   6 +-
 .../core/metadata/blocklet/BlockletInfo.java   |  10 ++
 .../metadata/schema/table/TableSchemaBuilder.java  |  10 ++
 .../carbondata/core/util/CarbonMetadataUtil.java   |  15 +-
 .../core/util/DataFileFooterConverterV3.java   |   8 +
 docs/carbon-as-spark-datasource-guide.md   |   1 +
 docs/ddl-of-carbondata.md  |  13 ++
 docs/sdk-guide.md  |   1 +
 format/src/main/thrift/carbondata.thrift   |   1 +
 .../TestCreateTableWithPageSizeInMb.scala  |  67 
 .../TestNonTransactionalCarbonTable.scala  |  49 ++
 .../org/apache/carbondata/spark/CarbonOption.scala |   2 +
 .../apache/carbondata/spark/util/CommonUtil.scala  |  32 
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala|   3 +-
 .../datasources/CarbonSparkDataSourceUtil.scala|   4 +
 .../apache/spark/sql/CarbonDataFrameWriter.scala   |   1 +
 .../table/CarbonDescribeFormattedCommand.scala |   9 +-
 .../sql/CarbonGetTableDetailComandTestCase.scala   |   0
 .../processing/datatypes/ArrayDataType.java|  15 ++
 .../processing/datatypes/GenericDataType.java  |   5 +
 .../processing/datatypes/PrimitiveDataType.java|   6 +
 .../processing/datatypes/StructDataType.java   |  14 ++
 .../store/CarbonFactDataHandlerColumnar.java   | 190 -
 .../store/CarbonFactDataHandlerModel.java  | 106 ++--
 .../carbondata/processing/store/TablePage.java |  36 +---
 .../writer/v3/CarbonFactDataWriterImplV3.java  |   7 +
 .../carbondata/sdk/file/CarbonWriterBuilder.java   |  26 ++-
 29 files changed, 540 insertions(+), 130 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 69374ad..e02241e 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1999,6 +1999,20 @@ public final class CarbonCommonConstants {
*/
   public static final int CARBON_ALLOW_DIRECT_FILL_DICT_COLS_LIMIT = 100;
 
+  /**
+   * page size in mb. If page size exceeds this value before 32000 rows count, 
page will be cut.
+   * And remaining rows will written in next page.
+   */
+  public static final String TABLE_PAGE_SIZE_INMB = "table_page_size_inmb";
+
+  public static final int TABLE_PAGE_SIZE_MIN_INMB = 1;
+
+  // default 1 MB
+  public static final int TABLE_PAGE_SIZE_INMB_DEFAULT = 1;
+
+  // As due to SnappyCompressor.MAX_BYTE_TO_COMPRESS is 1.75 GB
+  public static final int TABLE_PAGE_SIZE_MAX_INMB = 1755;
+
   
//
   // Unused constants and parameters start here
   
//
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/blocklet/EncodedBlocklet.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/blocklet/EncodedBlocklet.java
index d017145..8a19522 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/blocklet/EncodedBlocklet.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/blocklet/EncodedBlocklet.java
@@ -63,6 +63,11 @@ public class EncodedBlocklet {
   private int numberOfPages;
 
   /**
+   * row count in each page
+   */
+  private List rowCountInPage;
+
+  /**
* is decoder based fallback is enabled or not
*/
   private boolean isDecoderBasedFallBackEnabled;
@@ -77,6 +82,7 @@ public class EncodedBlocklet {
 this.executorService = executorService;
 this.isDecoderBasedFallBackEnabled = isDecoderBasedFallBackEnabled;
 this.localDictionaryG
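
For the Spark datasource path listed above, a hypothetical sketch (the option key is
taken from the carbon-as-spark-datasource guide earlier in this digest; the datasource
short name `carbon` and the test-style `sql(...)` helper are assumptions):

```scala
// The same property supplied through Spark datasource DDL options instead of TBLPROPERTIES.
sql(
  """
    | CREATE TABLE page_size_ds_demo (id INT, payload STRING)
    | USING carbon
    | OPTIONS ('table_page_size_inmb'='1')
  """.stripMargin)
```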

[carbondata] 12/22: [CARBONDATA-3345] A growing streaming ROW_V1 carbondata file would have some InputSplits ignored

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit f80a28dd9fee8c5d355b30d4e422b854a981b796
Author: junyan-zg <275620...@qq.com>
AuthorDate: Wed Apr 24 22:46:51 2019 +0800

[CARBONDATA-3345] A growing streaming ROW_V1 carbondata file would have some InputSplits
ignored

After looking at carbondata segments: when the file grows to more than 150 MB (possibly
128 MB), Presto initiates a query by splitting it into several small files, including
those in ROW_V1 format. This bug causes some small files in ROW_V1 format to be ignored,
resulting in inaccurate query results. So for the carbondata ROW_V1 InputSplit map key
(Java), I concatenate 'carbonInput.getStart()' into the key so that the required
InputSplits are kept.

This closes #3186
---
 .../org/apache/carbondata/presto/impl/CarbonTableReader.java | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git 
a/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
 
b/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
index 57d8d5e..7ffe053 100755
--- 
a/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
+++ 
b/integration/presto/src/main/java/org/apache/carbondata/presto/impl/CarbonTableReader.java
@@ -46,6 +46,7 @@ import 
org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.metadata.schema.table.TableInfo;
 import org.apache.carbondata.core.reader.ThriftReader;
 import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.statusmanager.FileFormat;
 import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
 import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
 import org.apache.carbondata.core.util.CarbonProperties;
@@ -291,7 +292,13 @@ public class CarbonTableReader {
 // Use block distribution
 List> inputSplits = new ArrayList(
 result.stream().map(x -> (CarbonLocalInputSplit) 
x).collect(Collectors.groupingBy(
-carbonInput -> 
carbonInput.getSegmentId().concat(carbonInput.getPath(.values());
+carbonInput -> {
+  if (FileFormat.ROW_V1.equals(carbonInput.getFileFormat())) {
+return 
carbonInput.getSegmentId().concat(carbonInput.getPath())
+  .concat(carbonInput.getStart() + "");
+  }
+  return 
carbonInput.getSegmentId().concat(carbonInput.getPath());
+})).values());
 if (inputSplits != null) {
   for (int j = 0; j < inputSplits.size(); j++) {
 multiBlockSplitList.add(new 
CarbonLocalMultiBlockSplit(inputSplits.get(j),
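
A condensed, hypothetical model of the grouping-key change above (types are simplified;
the real code groups CarbonLocalInputSplit objects inside CarbonTableReader):

```scala
// ROW_V1 (streaming) splits from the same file differ only by their start offset, so the
// offset must be part of the grouping key or later splits are silently dropped.
case class SplitInfo(segmentId: String, path: String, start: Long, isRowV1: Boolean)

def groupingKey(s: SplitInfo): String =
  if (s.isRowV1) s.segmentId + s.path + s.start else s.segmentId + s.path

val splits = Seq(
  SplitInfo("0", "/streaming/part-0", 0L, isRowV1 = true),
  SplitInfo("0", "/streaming/part-0", 128L, isRowV1 = true))

// Two distinct groups, so neither ROW_V1 split is ignored.
val grouped = splits.groupBy(groupingKey)
```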



[carbondata] 05/22: [CARBONDATA-3353 ]Fixed MinMax Based Pruning for Measure column in case of Legacy store

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 743b843676e8736beaf55286df3369772224bb90
Author: Indhumathi27 
AuthorDate: Fri Apr 12 12:25:38 2019 +0530

[CARBONDATA-3353 ]Fixed MinMax Based Pruning for Measure column in case of 
Legacy store

Why is this PR needed?

Problem:
For a table created and loaded with a legacy store and having a measure column, min is
written as max and vice versa while building the page min/max, so the blocklet-level
min/max is wrong. With the current version, when we query with a filter on the measure
column, measure filter pruning skips some blocks and gives wrong results.

Solution:
Skip min/max based pruning for measure columns in case of a legacy store.

This closes #3176
---
 .../indexstore/blockletindex/BlockDataMap.java | 15 +--
 .../scan/executor/impl/AbstractQueryExecutor.java  | 15 +++
 .../carbondata/core/scan/filter/FilterUtil.java|  8 
 .../filter/executer/IncludeFilterExecuterImpl.java | 11 +++--
 .../RowLevelRangeGrtThanFiterExecuterImpl.java | 10 +++--
 ...LevelRangeGrtrThanEquaToFilterExecuterImpl.java | 10 +++--
 ...wLevelRangeLessThanEqualFilterExecuterImpl.java | 10 +++--
 .../RowLevelRangeLessThanFilterExecuterImpl.java   | 10 +++--
 .../apache/carbondata/core/util/CarbonUtil.java| 47 --
 .../carbondata/core/util/CarbonUtilTest.java   | 46 -
 10 files changed, 51 insertions(+), 131 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
index 5b2132c..1fc5831 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
@@ -67,7 +67,6 @@ import 
org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
 import org.apache.carbondata.core.scan.model.QueryModel;
 import org.apache.carbondata.core.util.BlockletDataMapUtil;
 import org.apache.carbondata.core.util.ByteUtil;
-import org.apache.carbondata.core.util.CarbonUtil;
 import org.apache.carbondata.core.util.DataFileFooterConverter;
 import org.apache.carbondata.core.util.path.CarbonTablePath;
 
@@ -219,7 +218,7 @@ public class BlockDataMap extends CoarseGrainDataMap
 DataMapRowImpl summaryRow = null;
 CarbonRowSchema[] schema = getFileFooterEntrySchema();
 boolean[] minMaxFlag = new 
boolean[segmentProperties.getColumnsValueSize().length];
-Arrays.fill(minMaxFlag, true);
+FilterUtil.setMinMaxFlagForLegacyStore(minMaxFlag, segmentProperties);
 long totalRowCount = 0;
 for (DataFileFooter fileFooter : indexInfo) {
   TableBlockInfo blockInfo = fileFooter.getBlockInfo().getTableBlockInfo();
@@ -232,19 +231,9 @@ public class BlockDataMap extends CoarseGrainDataMap
   if (null != blockMetaInfo) {
 BlockletIndex blockletIndex = fileFooter.getBlockletIndex();
 BlockletMinMaxIndex minMaxIndex = blockletIndex.getMinMaxIndex();
-byte[][] minValues =
-BlockletDataMapUtil.updateMinValues(segmentProperties, 
minMaxIndex.getMinValues());
-byte[][] maxValues =
-BlockletDataMapUtil.updateMaxValues(segmentProperties, 
minMaxIndex.getMaxValues());
-// update min max values in case of old store for measures as measure 
min/max in
-// old stores in written opposite
-byte[][] updatedMinValues =
-CarbonUtil.updateMinMaxValues(fileFooter, maxValues, minValues, 
true);
-byte[][] updatedMaxValues =
-CarbonUtil.updateMinMaxValues(fileFooter, maxValues, minValues, 
false);
 summaryRow = loadToUnsafeBlock(schema, taskSummarySchema, fileFooter, 
segmentProperties,
 getMinMaxCacheColumns(), blockInfo.getFilePath(), summaryRow,
-blockMetaInfo, updatedMinValues, updatedMaxValues, minMaxFlag);
+blockMetaInfo, minMaxIndex.getMinValues(), 
minMaxIndex.getMaxValues(), minMaxFlag);
 totalRowCount += fileFooter.getNumberOfRows();
   }
 }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
index f81a3dc..b15bdb5 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
@@ -238,6 +238,12 @@ public abstract class AbstractQueryExecutor implements 
QueryExecutor {
   LOGGER.warn("Skipping Direct Vector Filling as it is not Supported "
   + "for Legacy store prior to V3 store");
   queryModel.

[carbondata] 08/22: [CARBONDATA-3348] Support alter SORT_COLUMNS property

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit d1b455f09590b48a4ba3709fa29635a18da1d790
Author: QiangCai 
AuthorDate: Tue Apr 16 20:27:31 2019 +0800

[CARBONDATA-3348] Support alter SORT_COLUMNS property

Modification

Support alter SORT_COLUMNS:
alter table <table_name> set tblproperties('sort_scope'='<sort_scope>', 'sort_columns'='c1[,...cn]')

Limitation

When a measure becomes a dimension and the query contains this column, the task
distribution of this query will only support block and blocklet, not merge_small_files
or custom.

This closes #3178
---
 .../core/constants/CarbonCommonConstants.java  |   5 +
 .../carbondata/core/datamap/DataMapFilter.java |  89 
 .../carbondata/core/datamap/TableDataMap.java  |  91 ++--
 .../datamap/dev/expr/DataMapExprWrapperImpl.java   |   3 +-
 .../core/metadata/schema/table/CarbonTable.java|  20 +
 .../core/metadata/schema/table/TableInfo.java  |  23 +
 .../scan/executor/impl/AbstractQueryExecutor.java  |  62 +--
 .../executor/impl/QueryExecutorProperties.java |   5 -
 .../core/scan/executor/util/RestructureUtil.java   |  75 ++-
 .../core/scan/model/QueryModelBuilder.java |   2 +-
 .../scan/executor/util/RestructureUtilTest.java|  11 +-
 .../carbondata/hadoop/api/CarbonInputFormat.java   |  29 +-
 .../test/resources/sort_columns/alldatatype1.csv   |  13 +
 .../test/resources/sort_columns/alldatatype2.csv   |  13 +
 .../TestAlterTableSortColumnsProperty.scala| 541 +
 .../carbondata/spark/rdd/CarbonScanRDD.scala   |  10 +-
 .../apache/carbondata/spark/util/CommonUtil.scala  |  80 ++-
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala|  31 +-
 .../org/apache/spark/util/AlterTableUtil.scala | 126 -
 19 files changed, 1039 insertions(+), 190 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index c9efc34..608b5fb 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -478,6 +478,11 @@ public final class CarbonCommonConstants {
*/
   public static final String CACHE_LEVEL_DEFAULT_VALUE = "BLOCK";
 
+  /**
+   * column level property: the measure is changed to the dimension
+   */
+  public static final String COLUMN_DRIFT = "column_drift";
+
   
//
   // Data loading parameter start here
   
//
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java
new file mode 100644
index 000..c20d0d5
--- /dev/null
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java
@@ -0,0 +1,89 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datamap;
+
+import java.io.Serializable;
+
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.executor.util.RestructureUtil;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+
+/**
+ * the filter of DataMap
+ */
+public class DataMapFilter implements Serializable {
+
+  private CarbonTable table;
+
+  private Expression expression;
+
+  private FilterResolverIntf resolver;
+
+  public DataMapFilter(CarbonTable table, Expression expression) {
+this.table = table;
+this.expression = expression;
+resolve();
+  }
+
+  public DataMapFilter(FilterResolverIntf resolver) {
+this.resolver = resolver;
+  }
+
+  private void resolve() {
+if (expression != null) {
+  table.processFilterExpression(expression,

[carbondata] 09/22: [CARBONDATA-3353 ][HOTFIX]Fixed MinMax Based Pruning for Measure column in case of Legacy store

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 9a9c79187a647f6f0acd16f01dd22bd8d6d2f368
Author: Indhumathi27 
AuthorDate: Wed Apr 24 20:52:03 2019 +0530

[CARBONDATA-3353 ][HOTFIX]Fixed MinMax Based Pruning for Measure column in 
case of Legacy store

This closes #3187
---
 .../carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
index 33a337b..64dc3a1 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
@@ -524,8 +524,8 @@ public class IncludeFilterExecuterImpl implements 
FilterExecuter {
   isMinMaxSet[chunkIndex]);
   }
 } else if (isMeasurePresentInCurrentBlock) {
+  chunkIndex = msrColumnEvaluatorInfo.getColumnIndexInMinMaxByteArray();
   if (isMinMaxSet[chunkIndex]) {
-chunkIndex = msrColumnEvaluatorInfo.getColumnIndexInMinMaxByteArray();
 isScanRequired = isScanRequired(blkMaxVal[chunkIndex], 
blkMinVal[chunkIndex],
 msrColumnExecutorInfo.getFilterKeys(), 
msrColumnEvaluatorInfo.getType());
   } else {



[carbondata] 10/22: [HOTFIX] support compact segments with different sort_columns

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 9f23d2c1aeabbae0d9b53899e2f91d3ccdb9
Author: QiangCai 
AuthorDate: Thu Apr 25 19:08:49 2019 +0800

[HOTFIX] support compact segments with different sort_columns

This closes #3190
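
A hedged Scala sketch of the check this hotfix introduces, under the
assumption (the comparison loop in the diff below is truncated) that sort
columns are matched position by position: a segment still counts as sorted
only if the table's current sort columns form a leading prefix of the sort
columns the segment was written with. Names are illustrative, not the
CarbonCompactionUtil API.

```scala
object SortedByCurrentSortColumnsSketch {
  // Assumption: columns are compared by name, position by position.
  def isSortedByCurrentSortColumns(segmentSortColumns: Seq[String],
      tableSortColumns: Seq[String]): Boolean =
    segmentSortColumns.size >= tableSortColumns.size &&
      tableSortColumns.zip(segmentSortColumns).forall { case (t, s) => t == s }

  def main(args: Array[String]): Unit = {
    // A segment written with sort_columns=c1,c2 still satisfies a table whose
    // current sort_columns is "c1", but not one whose sort_columns is "c2",
    // so the latter segment would be re-sorted during compaction.
    println(isSortedByCurrentSortColumns(Seq("c1", "c2"), Seq("c1"))) // true
    println(isSortedByCurrentSortColumns(Seq("c1", "c2"), Seq("c2"))) // false
  }
}
```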
---
 .../core/scan/executor/util/RestructureUtil.java   |  2 +-
 .../merger/CarbonCompactionExecutor.java   |  3 +-
 .../processing/merger/CarbonCompactionUtil.java| 39 --
 3 files changed, 38 insertions(+), 6 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
index 11b7372..0f93227 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/util/RestructureUtil.java
@@ -160,7 +160,7 @@ public class RestructureUtil {
* @param tableColumn
* @return
*/
-  private static boolean isColumnMatches(boolean isTransactionalTable,
+  public static boolean isColumnMatches(boolean isTransactionalTable,
   CarbonColumn queryColumn, CarbonColumn tableColumn) {
 // If it is non transactional table just check the column names, no need 
to validate
 // column id as multiple sdk's output placed in a single folder doesn't 
have same
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
 
b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
index 5961cd7..619b45a 100644
--- 
a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionExecutor.java
@@ -136,8 +136,7 @@ public class CarbonCompactionExecutor {
   Set taskBlockListMapping = taskBlockInfo.getTaskSet();
   // Check if block needs sorting or not
   boolean sortingRequired =
-  CarbonCompactionUtil.isRestructured(listMetadata, 
carbonTable.getTableLastUpdatedTime())
-  || !CarbonCompactionUtil.isSorted(listMetadata.get(0));
+  !CarbonCompactionUtil.isSortedByCurrentSortColumns(carbonTable, 
listMetadata.get(0));
   for (String task : taskBlockListMapping) {
 tableBlockInfos = taskBlockInfo.getTableBlockInfoList(task);
 // during update there may be a chance that the cardinality may change 
within the segment
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java
 
b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java
index efd2559..c4b6843 100644
--- 
a/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/merger/CarbonCompactionUtil.java
@@ -35,6 +35,7 @@ import 
org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
 import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
 import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema;
+import org.apache.carbondata.core.scan.executor.util.RestructureUtil;
 import org.apache.carbondata.core.util.CarbonUtil;
 import org.apache.carbondata.core.util.path.CarbonTablePath;
 
@@ -464,12 +465,44 @@ public class CarbonCompactionUtil {
* Returns if the DataFileFooter containing carbondata file contains
* sorted data or not.
*
+   * @param table
* @param footer
* @return
-   * @throws IOException
*/
-  public static boolean isSorted(DataFileFooter footer) throws IOException {
-return footer.isSorted();
+  public static boolean isSortedByCurrentSortColumns(CarbonTable table, 
DataFileFooter footer) {
+if (footer.isSorted()) {
+  // When sort_columns is modified, it will be consider as no_sort also.
+  List sortColumnsOfSegment = new ArrayList<>();
+  for (ColumnSchema column : footer.getColumnInTable()) {
+if (column.isDimensionColumn() && column.isSortColumn()) {
+  sortColumnsOfSegment.add(new CarbonDimension(column, -1, -1, -1));
+}
+  }
+  if (sortColumnsOfSegment.size() < table.getNumberOfSortColumns()) {
+return false;
+  }
+  List sortColumnsOfTable = new ArrayList<>();
+  for (CarbonDimension dimension : table.getDimensions()) {
+if (dimension.isSortColumn()) {
+  sortColumnsOfTable.add(dimension);
+}
+  }
+  int sortColumnNums = sortColumnsOfTable.size();
+  if (sortColumnsOfSegment.size() < sortColumnNums) {
+return false;
+  }
+  // compare sort_columns
+  for (int i = 0; i < sortColumnNums; i

[carbondata] 06/22: [CARBONDATA-3344] Fix Drop column not present in table

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 4f7f17d53e768554ec519a70394546d01e585c8b
Author: Indhumathi27 
AuthorDate: Sat Apr 6 16:53:56 2019 +0530

[CARBONDATA-3344] Fix Drop column not present in table

Why is this PR needed?
When trying to drop a column which is not present in the main table, it
throws a NullPointerException instead of an exception saying the column does
not exist in the table.

This closes #3174
---
 .../cluster/sdv/generated/AlterTableTestCase.scala | 12 
 .../command/schema/CarbonAlterTableDropColumnCommand.scala | 14 +-
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git 
a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/AlterTableTestCase.scala
 
b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/AlterTableTestCase.scala
index d15f70b..297ff04 100644
--- 
a/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/AlterTableTestCase.scala
+++ 
b/integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/AlterTableTestCase.scala
@@ -29,6 +29,8 @@ import 
org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.util.CarbonProperties
 import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
 
+import org.apache.carbondata.spark.exception.ProcessMetaDataException
+
 /**
  * Test Class for AlterTableTestCase to verify all scenerios
  */
@@ -1024,6 +1026,16 @@ class AlterTableTestCase extends QueryTest with 
BeforeAndAfterAll {
 }
   }
 
+  test("Test drop columns not present in the table") {
+sql("drop table if exists test1")
+sql("create table test1(col1 int) stored by 'carbondata'")
+val exception = intercept[ProcessMetaDataException] {
+  sql("alter table test1 drop columns(name)")
+}
+assert(exception.getMessage.contains("Column name does not exists in the 
table default.test1"))
+sql("drop table if exists test1")
+  }
+
   val prop = CarbonProperties.getInstance()
   val p1 = prop.getProperty("carbon.horizontal.compaction.enable", 
CarbonCommonConstants.CARBON_HORIZONTAL_COMPACTION_ENABLE_DEFAULT)
   val p2 = prop.getProperty("carbon.horizontal.update.compaction.threshold", 
CarbonCommonConstants.DEFAULT_UPDATE_DELTAFILE_COUNT_THRESHOLD_IUD_COMPACTION)
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDropColumnCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDropColumnCommand.scala
index 7d5cb41..31cfdaf 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDropColumnCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDropColumnCommand.scala
@@ -76,15 +76,6 @@ private[sql] case class CarbonAlterTableDropColumnCommand(
 }
   }
 
-  // Check if column to be dropped is of complex dataType
-  alterTableDropColumnModel.columns.foreach { column =>
-if (carbonTable.getColumnByName(alterTableDropColumnModel.tableName, 
column).getDataType
-  .isComplexType) {
-  val errMsg = "Complex column cannot be dropped"
-  throw new MalformedCarbonCommandException(errMsg)
-}
-  }
-
   val tableColumns = carbonTable.getCreateOrderColumn(tableName).asScala
   var dictionaryColumns = 
Seq[org.apache.carbondata.core.metadata.schema.table.column
   .ColumnSchema]()
@@ -99,6 +90,11 @@ private[sql] case class CarbonAlterTableDropColumnCommand(
 dictionaryColumns ++= Seq(tableColumn.getColumnSchema)
   }
 }
+// Check if column to be dropped is of complex dataType
+if (tableColumn.getDataType.isComplexType) {
+  val errMsg = "Complex column cannot be dropped"
+  throw new MalformedCarbonCommandException(errMsg)
+}
 columnExist = true
   }
 }



[carbondata] 20/22: [CARBONDATA-3374] Optimize documentation and fix some spell errors.

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit b42f1acfb056e9a1cfc5e19b9d652e4af4848aa6
Author: xubo245 
AuthorDate: Tue May 7 20:47:13 2019 +0800

[CARBONDATA-3374] Optimize documentation and fix some spell errors.

Optimize documentation and fix some spell errors.

This closes #3207
---
 .../apache/carbondata/core/datamap/dev/DataMapFactory.java   |  4 ++--
 .../carbondata/core/indexstore/BlockletDetailsFetcher.java   |  4 ++--
 .../indexstore/blockletindex/BlockletDataMapFactory.java |  2 +-
 .../carbondata/core/indexstore/schema/SchemaGenerator.java   |  2 +-
 .../apache/carbondata/core/util/path/CarbonTablePath.java|  2 +-
 .../carbondata/datamap/lucene/LuceneDataMapFactoryBase.java  |  2 +-
 .../datamap/lucene/LuceneFineGrainDataMapFactory.java|  2 +-
 docs/carbon-as-spark-datasource-guide.md |  2 +-
 docs/ddl-of-carbondata.md| 12 +++-
 .../spark/testsuite/dataload/TestLoadDataGeneral.scala   |  4 ++--
 .../spark/testsuite/datamap/CGDataMapTestCase.scala  |  4 ++--
 .../spark/testsuite/datamap/DataMapWriterSuite.scala |  2 +-
 .../spark/testsuite/datamap/FGDataMapTestCase.scala  |  4 ++--
 .../apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala   |  2 +-
 .../org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala   |  6 +++---
 .../execution/datasources/SparkCarbonFileFormat.scala|  3 ++-
 .../scala/org/apache/spark/sql/CarbonCatalystOperators.scala |  4 ++--
 .../execution/command/management/CarbonLoadDataCommand.scala |  2 +-
 .../scala/org/apache/spark/sql/optimizer/CarbonFilters.scala |  2 +-
 19 files changed, 34 insertions(+), 31 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
index ee7914d..b32a482 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
@@ -88,7 +88,7 @@ public abstract class DataMapFactory {
   }
 
   /**
-   * Get the datamap for segmentid
+   * Get the datamap for segmentId
*/
   public abstract List getDataMaps(Segment segment) throws IOException;
 
@@ -99,7 +99,7 @@ public abstract class DataMapFactory {
   throws IOException;
 
   /**
-   * Get all distributable objects of a segmentid
+   * Get all distributable objects of a segmentId
* @return
*/
   public abstract List toDistributable(Segment segment);
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
index 1971f40..ae01e9e 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
@@ -27,7 +27,7 @@ import org.apache.carbondata.core.datamap.Segment;
 public interface BlockletDetailsFetcher {
 
   /**
-   * Get the blocklet detail information based on blockletid, blockid and 
segmentid.
+   * Get the blocklet detail information based on blockletid, blockid and 
segmentId.
*
* @param blocklets
* @param segment
@@ -38,7 +38,7 @@ public interface BlockletDetailsFetcher {
   throws IOException;
 
   /**
-   * Get the blocklet detail information based on blockletid, blockid and 
segmentid.
+   * Get the blocklet detail information based on blockletid, blockid and 
segmentId.
*
* @param blocklet
* @param segment
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
index 2ef7b88..93be06e 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
@@ -185,7 +185,7 @@ public class BlockletDataMapFactory extends 
CoarseGrainDataMapFactory
   }
 
   /**
-   * Get the blocklet detail information based on blockletid, blockid and 
segmentid. This method is
+   * Get the blocklet detail information based on blockletid, blockid and 
segmentId. This method is
* exclusively for BlockletDataMapFactory as detail information is only 
available in this
* default datamap.
*/
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.java
index 41c382b..288e062 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/schema/SchemaGenerator.j

[carbondata] 13/22: [CARBONDATA-3343] Compaction for Range Sort

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 7449c346499cc3d454317e5b223c22bb034358a6
Author: manishnalla1994 
AuthorDate: Mon Apr 22 18:52:45 2019 +0530

[CARBONDATA-3343] Compaction for Range Sort

Problem: Compaction for Range Sort was not done correctly, as earlier it
grouped the ranges/partitions based on taskId, which was not correct.

Solution: Combine all the data and create new ranges using Spark's
RangePartitioner, give each range to one task, and apply the corresponding
filter query to produce the compacted segment.

This closes #3182
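
A concept sketch in Scala of deriving new compaction ranges with Spark's
built-in RangePartitioner. The actual patch uses CarbonData's own partitioner
and split handling in CarbonMergerRDD, so the names, sample values, and
printing below are illustrative only.

```scala
import org.apache.spark.{RangePartitioner, SparkConf, SparkContext}

object RangeCompactionSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("range-compaction-sketch").setMaster("local[2]"))

    // Illustrative range-column values from the segments selected for compaction.
    val rangeColumnValues = Seq("ant", "bee", "cat", "dog", "eel", "fox", "gnu", "hen")
    val keyed = sc.parallelize(rangeColumnValues).map(v => (v, ()))

    // RangePartitioner samples the keys and derives the split points for the new ranges.
    val numRanges = 3
    val partitioner = new RangePartitioner(numRanges, keyed)

    // Each compaction task would then read only the rows whose range-column value
    // falls into its range (conceptually a min/max filter per task).
    rangeColumnValues.foreach(v => println(s"value=$v -> range ${partitioner.getPartition(v)}"))
    sc.stop()
  }
}
```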
---
 .../core/constants/CarbonCommonConstants.java  |   1 +
 .../core/metadata/schema/table/CarbonTable.java|  24 +-
 .../core/scan/expression/Expression.java   |  13 +
 .../scan/filter/FilterExpressionProcessor.java |   5 +-
 .../carbondata/core/scan/filter/FilterUtil.java|  52 +-
 .../resolver/ConditionalFilterResolverImpl.java|   2 +-
 .../resolver/RowLevelRangeFilterResolverImpl.java  |  40 +-
 .../core/scan/model/QueryModelBuilder.java |  18 +-
 .../core/scan/result/BlockletScannedResult.java|  62 +-
 .../scan/result/impl/FilterQueryScannedResult.java |  20 +-
 .../result/impl/NonFilterQueryScannedResult.java   |  59 +-
 .../dataload/TestRangeColumnDataLoad.scala | 669 -
 .../spark/load/DataLoadProcessBuilderOnSpark.scala |  43 +-
 .../carbondata/spark/rdd/CarbonMergerRDD.scala | 202 ++-
 .../carbondata/spark/rdd/CarbonScanRDD.scala   |   7 +-
 .../org/apache/spark/CarbonInputMetrics.scala  |   0
 .../apache/spark/DataSkewRangePartitioner.scala|  26 +-
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala|  12 +-
 .../spark/sql/CarbonDatasourceHadoopRelation.scala |   1 -
 .../merger/CarbonCompactionExecutor.java   |  20 +-
 .../processing/merger/CarbonCompactionUtil.java| 140 +
 .../merger/RowResultMergerProcessor.java   |   6 +-
 22 files changed, 1274 insertions(+), 148 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 608b5fb..ba8e20a 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1759,6 +1759,7 @@ public final class CarbonCommonConstants {
   public static final String ARRAY = "array";
   public static final String STRUCT = "struct";
   public static final String MAP = "map";
+  public static final String DECIMAL = "decimal";
   public static final String FROM = "from";
 
   /**
diff --git 
a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
 
b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
index 54ea772..c66d1fc 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
@@ -1081,22 +1081,26 @@ public class CarbonTable implements Serializable {
 return dataSize + indexSize;
   }
 
-  public void processFilterExpression(Expression filterExpression,
-  boolean[] isFilterDimensions, boolean[] isFilterMeasures) {
-QueryModel.FilterProcessVO processVO =
-new QueryModel.FilterProcessVO(getDimensionByTableName(getTableName()),
-getMeasureByTableName(getTableName()), 
getImplicitDimensionByTableName(getTableName()));
-QueryModel.processFilterExpression(processVO, filterExpression, 
isFilterDimensions,
-isFilterMeasures, this);
-
+  public void processFilterExpression(Expression filterExpression, boolean[] 
isFilterDimensions,
+  boolean[] isFilterMeasures) {
+processFilterExpressionWithoutRange(filterExpression, isFilterDimensions, 
isFilterMeasures);
 if (null != filterExpression) {
   // Optimize Filter Expression and fit RANGE filters is conditions apply.
-  FilterOptimizer rangeFilterOptimizer =
-  new RangeFilterOptmizer(filterExpression);
+  FilterOptimizer rangeFilterOptimizer = new 
RangeFilterOptmizer(filterExpression);
   rangeFilterOptimizer.optimizeFilter();
 }
   }
 
+  public void processFilterExpressionWithoutRange(Expression filterExpression,
+  boolean[] isFilterDimensions, boolean[] isFilterMeasures) {
+QueryModel.FilterProcessVO processVO =
+new QueryModel.FilterProcessVO(getDimensionByTableName(getTableName()),
+getMeasureByTableName(getTableName()), 
getImplicitDimensionByTableName(getTableName()));
+QueryModel
+.processFilterExpression(processVO, filterExpression, 
isFilterDimensions, isFilterMeasures,
+this);
+  }
+
   /**

[carbondata] 14/22: [CARBONDATA-3360]fix NullPointerException in delete and clean files operation

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit bc80a22ddb40ed09692008114bbe561ba67955f7
Author: akashrn5 
AuthorDate: Fri Apr 26 12:00:41 2019 +0530

[CARBONDATA-3360]fix NullPointerException in delete and clean files 
operation

Problem:
When a delete fails because the HDFS quota is exceeded or the disk is full,
a tableUpdateStatus.write file remains in the store. If a clean files
operation runs after that, null is assigned to a primitive long, which throws
a runtime exception, and the .write file is not deleted because it is treated
as an invalid file.

Solution:
If a .write file is present, clean files does not fail; instead the max query
timeout is checked for the tableUpdateStatus.write file, and such .write files
are deleted by any clean files operation after that.

This closes #3191
---
 .../carbondata/core/mutate/CarbonUpdateUtil.java   | 48 +++---
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java 
b/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
index a632f03..beaf1a0 100644
--- a/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/mutate/CarbonUpdateUtil.java
@@ -673,7 +673,8 @@ public class CarbonUpdateUtil {
   private static boolean compareTimestampsAndDelete(
   CarbonFile invalidFile,
   boolean forceDelete, boolean isUpdateStatusFile) {
-long fileTimestamp = 0L;
+boolean isDeleted = false;
+Long fileTimestamp;
 
 if (isUpdateStatusFile) {
   fileTimestamp = CarbonUpdateUtil.getTimeStampAsLong(invalidFile.getName()
@@ -683,21 +684,40 @@ public class CarbonUpdateUtil {
   
CarbonTablePath.DataFileUtil.getTimeStampFromFileName(invalidFile.getName()));
 }
 
-// if the timestamp of the file is more than the current time by query 
execution timeout.
-// then delete that file.
-if (CarbonUpdateUtil.isMaxQueryTimeoutExceeded(fileTimestamp) || 
forceDelete) {
-  // delete the files.
-  try {
-LOGGER.info("deleting the invalid file : " + invalidFile.getName());
-CarbonUtil.deleteFoldersAndFiles(invalidFile);
-return true;
-  } catch (IOException e) {
-LOGGER.error("error in clean up of compacted files." + e.getMessage(), 
e);
-  } catch (InterruptedException e) {
-LOGGER.error("error in clean up of compacted files." + e.getMessage(), 
e);
+// This check is because, when there are some invalid files like 
tableStatusUpdate.write files
+// present in store [[which can happen during delete or update if the disk 
is full or hdfs quota
+// is finished]] then fileTimestamp will be null, in that case check for 
max query out and
+// delete the .write file after timeout
+if (fileTimestamp == null) {
+  String tableUpdateStatusFilename = invalidFile.getName();
+  if (tableUpdateStatusFilename.endsWith(".write")) {
+long tableUpdateStatusFileTimeStamp = Long.parseLong(
+
CarbonTablePath.DataFileUtil.getTimeStampFromFileName(tableUpdateStatusFilename));
+if (isMaxQueryTimeoutExceeded(tableUpdateStatusFileTimeStamp)) {
+  isDeleted = deleteInvalidFiles(invalidFile);
+}
+  }
+} else {
+  // if the timestamp of the file is more than the current time by query 
execution timeout.
+  // then delete that file.
+  if (CarbonUpdateUtil.isMaxQueryTimeoutExceeded(fileTimestamp) || 
forceDelete) {
+isDeleted = deleteInvalidFiles(invalidFile);
   }
 }
-return false;
+return isDeleted;
+  }
+
+  private static boolean deleteInvalidFiles(CarbonFile invalidFile) {
+boolean isDeleted;
+try {
+  LOGGER.info("deleting the invalid file : " + invalidFile.getName());
+  CarbonUtil.deleteFoldersAndFiles(invalidFile);
+  isDeleted = true;
+} catch (IOException | InterruptedException e) {
+  LOGGER.error("error in clean up of invalid files." + e.getMessage(), e);
+  isDeleted = false;
+}
+return isDeleted;
   }
 
   public static boolean isBlockInvalid(SegmentStatus blockStatus) {



[carbondata] 18/22: [DOC] Update doc for sort_columns modification

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 4d21b6a8cb7fb3f5b70fd862a78e5f0f12bd
Author: QiangCai 
AuthorDate: Mon May 6 10:39:19 2019 +0800

[DOC] Update doc for sort_columns modification

Update doc for sort_columns modification

This closes #3203
---
 docs/ddl-of-carbondata.md | 21 +
 1 file changed, 21 insertions(+)

diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 88615a2..5bc8f10 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -793,6 +793,27 @@ Users can specify which columns to include and exclude for 
local dictionary gene
ALTER TABLE tablename UNSET TBLPROPERTIES('SORT_SCOPE')
```
 
+ - # SORT COLUMNS
+   Example to SET SORT COLUMNS:
+   ```
+   ALTER TABLE tablename SET TBLPROPERTIES('SORT_COLUMNS'='column1')
+   ```
+   After this operation, the new loading will use the new SORT_COLUMNS. 
The user can adjust 
+   the SORT_COLUMNS according to the query, but it will not impact the old 
data directly. So 
+   it will not impact the query performance of the old data segments which 
are not sorted by 
+   new SORT_COLUMNS.  
+   
+   UNSET is not supported, but it can set SORT_COLUMNS to empty string 
instead of using UNSET.
+   ```
+   ALTER TABLE tablename SET TBLPROPERTIES('SORT_COLUMNS'='')
+   ```
+
+   **NOTE:**
+* The future version will enhance "custom" compaction to sort the old 
segment one by one.
+* The streaming table is not supported for SORT_COLUMNS modification.
+* If the inverted index columns are removed from the new SORT_COLUMNS, 
they will not 
+create the inverted index. But the old configuration of INVERTED_INDEX 
will be kept.
+
 ### DROP TABLE
 
   This command is used to delete an existing table.



[carbondata] 22/22: [CARBONDATA-3391] Count star output is wrong when BLOCKLET CACHE is enabled

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit ea1e86cc71a49563c426264447a95085dce2d436
Author: BJangir 
AuthorDate: Thu May 16 14:53:21 2019 +0530

[CARBONDATA-3391] Count star output is wrong when BLOCKLET CACHE is enabled

Wrong Count(*) value when blocklet cache is enabled.
Root cause: blockletToRowCountMap is keyed by segmentNo + carbondata file
name, so when a carbondata file has multiple blocklets, blockletToRowCountMap
overwrites the existing row count.

Solution: add to the existing row count if one is already present.

This closes #3225
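
A small Scala sketch of the accumulation the fix performs, with hypothetical
file names and row counts: counts from several blocklets of the same
carbondata file must be summed under the shared "segmentNo,fileName" key
instead of overwriting each other.

```scala
import scala.collection.mutable

object RowCountAccumulationSketch {
  def main(args: Array[String]): Unit = {
    val blockletToRowCountMap = mutable.Map[String, Long]()
    // (segmentNo, fileName, rowCountOfOneBlocklet) — illustrative values only.
    val blocklets = Seq(
      ("0", "part-0-0.carbondata", 32000L),
      ("0", "part-0-0.carbondata", 8000L))
    for ((segmentNo, fileName, rowCount) <- blocklets) {
      val key = s"$segmentNo,$fileName"
      // Accumulate instead of overwrite.
      blockletToRowCountMap(key) = blockletToRowCountMap.getOrElse(key, 0L) + rowCount
    }
    // With the pre-fix overwrite the count would be 8000; with accumulation it is 40000.
    println(blockletToRowCountMap)
  }
}
```

In the Java code itself, Map.merge would express the same accumulation in a
single call.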
---
 .../carbondata/core/indexstore/blockletindex/BlockDataMap.java| 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
index 1fc5831..13e612d 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockDataMap.java
@@ -689,7 +689,13 @@ public class BlockDataMap extends CoarseGrainDataMap
   CarbonCommonConstants.DEFAULT_CHARSET_CLASS) + 
CarbonTablePath.getCarbonDataExtension();
   int rowCount = dataMapRow.getInt(ROW_COUNT_INDEX);
   // prepend segment number with the blocklet file path
-  blockletToRowCountMap.put((segment.getSegmentNo() + "," + fileName), 
(long) rowCount);
+  String blockletMapKey = segment.getSegmentNo() + "," + fileName;
+  Long existingCount = blockletToRowCountMap.get(blockletMapKey);
+  if (null != existingCount) {
+blockletToRowCountMap.put(blockletMapKey, (long) rowCount + 
existingCount);
+  } else {
+blockletToRowCountMap.put(blockletMapKey, (long) rowCount);
+  }
 }
 return blockletToRowCountMap;
   }



[carbondata] 21/22: [CARBONDATA-3377] Fix for Null pointer exception in Range Col compaction

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 4abed04bfefe7a24f18ab42fd96d63a617a26596
Author: manishnalla1994 
AuthorDate: Fri May 10 15:43:10 2019 +0530

[CARBONDATA-3377] Fix for Null pointer exception in Range Col compaction

Problem: A string-type column with huge strings and null values fails with a
NullPointerException when it is the range column and compaction is run.

Solution: Added a null check in StringOrdering.

This closes #3212
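
A hedged Scala sketch of a null-tolerant string ordering in the spirit of the
StringOrdering check added here; the real implementation is not shown in this
excerpt, so this standalone version is illustrative only. Nulls sort first
instead of triggering a NullPointerException during range partitioning.

```scala
object NullSafeStringOrderingSketch {
  // Nulls compare as smallest instead of being dereferenced (which would throw NPE).
  object NullSafeStringOrdering extends Ordering[String] {
    override def compare(x: String, y: String): Int =
      if (x == null && y == null) 0
      else if (x == null) -1
      else if (y == null) 1
      else x.compareTo(y)
  }

  def main(args: Array[String]): Unit = {
    println(Seq("b", null, "a").sorted(NullSafeStringOrdering)) // List(null, a, b)
  }
}
```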
---
 .../core/constants/CarbonCommonConstants.java  |  4 +++
 .../carbondata/core/util/CarbonProperties.java |  6 
 .../dataload/TestRangeColumnDataLoad.scala | 42 +-
 .../spark/load/DataLoadProcessBuilderOnSpark.scala | 16 ++---
 .../carbondata/spark/rdd/CarbonMergerRDD.scala |  7 ++--
 5 files changed, 67 insertions(+), 8 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index ba8e20a..43544cb 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1193,6 +1193,10 @@ public final class CarbonCommonConstants {
 
   public static final String CARBON_RANGE_COLUMN_SCALE_FACTOR_DEFAULT = "3";
 
+  public static final String CARBON_ENABLE_RANGE_COMPACTION = 
"carbon.enable.range.compaction";
+
+  public static final String CARBON_ENABLE_RANGE_COMPACTION_DEFAULT = "false";
+
   
//
   // Query parameter start here
   
//
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java 
b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
index 004a51e..e26f3d8 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
@@ -1507,6 +1507,12 @@ public final class CarbonProperties {
 return Boolean.parseBoolean(pushFilters);
   }
 
+  public boolean isRangeCompactionAllowed() {
+String isRangeCompact = 
getProperty(CarbonCommonConstants.CARBON_ENABLE_RANGE_COMPACTION,
+CarbonCommonConstants.CARBON_ENABLE_RANGE_COMPACTION_DEFAULT);
+return Boolean.parseBoolean(isRangeCompact);
+  }
+
   private void validateSortMemorySpillPercentage() {
 String spillPercentageStr = carbonProperties.getProperty(
 CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE,
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
index 5d6730f..165e4f8 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
@@ -610,6 +610,34 @@ class TestRangeColumnDataLoad extends QueryTest with 
BeforeAndAfterEach with Bef
 sql("DROP TABLE IF EXISTS carbon_range_column1")
   }
 
+  test("Test compaction for range_column - STRING Datatype null values") {
+sql("DROP TABLE IF EXISTS carbon_range_column1")
+deleteFile(filePath2)
+createFile(filePath2, 20, 14)
+sql(
+  """
+| CREATE TABLE carbon_range_column1(id INT, name STRING, city STRING, 
age LONG)
+| STORED BY 'org.apache.carbondata.format'
+| TBLPROPERTIES('SORT_SCOPE'='LOCAL_SORT', 'SORT_COLUMNS'='city',
+| 'range_column'='city')
+  """.stripMargin)
+
+sql(s"LOAD DATA LOCAL INPATH '$filePath2' INTO TABLE carbon_range_column1 
" +
+"OPTIONS('BAD_RECORDS_ACTION'='FORCE','HEADER'='false')")
+
+sql(s"LOAD DATA LOCAL INPATH '$filePath2' INTO TABLE carbon_range_column1 
" +
+"OPTIONS('BAD_RECORDS_ACTION'='FORCE','HEADER'='false')")
+
+var res = sql("select * from carbon_range_column1").collect()
+
+sql("ALTER TABLE carbon_range_column1 COMPACT 'MAJOR'")
+
+checkAnswer(sql("select * from carbon_range_column1"), res)
+
+sql("DROP TABLE IF EXISTS carbon_range_column1")
+deleteFile(filePath2)
+  }
+
   test("Test compaction for range_column - STRING Datatype min/max not 
stored") {
 deleteFile(filePath2)
 createFile(filePath2, 1000, 7)
@@ -930,12 +958,24 @@ class TestRangeColumnDataLoad extends QueryTest with 
BeforeAndAfterEach with Bef
 .println(
   100 + "," + "n" + i + "," + "c" + (i % 100

[carbondata] 17/22: [CARBONDATA-3375] [CARBONDATA-3376] Fix GC Overhead limit exceeded issue and partition column as range column issue

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 7e7792e98e0c10e272c3fb1b1ed5821a5193288f
Author: manishnalla1994 
AuthorDate: Wed May 8 18:28:21 2019 +0530

[CARBONDATA-3375] [CARBONDATA-3376] Fix GC Overhead limit exceeded issue 
and partition column as range column issue

Problem 1: When only a single data range is present, it is launched as one
single task, which results in one executor getting overloaded.

Solution: When there is only a single range, divide the splits among
different tasks to ensure one executor is not overloaded.

Problem 2: When the range column is given as the partition column, compaction
fails because it goes through the Range Column flow.

Solution: Added a check for partition tables when a range column is present,
so that compaction goes through the old flow and passes.

This closes #3210
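
A hedged Scala sketch of the idea behind the single-range fix: when range
partitioning yields only one usable range, the input splits are spread
round-robin over several tasks so one executor is not overloaded. The helper
below is illustrative and not the CarbonMergerRDD code.

```scala
object SplitDistributionSketch {
  // Spread splits round-robin across numTasks tasks.
  def distribute[T](splits: Seq[T], numTasks: Int): Map[Int, Seq[T]] =
    splits.zipWithIndex
      .groupBy { case (_, idx) => idx % numTasks }
      .map { case (task, grouped) => task -> grouped.map(_._1) }

  def main(args: Array[String]): Unit = {
    // 7 splits spread over 3 tasks -> tasks get 3, 2 and 2 splits.
    println(distribute(Seq("s0", "s1", "s2", "s3", "s4", "s5", "s6"), 3))
  }
}
```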
---
 .../dataload/TestRangeColumnDataLoad.scala |  25 +++
 .../carbondata/spark/rdd/CarbonMergerRDD.scala | 168 +
 2 files changed, 131 insertions(+), 62 deletions(-)

diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
index ff383f9..5d6730f 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestRangeColumnDataLoad.scala
@@ -187,6 +187,31 @@ class TestRangeColumnDataLoad extends QueryTest with 
BeforeAndAfterEach with Bef
 sql("DROP TABLE IF EXISTS carbon_range_column1")
   }
 
+  test("Test compaction for range_column - Partition Column") {
+sql("DROP TABLE IF EXISTS carbon_range_column1")
+sql(
+  """
+| CREATE TABLE carbon_range_column1(id INT, name STRING, city STRING)
+| PARTITIONED BY (age INT)
+| STORED BY 'org.apache.carbondata.format'
+| TBLPROPERTIES('SORT_SCOPE'='GLOBAL_SORT', 'SORT_COLUMNS'='age, city',
+| 'range_column'='age')
+  """.stripMargin)
+
+sql(s"LOAD DATA LOCAL INPATH '$filePath' INTO TABLE carbon_range_column1 " 
+
+"OPTIONS('GLOBAL_SORT_PARTITIONS'='3')")
+
+sql(s"LOAD DATA LOCAL INPATH '$filePath' INTO TABLE carbon_range_column1 " 
+
+"OPTIONS('GLOBAL_SORT_PARTITIONS'='3')")
+
+var res = sql("select * from carbon_range_column1").collect()
+
+sql("ALTER TABLE carbon_range_column1 COMPACT 'MAJOR'")
+
+checkAnswer(sql("select * from carbon_range_column1"), res)
+sql("DROP TABLE IF EXISTS carbon_range_column1")
+  }
+
   test("Test compaction for range_column - 2 levels") {
 sql("DROP TABLE IF EXISTS carbon_range_column1")
 sql(
diff --git 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
index e361c14..c143f93 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala
@@ -296,7 +296,11 @@ class CarbonMergerRDD[K, V](
   tablePath, new CarbonTableIdentifier(databaseName, factTableName, 
tableId)
 )
 val carbonTable = carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable
-val rangeColumn = carbonTable.getRangeColumn
+var rangeColumn: CarbonColumn = null
+if 
(!carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable.isHivePartitionTable) {
+  // If the table is not a partition table then only we go for range 
column compaction flow
+  rangeColumn = carbonTable.getRangeColumn
+}
 val dataType: DataType = if (null != rangeColumn) {
   rangeColumn.getDataType
 } else {
@@ -386,6 +390,7 @@ class CarbonMergerRDD[K, V](
 }
 val LOGGER = LogServiceFactory.getLogService(this.getClass.getName)
 var allRanges: Array[Object] = new Array[Object](0)
+var singleRange = false
 if (rangeColumn != null) {
   // To calculate the number of ranges to be made, min 2 ranges/tasks to 
be made in any case
   val numOfPartitions = Math
@@ -400,10 +405,14 @@ class CarbonMergerRDD[K, V](
 dataType)
   // If RangePartitioner does not give ranges in the case when the data is 
skewed with
   // a lot of null records then we take the min/max from footer and set 
them for tasks
-  if (null == allRanges || (allRanges.size == 1 && allRanges(0) == null)) {
+  if (null == allRanges || allRanges.size == 1) {
 allRanges = 
CarbonCompactionUtil.getOverallMinMax(carbonInputSplits.toList.t

[carbondata] 16/22: [CARBONDATA-3371] Fix ArrayIndexOutOfBoundsException of compaction after sort_columns modification

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit d8b0ff47754507ed49e4ad859582678f785ddca9
Author: QiangCai 
AuthorDate: Sun May 5 15:22:23 2019 +0800

[CARBONDATA-3371] Fix ArrayIndexOutOfBoundsException of compaction after 
sort_columns modification

Modification:

SegmentPropertiesWrapper should check the column order for different
segments, because a sort_columns modification can change the column order.
dictionaryColumnChunkIndex of blockExecutionInfo should keep the projection
order for compaction.
If column drift happened (a measure became a dimension), the measure should
be converted to a dimension in RawResultIterator.

This closes #3201
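
A hedged Scala sketch of why the SegmentProperties comparison must be
order-sensitive: after a sort_columns change, two segments can contain the
same columns in a different order and must not share one cached
SegmentProperties entry. The helper is illustrative only.

```scala
object ColumnOrderCheckSketch {
  // Position-by-position comparison of two segments' column order.
  def sameColumnOrder(a: Seq[String], b: Seq[String]): Boolean =
    a.size == b.size && a.zip(b).forall { case (x, y) => x == y }

  def main(args: Array[String]): Unit = {
    println(sameColumnOrder(Seq("c1", "c2", "c3"), Seq("c1", "c2", "c3"))) // true
    // Same columns, different order (e.g. after altering sort_columns) =>
    // different SegmentProperties, so the cached entry must not be reused.
    println(sameColumnOrder(Seq("c1", "c2", "c3"), Seq("c2", "c1", "c3"))) // false
  }
}
```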
---
 .../block/SegmentPropertiesAndSchemaHolder.java|  13 +--
 .../scan/executor/impl/AbstractQueryExecutor.java  |   6 +-
 .../core/scan/executor/util/QueryUtil.java |   2 +-
 .../iterator/ColumnDriftRawResultIterator.java | 128 +
 .../scan/result/iterator/RawResultIterator.java|  12 +-
 .../core/scan/wrappers/ByteArrayWrapper.java   |   3 +
 .../TestAlterTableSortColumnsProperty.scala|  92 ++-
 .../carbondata/spark/rdd/StreamHandoffRDD.scala|   2 +-
 .../merger/CarbonCompactionExecutor.java   |  20 +++-
 9 files changed, 225 insertions(+), 53 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
index 34ce5d0..f2f2d8c 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
@@ -346,15 +346,9 @@ public class SegmentPropertiesAndSchemaHolder {
   if (obj1 == null || obj2 == null || (obj1.size() != obj2.size())) {
 return false;
   }
-  List clonedObj1 = new ArrayList<>(obj1);
-  List clonedObj2 = new ArrayList<>(obj2);
-  clonedObj1.addAll(obj1);
-  clonedObj2.addAll(obj2);
-  sortList(clonedObj1);
-  sortList(clonedObj2);
   boolean exists = true;
   for (int i = 0; i < obj1.size(); i++) {
-if (!clonedObj1.get(i).equalsWithStrictCheck(clonedObj2.get(i))) {
+if (!obj1.get(i).equalsWithStrictCheck(obj2.get(i))) {
   exists = false;
   break;
 }
@@ -372,11 +366,14 @@ public class SegmentPropertiesAndSchemaHolder {
 
 @Override public int hashCode() {
   int allColumnsHashCode = 0;
+  // check column order
+  StringBuilder builder = new StringBuilder();
   for (ColumnSchema columnSchema: columnsInTable) {
 allColumnsHashCode = allColumnsHashCode + 
columnSchema.strictHashCode();
+builder.append(columnSchema.getColumnUniqueId()).append(",");
   }
   return carbonTable.getAbsoluteTableIdentifier().hashCode() + 
allColumnsHashCode + Arrays
-  .hashCode(columnCardinality);
+  .hashCode(columnCardinality) + builder.toString().hashCode();
 }
 
 public AbsoluteTableIdentifier getTableIdentifier() {
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
index f06f5c3..6c048f3 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
@@ -605,7 +605,7 @@ public abstract class AbstractQueryExecutor implements 
QueryExecutor {
 // setting the size of fixed key column (dictionary column)
 blockExecutionInfo
 .setFixedLengthKeySize(getKeySize(projectDimensions, 
segmentProperties));
-Set dictionaryColumnChunkIndex = new HashSet();
+List dictionaryColumnChunkIndex = new ArrayList();
 List noDictionaryColumnChunkIndex = new ArrayList();
 // get the block index to be read from file for query dimension
 // for both dictionary columns and no dictionary columns
@@ -616,7 +616,9 @@ public abstract class AbstractQueryExecutor implements 
QueryExecutor {
 dictionaryColumnChunkIndex.toArray(new 
Integer[dictionaryColumnChunkIndex.size()]));
 // need to sort the dictionary column as for all dimension
 // column key will be filled based on key order
-Arrays.sort(queryDictionaryColumnChunkIndexes);
+if (!queryModel.isForcedDetailRawQuery()) {
+  Arrays.sort(queryDictionaryColumnChunkIndexes);
+}
 
blockExecutionInfo.setDictionaryColumnChunkIndex(queryDictionaryColumnChunkIndexes);
 // setting the no dictionary column block indexes
 
blockExecutionInfo.setNoDictionaryColumnChunkIndexes(ArrayUtils.toPrimitive(
diff --git 
a/core/src/

[carbondata] 19/22: [CARBONDATA-3362] Document update for pagesize table property scenario

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 251cbdc20319fc71013b154e2439503dea072d94
Author: ajantha-bhat 
AuthorDate: Tue May 7 14:36:05 2019 +0530

[CARBONDATA-3362] Document update for pagesize table property scenario

Document update for pagesize table property scenario.

This closes #3206
---
 docs/carbon-as-spark-datasource-guide.md | 2 +-
 docs/ddl-of-carbondata.md| 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/carbon-as-spark-datasource-guide.md 
b/docs/carbon-as-spark-datasource-guide.md
index 598acb0..fe46b09 100644
--- a/docs/carbon-as-spark-datasource-guide.md
+++ b/docs/carbon-as-spark-datasource-guide.md
@@ -44,7 +44,7 @@ Now you can create Carbon table using Spark's datasource DDL 
syntax.
 |---|--||
 | table_blocksize | 1024 | Size of blocks to write onto hdfs. For  more 
details, see [Table Block Size 
Configuration](./ddl-of-carbondata.md#table-block-size-configuration). |
 | table_blocklet_size | 64 | Size of blocklet to write. |
-| table_page_size_inmb | 0 | Size of each page in carbon table, if page size 
crosses this value before 32000 rows, page will be cut to that may rows. Helps 
in keep page size to fit cache size |
+| table_page_size_inmb | 0 | Size of each page in carbon table, if page size 
crosses this value before 32000 rows, page will be cut to that many rows. Helps 
in keep page size to fit cache size |
 | local_dictionary_threshold | 1 | Cardinality upto which the local 
dictionary can be generated. For  more details, see [Local Dictionary 
Configuration](./ddl-of-carbondata.md#local-dictionary-configuration). |
 | local_dictionary_enable | false | Enable local dictionary generation. For  
more details, see [Local Dictionary 
Configuration](./ddl-of-carbondata.md#local-dictionary-configuration). |
 | sort_columns | all dimensions are sorted | Columns to include in sort and 
its order of sort. For  more details, see [Sort Columns 
Configuration](./ddl-of-carbondata.md#sort-columns-configuration). |
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 5bc8f10..34eca8d 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -291,6 +291,11 @@ CarbonData DDL statements are documented here,which 
includes:
  If page size crosses this value before 32000 rows, page will be cut to 
that many rows. 
  Helps in keeping page size to fit cpu cache size.
 
+ This property can be configured if the table has string, varchar, binary 
or complex datatype columns.
+ Because for these columns 32000 rows in one page may exceed 1755 MB and 
snappy compression will fail in that scenario.
+ Also if page size is huge, page cannot be fit in CPU cache. 
+ So, configuring smaller values of this property (say 1 MB) can result in 
better use of CPU cache for pages.
+
  Example usage:
  ```
  TBLPROPERTIES ('TABLE_PAGE_SIZE_INMB'='5')



[carbondata] 04/22: [CARBONDATA-3334] fixed multiple segment file issue for partition

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit d1bb3a0e1d9da21ba8493f4845f5c404b8eae56b
Author: kunal642 
AuthorDate: Thu Mar 28 14:33:45 2019 +0530

[CARBONDATA-3334] fixed multiple segment file issue for partition

Problem:
During partition load, while writing merge index files, the FactTimestamp in
the load model is changed to the current timestamp, due to which a new
segment file with the mergeindex entry is written.

Solution:
Set a new timestamp only if the FactTimestamp in the load model is 0L
(meaning nothing is set).

This closes #3167
---
 .../standardpartition/StandardPartitionTableLoadingTestCase.scala | 8 
 .../sql/execution/command/management/CarbonLoadDataCommand.scala  | 3 ++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
index 059dd2b..bee118a 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableLoadingTestCase.scala
@@ -496,6 +496,13 @@ class StandardPartitionTableLoadingTestCase extends 
QueryTest with BeforeAndAfte
 }
   }
 
+  test("test number of segment files should not be more than 1 per segment") {
+sql("drop table if exists new_par")
+sql("create table new_par(a string) partitioned by ( b int) stored by 
'carbondata'")
+sql("insert into new_par select 'k',1")
+assert(new 
File(s"$storeLocation/new_par/Metadata/segments/").listFiles().size == 1)
+  }
+
 
 
   def restoreData(dblocation: String, tableName: String) = {
@@ -556,6 +563,7 @@ class StandardPartitionTableLoadingTestCase extends 
QueryTest with BeforeAndAfte
 sql("drop table if exists emp1")
 sql("drop table if exists restorepartition")
 sql("drop table if exists casesensitivepartition")
+sql("drop table if exists new_par")
   }
 
 }
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
index 0c8a1df..b4ef1f0 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
@@ -805,6 +805,8 @@ case class CarbonLoadDataCommand(
   }
   if (updateModel.isDefined) {
 carbonLoadModel.setFactTimeStamp(updateModel.get.updatedTimeStamp)
+  } else if (carbonLoadModel.getFactTimeStamp == 0L) {
+carbonLoadModel.setFactTimeStamp(System.currentTimeMillis())
   }
   // Create and ddd the segment to the tablestatus.
   CarbonLoaderUtil.readAndUpdateLoadProgressInTableMeta(carbonLoadModel, 
isOverwriteTable)
@@ -869,7 +871,6 @@ case class CarbonLoadDataCommand(
   }
 }
 try {
-  carbonLoadModel.setFactTimeStamp(System.currentTimeMillis())
   val compactedSegments = new util.ArrayList[String]()
   // Trigger auto compaction
   CarbonDataRDDFactory.handleSegmentMerging(



Jenkins build became unstable: carbondata-master-spark-2.1 » Apache CarbonData :: Spark Common Test #3526

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.1 » Apache CarbonData :: Store SDK #3526

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.1 #3526

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 #1685

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Spark Common Test #1685

2019-05-16 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: carbondata-master-spark-2.2 » Apache CarbonData :: Store SDK #1685

2019-05-16 Thread Apache Jenkins Server
See 




[carbondata] annotated tag apache-carbondata-1.5.4-rc1 created (now bfe1b59)

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to annotated tag apache-carbondata-1.5.4-rc1
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


  at bfe1b59  (tag)
 tagging 7b880bab28798bb7b8d1d5d394bfba2646f74a49 (commit)
 replaces apache-carbondata-1.5.3-rc1
  by ravipesala
  on Fri May 17 08:21:38 2019 +0530

- Log -
[maven-release-plugin] copy for tag apache-carbondata-1.5.4-rc1
---

No new revisions were added by this update.



[carbondata] branch branch-1.5 updated: [maven-release-plugin] prepare release apache-carbondata-1.5.4-rc1

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/branch-1.5 by this push:
 new 7b880ba  [maven-release-plugin] prepare release 
apache-carbondata-1.5.4-rc1
7b880ba is described below

commit 7b880bab28798bb7b8d1d5d394bfba2646f74a49
Author: ravipesala 
AuthorDate: Fri May 17 02:11:47 2019 +0530

[maven-release-plugin] prepare release apache-carbondata-1.5.4-rc1
---
 assembly/pom.xml  | 2 +-
 common/pom.xml| 2 +-
 core/pom.xml  | 2 +-
 datamap/bloom/pom.xml | 2 +-
 datamap/examples/pom.xml  | 2 +-
 datamap/lucene/pom.xml| 2 +-
 datamap/mv/core/pom.xml   | 2 +-
 datamap/mv/plan/pom.xml   | 2 +-
 examples/spark2/pom.xml   | 2 +-
 format/pom.xml| 2 +-
 hadoop/pom.xml| 2 +-
 integration/hive/pom.xml  | 2 +-
 integration/presto/pom.xml| 2 +-
 integration/spark-common-test/pom.xml | 2 +-
 integration/spark-common/pom.xml  | 2 +-
 integration/spark-datasource/pom.xml  | 2 +-
 integration/spark2/pom.xml| 2 +-
 pom.xml   | 4 ++--
 processing/pom.xml| 2 +-
 store/sdk/pom.xml | 2 +-
 streaming/pom.xml | 2 +-
 tools/cli/pom.xml | 2 +-
 22 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/assembly/pom.xml b/assembly/pom.xml
index a05cfe6..7206b0d 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/common/pom.xml b/common/pom.xml
index 0148424..5fa7df8 100644
--- a/common/pom.xml
+++ b/common/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/core/pom.xml b/core/pom.xml
index 2e8f5da..56cfaf5 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/bloom/pom.xml b/datamap/bloom/pom.xml
index 1af3b19..fdc2f62 100644
--- a/datamap/bloom/pom.xml
+++ b/datamap/bloom/pom.xml
@@ -4,7 +4,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/examples/pom.xml b/datamap/examples/pom.xml
index dabf4cd..08693f0 100644
--- a/datamap/examples/pom.xml
+++ b/datamap/examples/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/lucene/pom.xml b/datamap/lucene/pom.xml
index 627e758..dfd09f6 100644
--- a/datamap/lucene/pom.xml
+++ b/datamap/lucene/pom.xml
@@ -4,7 +4,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/core/pom.xml b/datamap/mv/core/pom.xml
index a4b8c13..9ee517c 100644
--- a/datamap/mv/core/pom.xml
+++ b/datamap/mv/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/plan/pom.xml b/datamap/mv/plan/pom.xml
index feed9e3..4ee274e 100644
--- a/datamap/mv/plan/pom.xml
+++ b/datamap/mv/plan/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/examples/spark2/pom.xml b/examples/spark2/pom.xml
index 09606ac..1bc9247 100644
--- a/examples/spark2/pom.xml
+++ b/examples/spark2/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/format/pom.xml b/format/pom.xml
index c287422..3b4bcee 100644
--- a/format/pom.xml
+++ b/format/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/hadoop/pom.xml b/hadoop/pom.xml
index 38649f6..6780f07 100644
--- a/hadoop/pom.xml
+++ b/hadoop/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/integration/hive/pom.xml b/integration/hive/pom.xml
index df0161a..649eec4 100644
--- a/integration/hive/pom.xml
+++ b/integration/hive/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4-SNAPSHOT</version>
+    <version>1.5.4</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/integration/presto/pom.xml b/integration/presto/pom.xml
index c74040c..a4a9aba 100644
--- a/integration/presto/pom.xml
+++ b/integration/presto/pom.xml
@@ -22,7
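
The commit above is produced by the Maven release plugin's prepare step: it
rewrites the parent <version> of every module listed in the change summary from
1.5.4-SNAPSHOT to 1.5.4 and creates the apache-carbondata-1.5.4-rc1 tag; the
remaining modules in the summary receive the same kind of version bump. The
exact invocation used by the release manager is not part of this mail, so the
following is only a hedged sketch of a typical release:prepare call that would
yield a commit like this one:

  # Property values are assumptions, inferred from the diff and the tag name
  mvn release:prepare \
    -DreleaseVersion=1.5.4 \
    -DdevelopmentVersion=1.5.5-SNAPSHOT \
    -Dtag=apache-carbondata-1.5.4-rc1

  # Sanity check: no pom should still reference the old SNAPSHOT version
  git grep -l "1.5.4-SNAPSHOT" -- "*.xml"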

[carbondata] branch branch-1.5 updated: [maven-release-plugin] prepare for next development iteration

2019-05-16 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/branch-1.5 by this push:
 new 32b7471  [maven-release-plugin] prepare for next development iteration
32b7471 is described below

commit 32b74710defbcb80c1a03dc8a62edf0d4079f087
Author: ravipesala 
AuthorDate: Fri May 17 08:21:55 2019 +0530

[maven-release-plugin] prepare for next development iteration
---
 assembly/pom.xml  | 2 +-
 common/pom.xml| 2 +-
 core/pom.xml  | 2 +-
 datamap/bloom/pom.xml | 2 +-
 datamap/examples/pom.xml  | 2 +-
 datamap/lucene/pom.xml| 2 +-
 datamap/mv/core/pom.xml   | 2 +-
 datamap/mv/plan/pom.xml   | 2 +-
 examples/spark2/pom.xml   | 2 +-
 format/pom.xml| 2 +-
 hadoop/pom.xml| 2 +-
 integration/hive/pom.xml  | 2 +-
 integration/presto/pom.xml| 2 +-
 integration/spark-common-test/pom.xml | 2 +-
 integration/spark-common/pom.xml  | 2 +-
 integration/spark-datasource/pom.xml  | 2 +-
 integration/spark2/pom.xml| 2 +-
 pom.xml   | 4 ++--
 processing/pom.xml| 2 +-
 store/sdk/pom.xml | 2 +-
 streaming/pom.xml | 2 +-
 tools/cli/pom.xml | 2 +-
 22 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/assembly/pom.xml b/assembly/pom.xml
index 7206b0d..f1586cb 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/common/pom.xml b/common/pom.xml
index 5fa7df8..4844768 100644
--- a/common/pom.xml
+++ b/common/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/core/pom.xml b/core/pom.xml
index 56cfaf5..ec55faf 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/bloom/pom.xml b/datamap/bloom/pom.xml
index fdc2f62..ab5e29c 100644
--- a/datamap/bloom/pom.xml
+++ b/datamap/bloom/pom.xml
@@ -4,7 +4,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/examples/pom.xml b/datamap/examples/pom.xml
index 08693f0..0c9d804 100644
--- a/datamap/examples/pom.xml
+++ b/datamap/examples/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/lucene/pom.xml b/datamap/lucene/pom.xml
index dfd09f6..ee06416 100644
--- a/datamap/lucene/pom.xml
+++ b/datamap/lucene/pom.xml
@@ -4,7 +4,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/core/pom.xml b/datamap/mv/core/pom.xml
index 9ee517c..b92dc0e 100644
--- a/datamap/mv/core/pom.xml
+++ b/datamap/mv/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/plan/pom.xml b/datamap/mv/plan/pom.xml
index 4ee274e..3d18384 100644
--- a/datamap/mv/plan/pom.xml
+++ b/datamap/mv/plan/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/examples/spark2/pom.xml b/examples/spark2/pom.xml
index 1bc9247..88a99f6 100644
--- a/examples/spark2/pom.xml
+++ b/examples/spark2/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/format/pom.xml b/format/pom.xml
index 3b4bcee..45875d7 100644
--- a/format/pom.xml
+++ b/format/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/hadoop/pom.xml b/hadoop/pom.xml
index 6780f07..9bfc789 100644
--- a/hadoop/pom.xml
+++ b/hadoop/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/integration/hive/pom.xml b/integration/hive/pom.xml
index 649eec4..a990b44 100644
--- a/integration/hive/pom.xml
+++ b/integration/hive/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/integration/presto/pom.xml b/integration/presto/pom.xml
index a4a9aba..4dacce1 100644
--- a/integration/presto/pom.xml
+++ b/integration/presto/pom.xml
@@ -22,7 +22,7 @@
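
This second commit is the counterpart of the release commit above: once the
1.5.4 sources are tagged, the release plugin moves the parent and every module
listed in the change summary forward onto 1.5.5-SNAPSHOT so development on
branch-1.5 can continue. A minimal sketch for confirming the resulting version
on a local clone (assumes branch-1.5 is checked out and maven-help-plugin
3.1.0+ is available for -DforceStdout):

  git checkout branch-1.5 && git pull origin branch-1.5
  mvn -q -DforceStdout help:evaluate -Dexpression=project.version
  # Expected output: 1.5.5-SNAPSHOT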