[GitHub] carbondata pull request #1332: [CARBONDATA-1456]Regenerate cached hive resul...

2017-09-06 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1332#discussion_r137459712
  
--- Diff: 
integration/spark-common-cluster-test/src/test/scala/org/apache/spark/sql/common/util/QueryTest.scala
 ---
@@ -84,22 +82,34 @@ class QueryTest extends PlanTest with Suite {
 checkAnswer(df, expectedAnswer.collect())
   }
 
-  protected def checkAnswer(carbon: String, hive: String, 
uniqueIdentifier:String): Unit = {
-val path = TestQueryExecutor.hiveresultpath + "/"+uniqueIdentifier
+  protected def checkAnswer(carbon: String, hive: String, 
uniqueIdentifier: String): Unit = {
+val path = TestQueryExecutor.hiveresultpath + "/" + uniqueIdentifier
 if (FileFactory.isFileExist(path, FileFactory.getFileType(path))) {
-  val objinp = new 
ObjectInputStream(FileFactory.getDataInputStream(path, 
FileFactory.getFileType(path)))
+  val objinp = new ObjectInputStream(FileFactory
+.getDataInputStream(path, FileFactory.getFileType(path)))
   val rows = objinp.readObject().asInstanceOf[Array[Row]]
   objinp.close()
-  checkAnswer(sql(carbon), rows)
+  QueryTest.checkAnswer(sql(carbon), rows) match {
+case Some(errorMessage) => {
+  FileFactory.deleteFile(path, FileFactory.getFileType(path))
+  writeAndCheckAnswer(carbon, hive, path)
--- End diff --

I couldn't understand your comment. How would it go into an infinite loop? 
We are not using a recursive call.
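
For reference, the new branch from the diff, restated with both cases spelled 
out (the None case is added for completeness; writeAndCheckAnswer, per this 
thread, regenerates the hive result and compares once more, so a failure 
triggers a single retry through a plain method call, not recursion):

```scala
QueryTest.checkAnswer(sql(carbon), rows) match {
  case Some(errorMessage) =>
    // stale cache: delete it, then regenerate from hive and compare once
    FileFactory.deleteFile(path, FileFactory.getFileType(path))
    writeAndCheckAnswer(carbon, hive, path)
  case None => // cached result still matches; nothing to do
}
```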


---


[GitHub] carbondata issue #1326: [CARBONDATA-1024] supported reading float data type ...

2017-09-06 Thread anubhav100
Github user anubhav100 commented on the issue:

https://github.com/apache/carbondata/pull/1326
  
retest this please


---


[GitHub] carbondata issue #1307: [CARBONDATA-1433] Added Vectorized Reader for Presto...

2017-09-06 Thread chenliang613
Github user chenliang613 commented on the issue:

https://github.com/apache/carbondata/pull/1307
  
retest this please


---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/589/



---


[GitHub] carbondata issue #1286: [CARBONDATA-1404] Added Unit test cases for Hive Int...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1286
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/588/



---


[GitHub] carbondata issue #1310: [CARBONDATA-1442] Refactored Partition-Guide.md

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1310
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/587/



---


[GitHub] carbondata pull request #1265: [CARBONDATA-1128] Add encoding for non-dictio...

2017-09-06 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1265#discussion_r137450241
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/partition/EqualToFilterImpl.java
 ---
@@ -50,7 +50,7 @@ public EqualToFilterImpl(EqualToExpression equalTo, 
PartitionInfo partitionInfo)
   literal.getLiteralExpValue().toString(),
   partitionInfo.getColumnSchemaList().get(0).getDataType());
   if (PartitionType.RANGE == partitionInfo.getPartitionType() && value 
instanceof String) {
-value = ByteUtil.toBytes((String)value);
+value = ByteUtil.toBytesForPlainValue((String)value);
--- End diff --

It seems the old overloaded method name looks better than this one. It is a 
utility method, so it is better to keep an overloaded method.
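
A small sketch of the preference being expressed, i.e. an overload set on the 
utility class rather than a second name (only the String variant appears in 
the diff; the Long variant here is illustrative):

```scala
object ByteUtilSketch {
  // overloads keep call sites uniform: ByteUtilSketch.toBytes(x)
  def toBytes(v: String): Array[Byte] = v.getBytes("UTF-8")
  def toBytes(v: Long): Array[Byte] =
    java.nio.ByteBuffer.allocate(java.lang.Long.BYTES).putLong(v).array()
}
```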


---


[GitHub] carbondata pull request #1334: [CARBONDATA-1451] Removing configuration for ...

2017-09-06 Thread dhatchayani
Github user dhatchayani commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1334#discussion_r137450091
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonV3DataFormatConstants.java
 ---
@@ -61,24 +61,8 @@
   short NUMBER_OF_COLUMN_TO_READ_IN_IO_MIN = 1;
 
   /**
-   * number of rows per blocklet column page
-   */
-  @CarbonProperty
-  String NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE = 
"number.of.rows.per.blocklet.column.page";
-
-  /**
* number of rows per blocklet column page default value
*/
-  String NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_DEFAULT = "32000";
-
-  /**
-   * number of rows per blocklet column page max value
-   */
-  short NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_MAX = 32000;
-
-  /**
-   * number of rows per blocklet column page min value
-   */
-  short NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_MIN = 8000;
+  short NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_DEFAULT = 32000;
--- End diff --

This is not configurable. 32000 is the default constant, stored as 
"NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_DEFAULT".


---


[GitHub] carbondata pull request #1265: [CARBONDATA-1128] Add encoding for non-dictio...

2017-09-06 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1265#discussion_r137449915
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeLessThanEqualFilterExecuterImpl.java
 ---
@@ -262,10 +259,7 @@ private BitSet 
getFilteredIndexes(DimensionColumnDataChunk dimensionColumnDataCh
   int numerOfRows) {
 byte[] defaultValue = null;
 if 
(dimColEvaluatorInfoList.get(0).getDimension().hasEncoding(Encoding.DIRECT_DICTIONARY))
 {
-  DirectDictionaryGenerator directDictionaryGenerator = 
DirectDictionaryKeyGeneratorFactory
-  .getDirectDictionaryGenerator(
-  dimColEvaluatorInfoList.get(0).getDimension().getDataType());
-  int key = directDictionaryGenerator.generateDirectSurrogateKey(null) 
+ 1;
+  int key = 0;
--- End diff --

I think it is better to get the null key from the respective interfaces 
instead of hard-coding it. If the null value key changes in the future, there 
will be no effect here.
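
A sketch of the suggestion, built from the names in the removed lines above 
(`dimension` stands in for `dimColEvaluatorInfoList.get(0).getDimension()`):

```scala
import org.apache.carbondata.core.keygenerator.directdictionary.{
  DirectDictionaryGenerator, DirectDictionaryKeyGeneratorFactory}

// derive the null surrogate key from the generator instead of hard-coding 0,
// so a future change to the null key is picked up here automatically
val generator: DirectDictionaryGenerator = DirectDictionaryKeyGeneratorFactory
  .getDirectDictionaryGenerator(dimension.getDataType)
val nullKey: Int = generator.generateDirectSurrogateKey(null) + 1
```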


---


[GitHub] carbondata pull request #1265: [CARBONDATA-1128] Add encoding for non-dictio...

2017-09-06 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1265#discussion_r137449951
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeLessThanEqualFilterExecuterImpl.java
 ---
@@ -367,7 +360,7 @@ private BitSet 
setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnD
 BitSet bitSet = new BitSet(numerOfRows);
 byte[][] filterValues = this.filterRangeValues;
 // binary search can only be applied if column is sorted
-if (isNaturalSorted) {
+if (isNaturalSorted && 
!dimensionColumnDataChunk.isNoDicitionaryColumn()) {
--- End diff --

Why is this check required?


---


[GitHub] carbondata pull request #1265: [CARBONDATA-1128] Add encoding for non-dictio...

2017-09-06 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1265#discussion_r137449599
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeGrtThanFiterExecuterImpl.java
 ---
@@ -366,7 +364,7 @@ private BitSet 
setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnD
 BitSet bitSet = new BitSet(numerOfRows);
 byte[][] filterValues = this.filterRangeValues;
 // binary search can only be applied if column is sorted
-if (isNaturalSorted) {
+if (isNaturalSorted && 
!dimensionColumnDataChunk.isNoDicitionaryColumn()) {
--- End diff --

Why is this check required? No-dictionary columns can also be sorted, right?
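
The reasoning behind the question, as a self-contained sketch: binary search 
only needs a total order over the stored bytes, which a sorted no-dictionary 
column also has (this uses a plain lexicographic compare, not CarbonData's 
comparator):

```scala
def compareBytes(a: Array[Byte], b: Array[Byte]): Int = {
  val n = math.min(a.length, b.length)
  var i = 0
  while (i < n) {
    val c = (a(i) & 0xff) - (b(i) & 0xff)   // unsigned byte compare
    if (c != 0) return c
    i += 1
  }
  a.length - b.length
}

// standard binary search over a sorted Array[Array[Byte]]
def binarySearch(sorted: Array[Array[Byte]], key: Array[Byte]): Int = {
  var lo = 0
  var hi = sorted.length - 1
  while (lo <= hi) {
    val mid = (lo + hi) >>> 1
    val c = compareBytes(sorted(mid), key)
    if (c < 0) lo = mid + 1 else if (c > 0) hi = mid - 1 else return mid
  }
  -(lo + 1)   // not found; insertion point encoded as in java.util.Arrays
}
```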


---


[GitHub] carbondata issue #1319: [CARBONDATA-1420] Fixed bug for creation of partitio...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1319
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/586/



---


[GitHub] carbondata pull request #1337: [CARBONDATA-1445] Fix update fail when carbon...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1337#discussion_r137448549
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/UpdateCarbonTableTestCase.scala
 ---
@@ -448,6 +448,40 @@ class UpdateCarbonTableTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("DROP TABLE IF EXISTS default.carbon1")
   }
 
+  test("""CARBONDATA-1445 carbon.update.persist.enable=false it will fail 
to update data""") {
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.isPersistEnabled, "false")
+import sqlContext.implicits._
+val df = sqlContext.sparkContext.parallelize(0 to 50)
+  .map(x => ("a", x.toString, (x % 2).toString, x, x.toLong, x * 2))
+  .toDF("stringField1", "stringField2", "stringField3", "intField", 
"longField", "int2Field")
+sql("DROP TABLE IF EXISTS default.study_carbondata ")
+sql(s""" CREATE TABLE IF NOT EXISTS default.study_carbondata (
+   |stringField1  string,
+   |stringField2  string,
+   |stringField3  string,
+   |intField  int,
+   |longField bigint,
+   |int2Field int) STORED BY 
'carbondata'""".stripMargin)
+df.write
+  .format("carbondata")
+  .option("tableName", "study_carbondata")
+  .option("compress", "true")  // just valid when tempCSV is true
+  .option("tempCSV", "false")
+  .option("single_pass", "true")
+  .option("sort_scope", "LOCAL_SORT")
+  .mode(SaveMode.Append)
+  .save()
+sql("""
+  UPDATE default.study_carbondata a
--- End diff --

Please add assert to check the updated value
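
A sketch of such an assertion inside the test above (the UPDATE statement is 
truncated in this digest, so the column and expected value below are purely 
illustrative, not the PR's):

```scala
// hypothetical check: all rows carry the updated value after the UPDATE
checkAnswer(
  sql("SELECT DISTINCT stringField1 FROM default.study_carbondata"),
  Seq(Row("a_updated")))
```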


---


[GitHub] carbondata issue #1337: [CARBONDATA-1445] Fix update fail when carbon.update...

2017-09-06 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1337
  
LGTM


---


[GitHub] carbondata pull request #1307: [CARBONDATA-1433] Added Vectorized Reader for...

2017-09-06 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1307#discussion_r137447980
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/CarbonVectorizedRecordReader.java
 ---
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.cache.dictionary.Dictionary;
+import org.apache.carbondata.core.datastore.block.TableBlockInfo;
+import 
org.apache.carbondata.core.keygenerator.directdictionary.DirectDictionaryGenerator;
+import 
org.apache.carbondata.core.keygenerator.directdictionary.DirectDictionaryKeyGeneratorFactory;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.encoder.Encoding;
+import org.apache.carbondata.core.scan.executor.QueryExecutor;
+import org.apache.carbondata.core.scan.executor.QueryExecutorFactory;
+import 
org.apache.carbondata.core.scan.executor.exception.QueryExecutionException;
+import org.apache.carbondata.core.scan.model.QueryDimension;
+import org.apache.carbondata.core.scan.model.QueryMeasure;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import 
org.apache.carbondata.core.scan.result.iterator.AbstractDetailQueryResultIterator;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnarBatch;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.hadoop.AbstractRecordReader;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.CarbonMultiBlockSplit;
+
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.TaskAttemptContext;
+import org.apache.spark.memory.MemoryMode;
+import org.apache.spark.sql.execution.vectorized.ColumnarBatch;
+import org.apache.spark.sql.types.DecimalType;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * A specialized RecordReader that reads into InternalRows or 
ColumnarBatches directly using the
+ * carbondata column APIs and fills the data directly into columns.
+ */
+class CarbonVectorizedRecordReader extends AbstractRecordReader {
--- End diff --

Yes, we could consider the refactor as you suggested, in 1.3.0.
@bhavya411 please add a comment for this class, like "TODO: abstract and 
consolidate VectorizedRecordReader into one, to be shared by all integration 
modules"


---


[GitHub] carbondata issue #1316: [CARBONDATA-1412] - Fixed bug related to incorrect b...

2017-09-06 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1316
  
I think it is not good to have both `CARBON_TIMESTAMP_MILLIS` and 
`CARBON_TIMESTAMP_DEFAULT_FORMAT` in CarbonCommonConstants. Why not use 
`CARBON_TIMESTAMP_DEFAULT_FORMAT` only and delete `CARBON_TIMESTAMP_MILLIS`?


---


[jira] [Commented] (CARBONDATA-1414) Show Segments raises exception for a Partition Table after Updation.

2017-09-06 Thread Liang Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156478#comment-16156478
 ] 

Liang Chen commented on CARBONDATA-1414:


[~nehabhardwaj] pull request 1304 has solved this issue; please verify it 
based on the latest master.

> Show Segments raises exception for a Partition Table after Updation.
> 
>
> Key: CARBONDATA-1414
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1414
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.2.0
> Environment: spark 2.1
>Reporter: Neha Bhardwaj
>Assignee: Liang Chen
> Attachments: list_partition_table.csv
>
>
> 1. Create Partition Table :
> DROP TABLE IF EXISTS list_partition_table_string;
>  CREATE TABLE list_partition_table_string(shortField SHORT, intField INT, 
> bigintField LONG, doubleField DOUBLE, timestampField TIMESTAMP, decimalField 
> DECIMAL(18,2), dateField DATE, charField CHAR(5), floatField FLOAT, 
> complexData ARRAY ) PARTITIONED BY (stringField STRING) STORED BY 
> 'carbondata' TBLPROPERTIES('PARTITION_TYPE'='LIST', 'LIST_INFO'='Asia, 
> America, Europe', 'DICTIONARY_EXCLUDE'='stringfield');
> 2. Load Data :
> load data inpath 'hdfs://localhost:54310/CSV/list_partition_table.csv' into 
> table list_partition_table_string 
> options('FILEHEADER'='shortfield,intfield,bigintfield,doublefield,stringfield,timestampfield,decimalfield,datefield,charfield,floatfield,complexdata',
>  'COMPLEX_DELIMITER_LEVEL_1'='$','COMPLEX_DELIMITER_LEVEL_2'='#', 
> 'SINGLE_PASS'='TRUE');
> 3. Update Data :
> update list_partition_table_string set (stringfield)=('China') where 
> stringfield = 'Japan' ;
>  update list_partition_table_string set (stringfield)=('Japan') where 
> stringfield > 'Europe' ;
>  update list_partition_table_string set (stringfield)=('Asia') where 
> stringfield < 'Europe' ;
> 4. Compaction :
> ALTER TABLE list_partition_table_string COMPACT 'Minor';
>  Show segments for table list_partition_table_string;
> Expected Output: Segments Must be Displayed.
> Actual Output : Error: java.lang.NullPointerException (state=,code=0)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1286: [CARBONDATA-1404] Added Unit test cases for Hive Int...

2017-09-06 Thread PallaviSingh1992
Github user PallaviSingh1992 commented on the issue:

https://github.com/apache/carbondata/pull/1286
  
retest this please


---


[GitHub] carbondata issue #1315: [CARBONDATA-1431] Fixed the parser for including tim...

2017-09-06 Thread PallaviSingh1992
Github user PallaviSingh1992 commented on the issue:

https://github.com/apache/carbondata/pull/1315
  
This check is not required


---


[GitHub] carbondata pull request #1315: [CARBONDATA-1431] Fixed the parser for includ...

2017-09-06 Thread PallaviSingh1992
Github user PallaviSingh1992 closed the pull request at:

https://github.com/apache/carbondata/pull/1315


---


[GitHub] carbondata issue #1310: [CARBONDATA-1442] Refactored Partition-Guide.md

2017-09-06 Thread PallaviSingh1992
Github user PallaviSingh1992 commented on the issue:

https://github.com/apache/carbondata/pull/1310
  
@jackylk I have squashed the commits


---


[GitHub] carbondata issue #1319: [CARBONDATA-1420] Fixed bug for creation of partitio...

2017-09-06 Thread geetikagupta16
Github user geetikagupta16 commented on the issue:

https://github.com/apache/carbondata/pull/1319
  
retest this please


---


[jira] [Closed] (CARBONDATA-1431) Dictionary_Include working incorrectly for date and timestamp data type.

2017-09-06 Thread Sangeeta Gulia (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangeeta Gulia closed CARBONDATA-1431.
--
Resolution: Fixed

> Dictionary_Include working incorrectly for date and timestamp data type.
> 
>
> Key: CARBONDATA-1431
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1431
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql, test
>Affects Versions: 1.2.0
>Reporter: Sangeeta Gulia
>Assignee: Pallavi Singh
>Priority: Minor
> Fix For: 1.2.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When we create a table with date and timestamp data type with 
> DICTIONARY_INCLUDE : 
> Example : 
> CREATE TABLE uniqdata_INCLUDEDICTIONARY2 (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2')
> It should either create the dictionary for the date and timestamp fields, or 
> it should throw an error that the "DICTIONARY_INCLUDE" feature is not 
> supported for date and timestamp.
> Whereas in the current master branch, the query executed successfully 
> without throwing any error, and it did not create dictionary files for the 
> date and timestamp fields.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1316: [CARBONDATA-1412] - Fixed bug related to incorrect b...

2017-09-06 Thread SangeetaGulia
Github user SangeetaGulia commented on the issue:

https://github.com/apache/carbondata/pull/1316
  
retest this please


---


[GitHub] carbondata issue #1332: [CARBONDATA-1456]Regenerate cached hive results if c...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1332
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/585/



---


[GitHub] carbondata pull request #1321: [CARBONDATA-1438] Unify the sort column and s...

2017-09-06 Thread chenerlu
Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1321#discussion_r137442307
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestGlobalSortDataLoad.scala
 ---
@@ -318,12 +329,12 @@ class TestGlobalSortDataLoad extends QueryTest with 
BeforeAndAfterEach with Befo
  | charField CHAR(5),
  | floatField FLOAT
  | )
- | STORED BY 'org.apache.carbondata.format'
+ | STORED BY 'org.apache.carbondata.format' 
TBLPROPERTIES('SORT_SCOPE'='GLOBAL_SORT')
""".stripMargin)
 sql(
   s"""
  | LOAD DATA LOCAL INPATH '$path' INTO TABLE 
carbon_globalsort_difftypes
- | OPTIONS('SORT_SCOPE'='GLOBAL_SORT',
+ | OPTIONS(
  | 
'FILEHEADER'='shortField,intField,bigintField,doubleField,stringField,timestampField,decimalField,dateField,charField,floatField')
""".stripMargin)
--- End diff --

ok


---


[GitHub] carbondata pull request #1321: [CARBONDATA-1438] Unify the sort column and s...

2017-09-06 Thread chenerlu
Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1321#discussion_r137441759
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala
 ---
@@ -639,6 +639,23 @@ case class LoadTable(
 val carbonProperty: CarbonProperties = CarbonProperties.getInstance()
 carbonProperty.addProperty("zookeeper.enable.lock", "false")
 val optionsFinal = getFinalOptions(carbonProperty)
+val tableProperties = relation.tableMeta.carbonTable.getTableInfo
+  .getFactTable.getTableProperties
+
+optionsFinal.put("sort_scope", 
tableProperties.getOrDefault("sort_scope",
+
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE,
+  carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE,
+CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT
+
+optionsFinal.put("batch_sort_size_inmb", 
tableProperties.getOrDefault("batch_sort_size_inmb",
--- End diff --

Yes, this is only needed for batch sort, but I think if users specify this 
parameter in global sort, it is better to ignore it.
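
For readability, the fallback chain the diff adds, written out once with its 
nesting intact (a restatement of the added lines, not new behavior): table 
property first, then the session-level option from CarbonLoadOptionConstants, 
then the system property from CarbonCommonConstants, then the compiled-in 
default.

```scala
optionsFinal.put("sort_scope",
  tableProperties.getOrDefault("sort_scope",
    carbonProperty.getProperty(
      CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE,
      carbonProperty.getProperty(
        CarbonCommonConstants.LOAD_SORT_SCOPE,
        CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT))))
```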


---


[GitHub] carbondata pull request #1321: [CARBONDATA-1438] Unify the sort column and s...

2017-09-06 Thread chenerlu
Github user chenerlu commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1321#discussion_r137441815
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala
 ---
@@ -639,6 +639,23 @@ case class LoadTable(
 val carbonProperty: CarbonProperties = CarbonProperties.getInstance()
 carbonProperty.addProperty("zookeeper.enable.lock", "false")
 val optionsFinal = getFinalOptions(carbonProperty)
+val tableProperties = relation.tableMeta.carbonTable.getTableInfo
+  .getFactTable.getTableProperties
+
+optionsFinal.put("sort_scope", 
tableProperties.getOrDefault("sort_scope",
+
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SORT_SCOPE,
+  carbonProperty.getProperty(CarbonCommonConstants.LOAD_SORT_SCOPE,
+CarbonCommonConstants.LOAD_SORT_SCOPE_DEFAULT
+
+optionsFinal.put("batch_sort_size_inmb", 
tableProperties.getOrDefault("batch_sort_size_inmb",
+  
carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_BATCH_SORT_SIZE_INMB,
+
carbonProperty.getProperty(CarbonCommonConstants.LOAD_BATCH_SORT_SIZE_INMB,
+  CarbonCommonConstants.LOAD_BATCH_SORT_SIZE_INMB_DEFAULT
+
+optionsFinal.put("global_sort_partitions", 
tableProperties.getOrDefault("global_sort_partitions",
--- End diff --

Same as batch sort size I think.


---


[GitHub] carbondata issue #1336: [WIP][CARBONDATA-1425] Inappropriate Exception displ...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1336
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/584/



---


[GitHub] carbondata pull request #1332: [CARBONDATA-1456]Regenerate cached hive resul...

2017-09-06 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1332#discussion_r137439030
  
--- Diff: 
integration/spark-common-cluster-test/src/test/scala/org/apache/spark/sql/common/util/QueryTest.scala
 ---
@@ -84,22 +82,34 @@ class QueryTest extends PlanTest with Suite {
 checkAnswer(df, expectedAnswer.collect())
   }
 
-  protected def checkAnswer(carbon: String, hive: String, 
uniqueIdentifier:String): Unit = {
-val path = TestQueryExecutor.hiveresultpath + "/"+uniqueIdentifier
+  protected def checkAnswer(carbon: String, hive: String, 
uniqueIdentifier: String): Unit = {
+val path = TestQueryExecutor.hiveresultpath + "/" + uniqueIdentifier
 if (FileFactory.isFileExist(path, FileFactory.getFileType(path))) {
-  val objinp = new 
ObjectInputStream(FileFactory.getDataInputStream(path, 
FileFactory.getFileType(path)))
+  val objinp = new ObjectInputStream(FileFactory
+.getDataInputStream(path, FileFactory.getFileType(path)))
   val rows = objinp.readObject().asInstanceOf[Array[Row]]
   objinp.close()
-  checkAnswer(sql(carbon), rows)
+  QueryTest.checkAnswer(sql(carbon), rows) match {
+case Some(errorMessage) => {
+  FileFactory.deleteFile(path, FileFactory.getFileType(path))
+  writeAndCheckAnswer(carbon, hive, path)
--- End diff --

Doesn't it go into an endless loop when the test fails?


---


[GitHub] carbondata issue #1310: [CARBONDATA-1442] Refactored Partition-Guide.md

2017-09-06 Thread Ayushi93
Github user Ayushi93 commented on the issue:

https://github.com/apache/carbondata/pull/1310
  
I shall be doing it today by EOD.

On 06-Sep-2017 7:33 PM, "Jacky Li"  wrote:

It seems this PR contains some commits from master. Can you rebase and
squash your commits? @jatin9896




---


[GitHub] carbondata issue #1336: [WIP][CARBONDATA-1425] Inappropriate Exception displ...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1336
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/583/



---


[GitHub] carbondata pull request #1283: [WIP] Add carbon encoding example

2017-09-06 Thread chenerlu
Github user chenerlu closed the pull request at:

https://github.com/apache/carbondata/pull/1283


---


[GitHub] carbondata issue #1337: [CARBONDATA-1445] Fix update fail when carbon.update...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1337
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/582/



---


[GitHub] carbondata issue #1337: [CARBONDATA-1445] Fix update fail when carbon.update...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1337
  
retest this please


---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/581/



---


[GitHub] carbondata issue #1337: [CARBONDATA-1445] Fix update fail when carbon.update...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1337
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/580/



---


[GitHub] carbondata issue #1336: [WIP][CARBONDATA-1425] Inappropriate Exception displ...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1336
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/579/



---


[GitHub] carbondata issue #1323: [CARBONDATA-1413]Validate for invalid range info in ...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1323
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/578/



---


[GitHub] carbondata issue #1332: [CARBONDATA-1456]Regenerate cached hive results if c...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1332
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/577/



---


[GitHub] carbondata issue #1293: [CARBONDATA-1417]Added cluster tests for IUD, batch ...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1293
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/576/



---


[GitHub] carbondata issue #1319: [CARBONDATA-1420] Fixed bug for creation of partitio...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1319
  
@geetikagupta16 test is failing, please check


---


[GitHub] carbondata issue #1298: [CARBONDATA-1430] Resolved Split Partition Bug When ...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1298
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/575/



---


[GitHub] carbondata issue #1298: [CARBONDATA-1430] Resolved Split Partition Bug When ...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1298
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/574/



---


[GitHub] carbondata issue #1293: [CARBONDATA-1417]Added cluster tests for IUD, batch ...

2017-09-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1293
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3435/



---


[GitHub] carbondata pull request #1337: Fix update fail when carbon.update.persist.en...

2017-09-06 Thread ravipesala
GitHub user ravipesala opened a pull request:

https://github.com/apache/carbondata/pull/1337

Fix update fail when carbon.update.persist.enable'='false'

The UDF for getting the segment id while loading the data is not handled, so 
when the RDD needs to be re-executed (when persist enable is false) it does 
not get the tupleId from carbon.
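
A sketch of the failure mode (the stub registration is quoted from 
CARBONDATA-1445; the positional split is an assumption about what 
getRequiredFieldFromTID does):

```scala
// per CARBONDATA-1445, the UDF is registered as a stub returning ""
sparkSession.udf.register("getTupleId", () => "")

// with persist disabled, re-executing the RDD goes through the stub,
// so the tuple id is blank and positional field extraction fails:
val tupleId = ""                       // what the stub yields
val field = tupleId.split("/")(1)      // ArrayIndexOutOfBoundsException
```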

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ravipesala/incubator-carbondata update-fail

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1337.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1337


commit df368b963d38b25a49d12dbbc62b660acae31154
Author: Ravindra Pesala 
Date:   2017-09-06T15:12:34Z

Fix update fail when carbon.update.persist.enable'='false'




---


[GitHub] carbondata issue #1332: [CARBONDATA-1456]Regenerate cached hive results if c...

2017-09-06 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/1332
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3434/



---


[GitHub] carbondata issue #1320: [CARBONDATA-1446] Fixed Bug for error message on inv...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1320
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/573/



---


[GitHub] carbondata pull request #1336: [CARBONDATA-1425] Inappropriate Exception dis...

2017-09-06 Thread mayunSaicmotor
GitHub user mayunSaicmotor opened a pull request:

https://github.com/apache/carbondata/pull/1336

[CARBONDATA-1425] Inappropriate Exception displays while creating a new 
partition with incorrect partition type

Change the error message shown when the range info data does not match the 
partition field's data type.
The new message is:

"Data in range info must be the same type with the partition field's type"
  

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mayunSaicmotor/incubator-carbondata 
CARBONDATA-1425

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1336.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1336


commit c3d0debdaff3edeb67b6fb1f01f6dbfdda9a9b8a
Author: mayun 
Date:   2017-09-06T14:52:39Z

fix for  carbondata 1425




---


[GitHub] carbondata issue #427: [CARBONDATA-429]reduce the no of of io operation bein...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/427
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/572/



---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/571/



---


[GitHub] carbondata pull request #1307: [CARBONDATA-1433] Added Vectorized Reader for...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1307#discussion_r137284215
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/CarbonVectorizedRecordReader.java
 ---
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.cache.dictionary.Dictionary;
+import org.apache.carbondata.core.datastore.block.TableBlockInfo;
+import 
org.apache.carbondata.core.keygenerator.directdictionary.DirectDictionaryGenerator;
+import 
org.apache.carbondata.core.keygenerator.directdictionary.DirectDictionaryKeyGeneratorFactory;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.encoder.Encoding;
+import org.apache.carbondata.core.scan.executor.QueryExecutor;
+import org.apache.carbondata.core.scan.executor.QueryExecutorFactory;
+import 
org.apache.carbondata.core.scan.executor.exception.QueryExecutionException;
+import org.apache.carbondata.core.scan.model.QueryDimension;
+import org.apache.carbondata.core.scan.model.QueryMeasure;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import 
org.apache.carbondata.core.scan.result.iterator.AbstractDetailQueryResultIterator;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnarBatch;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.hadoop.AbstractRecordReader;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.CarbonMultiBlockSplit;
+
+import org.apache.hadoop.mapreduce.InputSplit;
+import org.apache.hadoop.mapreduce.TaskAttemptContext;
+import org.apache.spark.memory.MemoryMode;
+import org.apache.spark.sql.execution.vectorized.ColumnarBatch;
+import org.apache.spark.sql.types.DecimalType;
+import org.apache.spark.sql.types.StructField;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * A specialized RecordReader that reads into InternalRows or 
ColumnarBatches directly using the
+ * carbondata column APIs and fills the data directly into columns.
+ */
+class CarbonVectorizedRecordReader extends AbstractRecordReader {
--- End diff --

Should not copy code from the spark2 integration module. I think the correct 
way is to move the vector reader outside of the spark integration and share it 
with all integration modules.


---


[GitHub] carbondata issue #1332: [CARBONDATA-1456]Regenerate cached hive results if c...

2017-09-06 Thread sraghunandan
Github user sraghunandan commented on the issue:

https://github.com/apache/carbondata/pull/1332
  
ok to test


---


[GitHub] carbondata pull request #1307: [CARBONDATA-1433] Added Vectorized Reader for...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1307#discussion_r137283427
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/CarbonTypeUtil.java
 ---
@@ -0,0 +1,34 @@
+package org.apache.carbondata.presto;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+
+import org.apache.spark.sql.types.DataTypes;
--- End diff --

Should not depend on Spark in the presto integration module


---


[GitHub] carbondata pull request #1331: [CARBONDATA-1326][WIP]: Normal/Low priority f...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1331#discussion_r137282962
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/writer/CarbonDataWriterVo.java
 ---
@@ -155,14 +155,14 @@ public void setIsComplexType(boolean[] isComplexType) 
{
* @return the noDictionaryCount
*/
   public int getNoDictionaryCount() {
-return NoDictionaryCount;
+return noDictionaryCount;
   }
 
   /**
-   * @param noDictionaryCount the noDictionaryCount to set
+   * @param no_DictionaryCount the noDictionaryCount to set
*/
-  public void setNoDictionaryCount(int noDictionaryCount) {
-NoDictionaryCount = noDictionaryCount;
+  public void setNoDictionaryCount(int no_DictionaryCount) {
--- End diff --

No need to change this


---


[GitHub] carbondata pull request #1331: [CARBONDATA-1326][WIP]: Normal/Low priority f...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1331#discussion_r137282769
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
 ---
@@ -21,13 +21,7 @@
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.List;
-import java.util.concurrent.Callable;
-import java.util.concurrent.ExecutionException;
-import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Executors;
-import java.util.concurrent.Future;
-import java.util.concurrent.Semaphore;
-import java.util.concurrent.TimeUnit;
+import java.util.concurrent.*;
--- End diff --

do not use *


---


[GitHub] carbondata issue #1293: [CARBONDATA-1417]Added cluster tests for IUD, batch ...

2017-09-06 Thread sraghunandan
Github user sraghunandan commented on the issue:

https://github.com/apache/carbondata/pull/1293
  
ok to test


---


[GitHub] carbondata pull request #1331: [CARBONDATA-1326][WIP]: Normal/Low priority f...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1331#discussion_r137282685
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/sortandgroupby/sortdata/SortParameters.java
 ---
@@ -34,6 +34,8 @@
 
 public class SortParameters implements Serializable {
 
+  private static final long serialVersionUID = 0L;
--- End diff --

serialVersionUID should not be 0L
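
For illustration, the usual form with an explicit, deliberately chosen UID 
(the value is arbitrary; the point is to avoid 0L, which reads as an unset 
placeholder):

```scala
@SerialVersionUID(1L)
class SortParametersSketch extends Serializable
```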


---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/570/



---


[GitHub] carbondata pull request #1311: [CARBONDATA-1439] Wrong Error message shown f...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1311#discussion_r137280154
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/converter/impl/RowConverterImpl.java
 ---
@@ -158,8 +158,10 @@ public CarbonRow convert(CarbonRow row) throws 
CarbonDataLoadingException {
   if (!logHolder.isLogged() && logHolder.isBadRecordNotAdded()) {
 badRecordLogger.addBadRecordsToBuilder(copy.getData(), 
logHolder.getReason());
 if (badRecordLogger.isDataLoadFail()) {
-  String error = "Data load failed due to bad record: " + 
logHolder.getReason() +
--- End diff --

What is wrong with this message?


---


[GitHub] carbondata pull request #1320: [CARBONDATA-1446] Fixed Bug for error message...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1320#discussion_r137279394
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/spark/util/PartitionUtils.scala
 ---
@@ -80,37 +80,43 @@ object PartitionUtils {
   dateFormatter: SimpleDateFormat): Unit = {
 val columnDataType = 
partitionInfo.getColumnSchemaList.get(0).getDataType
 val index = partitionIdList.indexOf(partitionId)
-if (partitionInfo.getPartitionType == PartitionType.RANGE) {
-  val rangeInfo = partitionInfo.getRangeInfo.asScala.toList
-  val newRangeInfo = partitionId match {
-case 0 => rangeInfo ++ splitInfo
-case _ => rangeInfo.take(index - 1) ++ splitInfo ++
-  rangeInfo.takeRight(rangeInfo.size - index)
+if (index < 0) {
+  throw new IllegalArgumentException("Invalid Partition Id " + 
partitionId +
+"\n Use show partitions table_name to get the list of valid 
partitions")
+}
+else {
--- End diff --

else clause is not needed
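
The point in miniature: once the if-branch throws unconditionally, the normal 
path can follow un-nested (sketch):

```scala
def validate(index: Int): Unit = {
  if (index < 0) {
    throw new IllegalArgumentException("Invalid Partition Id")
  }
  // the throw already exits, so continue here without an else wrapper
}
```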


---


[GitHub] carbondata issue #1298: [CARBONDATA-1430] Resolved Split Partition Bug When ...

2017-09-06 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1298
  
LGTM


---


[GitHub] carbondata pull request #1298: [CARBONDATA-1430] Resolved Split Partition Bu...

2017-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1298


---


[GitHub] carbondata issue #1310: [CARBONDATA-1442] Refactored Partition-Guide.md

2017-09-06 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1310
  
It seems this PR contains some commits from master. Can you rebase and 
squash your commits? @jatin9896 


---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/569/



---


[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1297#discussion_r137273402
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/SafeFixLengthColumnPage.java
 ---
@@ -165,9 +165,28 @@ public double getDouble(int rowId) {
 return doubleData[rowId];
   }
 
-  @Override
-  public BigDecimal getDecimal(int rowId) {
-throw new UnsupportedOperationException("invalid data type: " + 
dataType);
+  @Override public BigDecimal getDecimal(int rowId) {
--- End diff --

I think adding the converter inside the column page makes it a bit complex. I 
think there are two ways to make the code clean:
1. Store a BigDecimal array in the ColumnPage and do the conversion to int or 
long when it comes to the encoding part, by implementing a ColumnPageValueConverter.
2. Create a DecimalColumnPage class and add the conversion logic in 
DecimalColumnPage. This logic is extra logic for decimal only. If you do it 
like this, we can consider renaming `FixLengthColumnPage` to 
`PrimitiveColumnPage` and `VarLengthColumnPage` to `StringColumnPage`, with the 
3rd one being `DecimalColumnPage`. The responsibility of each class will be 
clearer.
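
Option 2 as a skeleton, using exactly the names the review proposes (a shape 
sketch only, not CarbonData code):

```scala
abstract class ColumnPageSketch
class PrimitiveColumnPage extends ColumnPageSketch // was FixLengthColumnPage
class StringColumnPage    extends ColumnPageSketch // was VarLengthColumnPage
class DecimalColumnPage   extends ColumnPageSketch // holds the decimal-only
                                                   // conversion logic
```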


---


[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1297#discussion_r137270799
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/DefaultEncodingStrategy.java
 ---
@@ -130,7 +146,8 @@ private static DataType fitLongMinMax(long max, long 
min) {
 }
   }
 
-  private static DataType fitMinMax(DataType dataType, Object max, Object 
min) {
+  private static DataType fitMinMax(DataType dataType, Object max, Object 
min,
+  DecimalConverterFactory.DecimalConverterType decimalConverterType) {
--- End diff --

I think it is not good to pass `decimalConverterType` into many functions. It 
makes the code complex. It is better to think of a way to encapsulate it.
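
One possible shape for that encapsulation, with hypothetical names (nothing 
here is CarbonData API):

```scala
// bundle the per-page inputs once instead of threading
// decimalConverterType through every selection function
final case class EncodingContext[S](stats: S, decimalConverterType: String)

// selection functions take one context instead of extra loose parameters
def fitMinMaxSketch[S](ctx: EncodingContext[S]): String =
  ctx.decimalConverterType
```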


---


[GitHub] carbondata pull request #1297: [CARBONDATA-1429] Add a value based compressi...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1297#discussion_r137270300
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/DefaultEncodingStrategy.java
 ---
@@ -104,18 +107,31 @@ private ColumnPageEncoder 
createEncoderForMeasure(ColumnPage columnPage) {
   case SHORT:
   case INT:
   case LONG:
-return 
selectCodecByAlgorithmForIntegral(stats).createEncoder(null);
+return selectCodecByAlgorithmForIntegral(stats,
+
DecimalConverterFactory.DecimalConverterType.DECIMAL_LONG).createEncoder(null);
+  case DECIMAL:
+return createEncoderForDecimalDataTypeMeasure(columnPage);
--- End diff --

Rename it like the others: `selectCodecByAlgorithmForDecimal`. 


---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/568/



---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/567/



---


[GitHub] carbondata pull request #1333: [CARBONDATA-1455]Disable coverage instrumenta...

2017-09-06 Thread sraghunandan
Github user sraghunandan closed the pull request at:

https://github.com/apache/carbondata/pull/1333


---


[GitHub] carbondata issue #1333: [CARBONDATA-1455]Disable coverage instrumentation of...

2017-09-06 Thread sraghunandan
Github user sraghunandan commented on the issue:

https://github.com/apache/carbondata/pull/1333
  
already merged as part of PR #1328 


---


[GitHub] carbondata pull request #1299: [CARBONDATA-1426] Resolved Split Partition Bu...

2017-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1299


---


[GitHub] carbondata issue #1299: [CARBONDATA-1426] Resolved Split Partition Bug When ...

2017-09-06 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1299
  
LGTM


---


[GitHub] carbondata pull request #1308: [CARBONDATA-1440]Fix coverity issues

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1308#discussion_r137256155
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/UnsafeFixLengthColumnPage.java
 ---
@@ -223,39 +223,39 @@ public BigDecimal getDecimal(int rowId) {
   @Override
   public int[] getIntPage() {
 int[] data = new int[getPageSize()];
-for (int i = 0; i < data.length; i++) {
+for (long i = 0; i < data.length; i++) {
   long offset = i << intBits;
-  data[i] = CarbonUnsafe.getUnsafe().getInt(baseAddress, baseOffset + 
offset);
+  data[(int)i] = CarbonUnsafe.getUnsafe().getInt(baseAddress, 
baseOffset + offset);
 }
 return data;
   }
 
   @Override
   public long[] getLongPage() {
 long[] data = new long[getPageSize()];
-for (int i = 0; i < data.length; i++) {
+for (long i = 0; i < data.length; i++) {
   long offset = i << longBits;
-  data[i] = CarbonUnsafe.getUnsafe().getLong(baseAddress, baseOffset + 
offset);
+  data[(int)i] = CarbonUnsafe.getUnsafe().getLong(baseAddress, 
baseOffset + offset);
 }
 return data;
   }
 
   @Override
   public float[] getFloatPage() {
 float[] data = new float[getPageSize()];
-for (int i = 0; i < data.length; i++) {
+for (long i = 0; i < data.length; i++) {
--- End diff --

Why is this modification needed?
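
PR #1308's title says it fixes Coverity issues, and this pattern matches the 
usual "shift computed in 32 bits, then widened to long" finding. A worked 
sketch with hypothetical values:

```scala
val intBits = 2                             // 4-byte elements: offset = i * 4
val i: Int = 0x40000000                     // hypothetical huge row count
val wrapped: Long = (i << intBits).toLong   // Int shift wraps to 0 first
val correct: Long = i.toLong << intBits     // widen, then shift: 4294967296L
```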


---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/566/



---


[GitHub] carbondata issue #1322: [CARBONDATA-1450] Support timestamp more than 68 yea...

2017-09-06 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1322
  
Please refer to #1265, which uses adaptive encoding for timestamp and date 
columns.
I think it is better to solve the 68-year issue after #1265 is merged.


---


[GitHub] carbondata pull request #1334: [CARBONDATA-1451] Removing configuration for ...

2017-09-06 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1334#discussion_r137255098
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonV3DataFormatConstants.java
 ---
@@ -61,24 +61,8 @@
   short NUMBER_OF_COLUMN_TO_READ_IN_IO_MIN = 1;
 
   /**
-   * number of rows per blocklet column page
-   */
-  @CarbonProperty
-  String NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE = 
"number.of.rows.per.blocklet.column.page";
-
-  /**
* number of rows per blocklet column page default value
*/
-  String NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_DEFAULT = "32000";
-
-  /**
-   * number of rows per blocklet column page max value
-   */
-  short NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_MAX = 32000;
-
-  /**
-   * number of rows per blocklet column page min value
-   */
-  short NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_MIN = 8000;
+  short NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_DEFAULT = 32000;
--- End diff --

I do not think it is configurable; better to remove it from this file


---


[jira] [Resolved] (CARBONDATA-1453) Optimize the cluster test case ID and make it more general

2017-09-06 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-1453.
--
   Resolution: Fixed
Fix Version/s: 1.2.0

> Optimize the cluster test case ID and make it more general
> --
>
> Key: CARBONDATA-1453
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1453
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Raghunandan S
> Fix For: 1.2.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata pull request #1328: [CARBONDATA-1453]Optimize test case IDs

2017-09-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/carbondata/pull/1328


---


[GitHub] carbondata issue #1328: [CARBONDATA-1453]Optimize test case IDs

2017-09-06 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/1328
  
LGTM


---


[GitHub] carbondata issue #1335: [CARBONDATA-1452] Issue with loading timestamp data ...

2017-09-06 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1335
  
@dhatchayani hi, can you provide the detailed error message? I cannot 
understand why we should modify this.


---


[GitHub] carbondata issue #1332: [CARBONDATA-1456]Regenerate cached hive results if c...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1332
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/565/



---


[GitHub] carbondata issue #1330: [CARBONDATA-1408]:Data loading with globalSort is fa...

2017-09-06 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/1330
  
Hi @kushalsaha, I cannot understand the problem and the modification. Can you 
describe the problem in the corresponding issue?


---


[GitHub] carbondata issue #1335: [CARBONDATA-1452] Issue with loading timestamp data ...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1335
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/564/



---


[jira] [Updated] (CARBONDATA-45) Support MAP type

2017-09-06 Thread Sharanabasappa G Keriwaddi (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharanabasappa G Keriwaddi updated CARBONDATA-45:
-
Fix Version/s: (was: NONE)
   1.3.0
  Component/s: sql
   core

> Support MAP type
> 
>
> Key: CARBONDATA-45
> URL: https://issues.apache.org/jira/browse/CARBONDATA-45
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, sql
>Reporter: cen yuhai
>Assignee: Venkata Ramana G
> Fix For: 1.3.0
>
>
> We have many tables that use the map type, and the common file formats ORC 
> and Parquet support it. So can CarbonData support the map type?
> For SQL like "select map['id'] from table", ORC will read all keys of the 
> map. Can we read just the key 'id'?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-45) Support MAP type

2017-09-06 Thread Sharanabasappa G Keriwaddi (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharanabasappa G Keriwaddi updated CARBONDATA-45:
-
Due Date: 20/Sep/17

> Support MAP type
> 
>
> Key: CARBONDATA-45
> URL: https://issues.apache.org/jira/browse/CARBONDATA-45
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: cen yuhai
>Assignee: Venkata Ramana G
> Fix For: NONE
>
>
> We have many tables that use the map type, and the common file formats ORC 
> and Parquet support it. So can CarbonData support the map type?
> For SQL like "select map['id'] from table", ORC will read all keys of the 
> map. Can we read just the key 'id'?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (CARBONDATA-45) Support MAP type

2017-09-06 Thread Sharanabasappa G Keriwaddi (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharanabasappa G Keriwaddi reassigned CARBONDATA-45:


Assignee: Venkata Ramana G  (was: Vimal Das Kammath)

This Jira will be used as an umbrella jira for supporting the Map data type.

> Support MAP type
> 
>
> Key: CARBONDATA-45
> URL: https://issues.apache.org/jira/browse/CARBONDATA-45
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: cen yuhai
>Assignee: Venkata Ramana G
> Fix For: NONE
>
>
> We have many tables that use the map type, and the common file formats ORC 
> and Parquet support it. So can CarbonData support the map type?
> For SQL like "select map['id'] from table", ORC will read all keys of the 
> map. Can we read just the key 'id'?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1457) Stabilize Struct DataType Support

2017-09-06 Thread Sharanabasappa G Keriwaddi (JIRA)
Sharanabasappa G Keriwaddi created CARBONDATA-1457:
--

 Summary: Stabilize Struct DataType Support
 Key: CARBONDATA-1457
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1457
 Project: CarbonData
  Issue Type: New Feature
  Components: core, sql
Affects Versions: 1.3.0
Reporter: Sharanabasappa G Keriwaddi
Assignee: Venkata Ramana G
 Fix For: 1.3.0


Stabilize Struct DataType support. This is an umbrella JIRA to track all 
related tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (CARBONDATA-1445) if 'carbon.update.persist.enable'='false', it will fail to update data

2017-09-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-1445:
---

Assignee: Ravindra Pesala  (was: Ashwini K)

> if 'carbon.update.persist.enable'='false', it will fail to update data 
> ---
>
> Key: CARBONDATA-1445
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1445
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load, spark-integration, sql
>Affects Versions: 1.2.0
> Environment: CarbonData master branch, Spark 2.1.1
>Reporter: Zhichao  Zhang
>Assignee: Ravindra Pesala
>Priority: Minor
>
> When updating data, if 'carbon.update.persist.enable' is set to 'false', the 
> update will fail.
> I debugged the code and found that in the method LoadTable.processData, 
> 'dataFrameWithTupleId' calls the udf 'getTupleId()', which is defined in 
> CarbonEnv.init() as 'sparkSession.udf.register("getTupleId", () => "")'; it 
> returns a blank string to 'CarbonUpdateUtil.getRequiredFieldFromTID', so an 
> ArrayIndexOutOfBoundsException occurs.
> *the plans (logical and physical) for dataFrameWithTupleId :*
> == Parsed Logical Plan ==
> 'Project [unresolvedalias('stringField3, None), unresolvedalias('intField, 
> None), unresolvedalias('longField, None), unresolvedalias('int2Field, None), 
> unresolvedalias('stringfield1-updatedColumn, None), 
> unresolvedalias('stringfield2-updatedColumn, None), UDF('tupleId) AS 
> segId#286]
> +- Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> UDF:getTupleId() AS tupleId#262, concat(stringField1#111, _test) AS 
> stringfield1-updatedColumn#263, concat(stringField2#112, _test) AS 
> stringfield2-updatedColumn#264]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ]
> == Analyzed Logical Plan ==
> stringField3: string, intField: int, longField: bigint, int2Field: int, 
> stringfield1-updatedColumn: string, stringfield2-updatedColumn: string, 
> segId: string
> Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> stringfield1-updatedColumn#263, stringfield2-updatedColumn#264, 
> UDF(tupleId#262) AS segId#286]
> +- Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> UDF:getTupleId() AS tupleId#262, concat(stringField1#111, _test) AS 
> stringfield1-updatedColumn#263, concat(stringField2#112, _test) AS 
> stringfield2-updatedColumn#264]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ]
> == Optimized Logical Plan ==
> CarbonDictionaryCatalystDecoder [CarbonDecoderRelation(Map(int2Field#116 -> 
> int2Field#116, longField#115L -> longField#115L, stringField2#112 -> 
> stringField2#112, stringField1#111 -> stringField1#111, stringField3#113 -> 
> stringField3#113, intField#114 -> 
> intField#114),CarbonDatasourceHadoopRelation [ Database name :default, Table 
> name :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ])], 
> ExcludeProfile(ArrayBuffer(stringField2#112, stringField1#111)), 
> CarbonAliasDecoderRelation(), true
> +- Project [stringField3#113, intField#114, longField#115, int2Field#116, 
> concat(stringField1#111, _test) AS stringfield1-updatedColumn#263, 
> concat(stringField2#112, _test) AS stringfield2-updatedColumn#264, 
> UDF(UDF:getTupleId()) AS segId#286]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ]
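
To make the reported failure concrete, here is a minimal sketch of the parsing 
step that breaks, with simplified hypothetical names; the real logic lives in 
CarbonUpdateUtil.getRequiredFieldFromTID, and the '/'-separated tuple-ID layout 
shown is illustrative:

    // A tuple ID is a '/'-separated locator for a row; the placeholder UDF
    // registered in CarbonEnv.init() returns "" instead of a real ID.
    def fieldFromTID(tid: String, index: Int): String = tid.split("/")(index)

    fieldFromTID("seg0/part0/blocklet0/page0/row0", 1)  // returns "part0"
    // "".split("/") yields Array(""), so any index above 0 throws
    // ArrayIndexOutOfBoundsException -- the failure described above.
    fieldFromTID("", 1)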

[GitHub] carbondata issue #1334: [CARBONDATA-1451] Removing configuration for number_...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1334
  
SDV Build Fail , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/563/



---


[jira] [Commented] (CARBONDATA-1445) if 'carbon.update.persist.enable'='false', it will fail to update data

2017-09-06 Thread Ashwini K (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155211#comment-16155211
 ] 

Ashwini K commented on CARBONDATA-1445:
---

This issue is fixed as part of JIRA#1293 and PR 
https://github.com/apache/carbondata/pull/1161/

> if 'carbon.update.persist.enable'='false', it will fail to update data 
> ---
>
> Key: CARBONDATA-1445
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1445
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load, spark-integration, sql
>Affects Versions: 1.2.0
> Environment: CarbonData master branch, Spark 2.1.1
>Reporter: Zhichao  Zhang
>Assignee: Ashwini K
>Priority: Minor
>
> When updating data, if 'carbon.update.persist.enable' is set to 'false', the 
> update will fail.
> I debugged the code and found that in the method LoadTable.processData, 
> 'dataFrameWithTupleId' calls the udf 'getTupleId()', which is defined in 
> CarbonEnv.init() as 'sparkSession.udf.register("getTupleId", () => "")'; it 
> returns a blank string to 'CarbonUpdateUtil.getRequiredFieldFromTID', so an 
> ArrayIndexOutOfBoundsException occurs.
> *the plans (logical and physical) for dataFrameWithTupleId :*
> == Parsed Logical Plan ==
> 'Project [unresolvedalias('stringField3, None), unresolvedalias('intField, 
> None), unresolvedalias('longField, None), unresolvedalias('int2Field, None), 
> unresolvedalias('stringfield1-updatedColumn, None), 
> unresolvedalias('stringfield2-updatedColumn, None), UDF('tupleId) AS 
> segId#286]
> +- Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> UDF:getTupleId() AS tupleId#262, concat(stringField1#111, _test) AS 
> stringfield1-updatedColumn#263, concat(stringField2#112, _test) AS 
> stringfield2-updatedColumn#264]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ]
> == Analyzed Logical Plan ==
> stringField3: string, intField: int, longField: bigint, int2Field: int, 
> stringfield1-updatedColumn: string, stringfield2-updatedColumn: string, 
> segId: string
> Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> stringfield1-updatedColumn#263, stringfield2-updatedColumn#264, 
> UDF(tupleId#262) AS segId#286]
> +- Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> UDF:getTupleId() AS tupleId#262, concat(stringField1#111, _test) AS 
> stringfield1-updatedColumn#263, concat(stringField2#112, _test) AS 
> stringfield2-updatedColumn#264]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ]
> == Optimized Logical Plan ==
> CarbonDictionaryCatalystDecoder [CarbonDecoderRelation(Map(int2Field#116 -> 
> int2Field#116, longField#115L -> longField#115L, stringField2#112 -> 
> stringField2#112, stringField1#111 -> stringField1#111, stringField3#113 -> 
> stringField3#113, intField#114 -> 
> intField#114),CarbonDatasourceHadoopRelation [ Database name :default, Table 
> name :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ])], 
> ExcludeProfile(ArrayBuffer(stringField2#112, stringField1#111)), 
> CarbonAliasDecoderRelation(), true
> +- Project [stringField3#113, intField#114, longField#115, int2Field#116, 
> concat(stringField1#111, _test) AS stringfield1-updatedColumn#263, 
> concat(stringField2#112, _test) AS stringfield2-updatedColumn#264, 
> UDF(UDF:getTupleId()) AS segId#286]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]

[GitHub] carbondata issue #1319: [CARBONDATA-1420] Fixed bug for creation of partitio...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1319
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/562/



---


[GitHub] carbondata issue #1319: [CARBONDATA-1420] Fixed bug for creation of partitio...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1319
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/561/



---


[GitHub] carbondata issue #1333: [CARBONDATA-1455]Disable coverage instrumentation of...

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1333
  
SDV Build Success , Please check CI 
http://144.76.159.231:8080/job/ApacheSDVTests/560/



---


[jira] [Created] (CARBONDATA-1456) Regenerate cached hive results if cluster testcases fail

2017-09-06 Thread Raghunandan S (JIRA)
Raghunandan S created CARBONDATA-1456:
-

 Summary: Regenerate cached hive results if cluster testcases fail
 Key: CARBONDATA-1456
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1456
 Project: CarbonData
  Issue Type: Bug
Reporter: Raghunandan S


Hive results are cached to speed up subsequent test runs, but sometimes the 
cached result has to be regenerated.
Reasons:
1. The test case may have changed
2. The input data may have changed
3. The environment may have changed
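
A minimal sketch of this regenerate-on-failure idea, with hypothetical names 
and plain-text storage rather than the project's actual test utilities:

    import java.io.{File, PrintWriter}
    import scala.io.Source

    // Compare the query result against a cached expected result; if the cache
    // is missing or no longer matches (test case, input data, or environment
    // changed), regenerate it from the reference engine and compare once more.
    def checkWithCache(cache: File, runQuery: () => Seq[String],
                       runReference: () => Seq[String]): Boolean = {
      val actual = runQuery()
      if (cache.exists()) {
        val expected = Source.fromFile(cache).getLines().toSeq
        if (expected == actual) return true
        cache.delete()  // stale cache: fall through and rebuild it
      }
      val fresh = runReference()
      val out = new PrintWriter(cache)
      try fresh.foreach(out.println) finally out.close()
      actual == fresh
    }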



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] carbondata issue #1296: [CARBONDATA-649] fix for update with rand function

2017-09-06 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/1296
  
@ashwini-krishnakumar Please squash all commits into a single commit; there is 
an issue while merging to master.


---


[GitHub] carbondata issue #1281: [CARBONDATA-1326] Fixed findbug issue and univocity-...

2017-09-06 Thread pawanmalwal
Github user pawanmalwal commented on the issue:

https://github.com/apache/carbondata/pull/1281
  
Yes, that's right. This PR also has a code change to avoid one more string 
object creation in the same method, and another change to update the 
univocity-parsers jar version to 2.2.1 in processing/pom.xml. Please review.
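
Since the changed method is not quoted here, a minimal sketch of the kind of 
saving described, with hypothetical names:

    // Before: wraps a value that is already a String, allocating one more object.
    def keyOld(parts: Array[String]): String = new String(parts.mkString("/"))

    // After: mkString already returns a fresh String; the extra copy is dropped.
    def key(parts: Array[String]): String = parts.mkString("/")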


---


[jira] [Assigned] (CARBONDATA-1445) if 'carbon.update.persist.enable'='false', it will fail to update data

2017-09-06 Thread Ashwini K (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashwini K reassigned CARBONDATA-1445:
-

Assignee: Ashwini K

> if 'carbon.update.persist.enable'='false', it will fail to update data 
> ---
>
> Key: CARBONDATA-1445
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1445
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load, spark-integration, sql
>Affects Versions: 1.2.0
> Environment: CarbonData master branch, Spark 2.1.1
>Reporter: Zhichao  Zhang
>Assignee: Ashwini K
>Priority: Minor
>
> When updating data, if 'carbon.update.persist.enable' is set to 'false', the 
> update will fail.
> I debugged the code and found that in the method LoadTable.processData, 
> 'dataFrameWithTupleId' calls the udf 'getTupleId()', which is defined in 
> CarbonEnv.init() as 'sparkSession.udf.register("getTupleId", () => "")'; it 
> returns a blank string to 'CarbonUpdateUtil.getRequiredFieldFromTID', so an 
> ArrayIndexOutOfBoundsException occurs.
> *the plans (logical and physical) for dataFrameWithTupleId :*
> == Parsed Logical Plan ==
> 'Project [unresolvedalias('stringField3, None), unresolvedalias('intField, 
> None), unresolvedalias('longField, None), unresolvedalias('int2Field, None), 
> unresolvedalias('stringfield1-updatedColumn, None), 
> unresolvedalias('stringfield2-updatedColumn, None), UDF('tupleId) AS 
> segId#286]
> +- Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> UDF:getTupleId() AS tupleId#262, concat(stringField1#111, _test) AS 
> stringfield1-updatedColumn#263, concat(stringField2#112, _test) AS 
> stringfield2-updatedColumn#264]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ]
> == Analyzed Logical Plan ==
> stringField3: string, intField: int, longField: bigint, int2Field: int, 
> stringfield1-updatedColumn: string, stringfield2-updatedColumn: string, 
> segId: string
> Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> stringfield1-updatedColumn#263, stringfield2-updatedColumn#264, 
> UDF(tupleId#262) AS segId#286]
> +- Project [stringField3#113, intField#114, longField#115L, int2Field#116, 
> UDF:getTupleId() AS tupleId#262, concat(stringField1#111, _test) AS 
> stringfield1-updatedColumn#263, concat(stringField2#112, _test) AS 
> stringfield2-updatedColumn#264]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ]
> == Optimized Logical Plan ==
> CarbonDictionaryCatalystDecoder [CarbonDecoderRelation(Map(int2Field#116 -> 
> int2Field#116, longField#115L -> longField#115L, stringField2#112 -> 
> stringField2#112, stringField1#111 -> stringField1#111, stringField3#113 -> 
> stringField3#113, intField#114 -> 
> intField#114),CarbonDatasourceHadoopRelation [ Database name :default, Table 
> name :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ])], 
> ExcludeProfile(ArrayBuffer(stringField2#112, stringField1#111)), 
> CarbonAliasDecoderRelation(), true
> +- Project [stringField3#113, intField#114, longField#115, int2Field#116, 
> concat(stringField1#111, _test) AS stringfield1-updatedColumn#263, 
> concat(stringField2#112, _test) AS stringfield2-updatedColumn#264, 
> UDF(UDF:getTupleId()) AS segId#286]
>+- Filter (isnotnull(stringField3#113) && (stringField3#113 = 1))
>   +- 
> Relation[stringField1#111,stringField2#112,stringField3#113,intField#114,longField#115L,int2Field#116]
>  CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :study_carbondata, Schema 
> :Some(StructType(StructField(stringField1,StringType,true), 
> StructField(stringField2,StringType,true), 
> StructField(stringField3,StringType,true), 
> StructField(intField,IntegerType,true), StructField(longField,LongType,true), 
> StructField(int2Field,IntegerType,true))) ]
