[GitHub] incubator-carbondata pull request #594: [CARBONDATA-701]Fix memory leak issu...

2017-02-15 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/594#discussion_r101460125
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/sort/Sorter.java
 ---
@@ -39,11 +39,13 @@
* Sorts the data of all iterators, this iterators can be
* read parallely depends on implementation.
*
-   * @param iterators array of iterators to read data.
* @return
* @throws CarbonDataLoadingException
*/
-  Iterator[] sort(Iterator[] iterators)
+  Iterator[] sort()
+  throws CarbonDataLoadingException;
+
+  void prepare(Iterator[] iterators)
--- End diff --

It would be better to invoke child.close() before the final merger runs.
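
A minimal sketch of the ordering suggested above, under stated assumptions: `RowSource` and `SorterSketch` are hypothetical stand-ins and not CarbonData classes, and the "runs" are kept in memory for brevity. It only illustrates the sequence: drain the child into sorted runs in `prepare`, close the child so its resources are released, then do the final merge in `sort`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical upstream step ("child") that holds resources until it is closed.
class RowSource implements AutoCloseable {
  private final Iterator<Integer> rows;
  boolean closed = false;

  RowSource(List<Integer> rows) { this.rows = rows.iterator(); }

  boolean hasNext() { return rows.hasNext(); }
  Integer next() { return rows.next(); }

  @Override public void close() { closed = true; }
}

// Hypothetical two-phase sorter: prepare() consumes the input, sort() merges.
class SorterSketch {
  private final List<List<Integer>> sortedRuns = new ArrayList<>();
  private final int runSize;

  SorterSketch(int runSize) { this.runSize = runSize; }

  // Phase 1: read every row from the child into small sorted runs,
  // then close the child before any merging starts.
  void prepare(RowSource child) {
    List<Integer> run = new ArrayList<>();
    while (child.hasNext()) {
      run.add(child.next());
      if (run.size() == runSize) {
        flushRun(run);
        run = new ArrayList<>();
      }
    }
    if (!run.isEmpty()) {
      flushRun(run);
    }
    child.close();
  }

  private void flushRun(List<Integer> run) {
    Collections.sort(run);
    sortedRuns.add(run);
  }

  // Phase 2: final k-way merge over the sorted runs; the child is already closed.
  List<Integer> sort() {
    // Heap entries are {value, runIndex, positionInRun}.
    PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
    for (int r = 0; r < sortedRuns.size(); r++) {
      heap.add(new int[] {sortedRuns.get(r).get(0), r, 0});
    }
    List<Integer> merged = new ArrayList<>();
    while (!heap.isEmpty()) {
      int[] top = heap.poll();
      merged.add(top[0]);
      List<Integer> run = sortedRuns.get(top[1]);
      int nextPos = top[2] + 1;
      if (nextPos < run.size()) {
        heap.add(new int[] {run.get(nextPos), top[1], nextPos});
      }
    }
    return merged;
  }
}
```

Closing the child at the end of `prepare` means nothing upstream stays open while the merge runs, which is the point of the review comment as I read it.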


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #594: [CARBONDATA-701]Fix memory leak issu...

2017-02-15 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/594#discussion_r101432953
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/sortandgroupby/sortdata/IntermediateFileMerger.java
 ---
@@ -116,8 +116,15 @@ public IntermediateFileMerger(SortParameters 
mergerParameters, File[] intermedia
   writeDataTofile(next());
 }
   } else {
+int i = 0;
 while (hasNext()) {
+  i++;
   writeDataTofileWithOutKettle(next());
+  if (i % 1 == 0) {
--- End diff --

OK, I will test for a better default buffer size.




[GitHub] incubator-carbondata pull request #594: [CARBONDATA-701]Fix memory leak issu...

2017-02-15 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/594#discussion_r101436969
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/sortandgroupby/sortdata/IntermediateFileMerger.java
 ---
@@ -116,8 +116,15 @@ public IntermediateFileMerger(SortParameters 
mergerParameters, File[] intermedia
   writeDataTofile(next());
 }
   } else {
+int i = 0;
 while (hasNext()) {
+  i++;
   writeDataTofileWithOutKettle(next());
+  if (i % 1 == 0) {
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #594: [CARBONDATA-701]Fix memory leak issu...

2017-02-15 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/594#discussion_r101436963
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/sortandgroupby/sortdata/SortDataRows.java
 ---
@@ -375,6 +376,9 @@ private void writeDataWithOutKettle(Object[][] 
recordHolderList, int entryCountL
 stream.write((byte) 0);
   }
 }
+if (i % 1 == ) {
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #594: [CARBONDATA-701]Fix memory leak issu...

2017-02-16 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/594#discussion_r101462875
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/sort/Sorter.java
 ---
@@ -39,11 +39,13 @@
* Sorts the data of all iterators, this iterators can be
* read parallely depends on implementation.
*
-   * @param iterators array of iterators to read data.
* @return
* @throws CarbonDataLoadingException
*/
-  Iterator[] sort(Iterator[] iterators)
+  Iterator[] sort()
+  throws CarbonDataLoadingException;
+
+  void prepare(Iterator[] iterators)
--- End diff --

fixed




[GitHub] incubator-carbondata issue #596: [WIP]Test for repository

2017-02-10 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/596
  
Looks good




[GitHub] incubator-carbondata pull request #594: [CARBONDATA-701]Fix memory leak issu...

2017-02-15 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/594#discussion_r101432584
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/sort/impl/ParallelReadMergeSorterImpl.java
 ---
@@ -86,11 +88,10 @@ public void initialize(SortParameters sortParameters) {
 sortParameters.getNoDictionaryDimnesionColumn(), 
sortParameters.isUseKettle());
   }
 
-  @Override
-  public Iterator[] sort(Iterator[] 
iterators)
+  public void prepare(Iterator[] iterators)
--- End diff --

Yes, but an initialize method already exists.




[GitHub] incubator-carbondata issue #539: [CARBONDATA-659]add WhitespaceAround and Pa...

2017-01-19 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/539
  
retest this please




[GitHub] incubator-carbondata pull request #528: [CARBONDATA-617]Fix InsertInto test ...

2017-01-14 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/528#discussion_r96129390
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonMetastore.scala
 ---
@@ -782,7 +782,47 @@ case class CarbonRelation(
 nullable = true)())
   }
 
-  override val output = dimensionsAttr ++ measureAttr
+  override val output = {
+val columns = 
tableMeta.carbonTable.getCreateOrderColumn(tableMeta.carbonTable.getFactTableName)
+  .asScala
+columns.filter(!_.isInvisible).map { column =>
--- End diff --

ok




[GitHub] incubator-carbondata pull request #528: [CARBONDATA-617]Fix InsertInto test ...

2017-01-14 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/528#discussion_r96129389
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/csvreaderstep/CsvInput.java
 ---
@@ -194,28 +196,25 @@ public boolean processRow(StepMetaInterface smi, 
StepDataInterface sdi) throws K
   }
 
   class RddScanCallable implements Callable {
-List<CarbonIterator<String[]>> iterList;
-
-RddScanCallable() {
-  this.iterList = new ArrayList<CarbonIterator<String[]>>(1000);
-}
-
-public void addJavaRddIterator(CarbonIterator<String[]> iter) {
-  this.iterList.add(iter);
-}
-
-@Override
-public Void call() throws Exception {
-  StandardLogService.setThreadName(("PROCESS_DataFrame_PARTITIONS"),
-  Thread.currentThread().getName());
+@Override public Void call() throws Exception {
+  StandardLogService
+  .setThreadName(("PROCESS_DataFrame_PARTITIONS"), 
Thread.currentThread().getName());
   try {
 String[] values = null;
-for (CarbonIterator<String[]> iter: iterList) {
-  iter.initialize();
-  while (iter.hasNext()) {
-values = iter.next();
-synchronized (putRowLock) {
-  putRow(data.outputRowMeta, values);
+boolean hasNext = true;
+CarbonIterator<String[]> iter;
+boolean isInitialized = false;
+while (hasNext) {
+  iter = getRddIterator(isInitialized);
--- End diff --

ok




[GitHub] incubator-carbondata issue #537: fix unapproved licenses

2017-01-16 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/537
  
@ravipesala 
I have already added the header check to the Java style.




[GitHub] incubator-carbondata issue #537: fix unapproved licenses

2017-01-17 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/537
  
@chenliang613 
Java's license header is the same as Scala's.




[GitHub] incubator-carbondata pull request #537: fix unapproved licenses

2017-01-15 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/537#discussion_r96141120
  
--- Diff: pom.xml ---
@@ -439,15 +439,22 @@
 **/*.csv
 **/*.dictionary
 **/*.ktr
+**/*.rat
 **/_SUCCESS
 **/non-csv
 **/.invisibilityfile
 **/noneCsvFormat.cs
 
**/org.apache.spark.sql.sources.DataSourceRegister
+
**/org.apache.spark.sql.test.TestQueryExecutorRegister
 **/derby.log
 **/meta.lock
 **/loadmetadata.metadata
 **/modifiedTime.mdt
+**/PULL_REQUEST_TEMPLATE.md
+**/dict.txt
+**/dict.txt
--- End diff --

Fixed. Removed one.




[GitHub] incubator-carbondata issue #537: fix unapproved licenses

2017-01-16 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/537
  
@chenliang613 @jackylk @ravipesala 
Currently the license header of the Java files differs from the license header of the
Scala files.

Which one should we choose?
Java file header:
```
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */
```
Scala file header (same as Spark):
```
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
```
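
The two headers quoted above appear to differ only in line wrapping and in the capitalisation of "you". A minimal sketch of how that could be checked, assuming a hypothetical helper `HeaderCompareSketch.normalize` (not part of any style tool): it strips the comment markers, collapses whitespace and lowercases the text, so that two headers with the same wording compare equal.

```java
import java.util.Locale;

// Hypothetical helper for comparing license headers by wording only.
class HeaderCompareSketch {
  // Removes leading comment markers ("/*", " *", " */") on each line,
  // collapses all whitespace runs to one space, and lowercases the result.
  static String normalize(String header) {
    return header
        .replaceAll("(?m)^\\s*/?\\*+/?", " ")
        .replaceAll("\\s+", " ")
        .trim()
        .toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    // Shortened stand-ins for the two headers; the real text is quoted above.
    String javaStyle = "/*\n * Licensed to the ASF under one\n * or more agreements to you.\n */";
    String scalaStyle = "/*\n * Licensed to the ASF under one or more\n * agreements to You.\n */";
    System.out.println(normalize(javaStyle).equals(normalize(scalaStyle)));
  }
}
```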




[GitHub] incubator-carbondata issue #537: fix unapproved licenses

2017-01-16 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/537
  
I agree.




[GitHub] incubator-carbondata issue #217: [CARBONDATA-287]Using multi local directory...

2016-11-23 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/217
  
close this pr




[GitHub] incubator-carbondata pull request #217: [CARBONDATA-287]Using multi local di...

2016-11-23 Thread QiangCai
Github user QiangCai closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/217




[GitHub] incubator-carbondata issue #285: [WIP]Insert into carbon table feature

2016-11-25 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/285
  
@ashokblend please rebase




[GitHub] incubator-carbondata issue #360: [CARBONDATA-462] Clean up carbonTableSchema...

2016-11-28 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/360
  
LGTM




[GitHub] incubator-carbondata issue #278: [CARBONDATA-368]Imporve performance of data...

2016-11-28 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/278
  
http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/682/




[GitHub] incubator-carbondata issue #363: [CARBONDATA-461] clean partitioner in carbo...

2016-11-28 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/363
  
LGTM




[GitHub] incubator-carbondata pull request #278: [CARBONDATA-368]Imporve performance ...

2016-11-28 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/278#discussion_r89950862
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/csvreaderstep/CsvInput.java
 ---
@@ -384,21 +383,75 @@ public boolean processRow(StepMetaInterface smi, 
StepDataInterface sdi) throws K
   
CarbonTimeStatisticsFactory.getLoadStatisticsInstance().recordCsvInputStepTime(
   meta.getPartitionID(), System.currentTimeMillis());
 } else {
-  scanRddIterator();
+  scanRddIterator(numberOfNodes);
 }
 setOutputDone();
 return false;
   }
 
-  private void scanRddIterator() throws RuntimeException {
-Iterator<String[]> iterator = 
RddInputUtils.getAndRemove(rddIteratorKey);
-if (iterator != null) {
-  try{
-while(iterator.hasNext()){
-  putRow(data.outputRowMeta, iterator.next());
+  class RddScanCallable implements Callable {
+List<JavaRddIterator<String[]>> iterList;
+
+RddScanCallable() {
+  this.iterList = new ArrayList<JavaRddIterator<String[]>>(1000);
+}
+
+public void addJavaRddIterator(JavaRddIterator<String[]> iter) {
+  this.iterList.add(iter);
+}
+
+@Override
+public Void call() throws Exception {
+  StandardLogService.setThreadName(("PROCESS_DataFrame_PARTITIONS"),
+  Thread.currentThread().getName());
+  try {
+String[] values = null;
+for (JavaRddIterator<String[]> iter: iterList) {
+  iter.initialize();
+  while (iter.hasNext()) {
+values = iter.next();
+synchronized (putRowLock) {
+  putRow(data.outputRowMeta, values);
+}
+  }
+}
+  } catch (Exception e) {
+LOGGER.error(e, "Scan rdd during data load is terminated due to 
error.");
+throw e;
+  }
+  return null;
+}
+  };
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #278: [CARBONDATA-368]Imporve performance ...

2016-11-28 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/278#discussion_r89950872
  
--- Diff: 
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
 ---
@@ -932,7 +942,8 @@ object CarbonDataRDDFactory {
   loadDataFile()
 }
 val newStatusMap = scala.collection.mutable.Map.empty[String, 
String]
-status.foreach { eachLoadStatus =>
+if (status.nonEmpty) {
+  status.foreach { eachLoadStatus =>
   val state = newStatusMap.get(eachLoadStatus._1)
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #278: [CARBONDATA-368]Imporve performance ...

2016-11-28 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/278#discussion_r89950829
  
--- Diff: 
integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataLoadRDD.scala
 ---
@@ -548,77 +552,53 @@ class DataFrameLoaderRDD[K, V](
   override protected def getPartitions: Array[Partition] = 
firstParent[Row].partitions
 }
 
+class PartitionIterator(partitionIter: 
Iterator[DataLoadPartitionWrap[Row]],
+carbonLoadModel: CarbonLoadModel,
+context: TaskContext) extends  
JavaRddIterator[JavaRddIterator[Array[String]]] {
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #278: [CARBONDATA-368]Imporve performance ...

2016-11-17 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/278#discussion_r88588734
  
--- Diff: 
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
 ---
@@ -164,4 +165,53 @@ object CarbonScalaUtil extends Logging {
 kettleHomePath
   }
 
+  def getString(value: Any,
+  serializationNullFormat: String,
+  delimiterLevel1: String,
+  delimiterLevel2: String,
+  format: SimpleDateFormat,
+  level: Int = 1): String = {
+value == null match {
+  case true => serializationNullFormat
--- End diff --

fixed




[GitHub] incubator-carbondata issue #278: [CARBONDATA-368]Imporve performance of data...

2016-11-20 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/278
  
Rebase done




[GitHub] incubator-carbondata issue #366: [WIP][CARBONDATA-368]Insert into carbon tab...

2016-11-29 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/366
  
please rebase




[GitHub] incubator-carbondata pull request #413: [CARBONDATA-516][SPARK2]fix union is...

2016-12-08 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/413

[CARBONDATA-516][SPARK2]fix union issue in CarbonLateDecoderRule

In Spark 2, the Union class is no longer a subclass of BinaryNode. We need to
fix the union handling in CarbonLateDecoderRule for Spark 2.
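
A toy sketch of why that change matters, under stated assumptions: the classes below are hypothetical Java stand-ins that only mimic the shape described above, not Spark's real plan classes. A rule that only recognises two-child nodes silently skips a Union that holds a list of children, while a rule that walks `children()` handles any arity.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy plan node; every node exposes its children as a list.
abstract class PlanNode {
  abstract List<PlanNode> children();
}

class Leaf extends PlanNode {
  final String name;
  Leaf(String name) { this.name = name; }
  @Override List<PlanNode> children() { return Collections.emptyList(); }
}

// Two-child node, playing the role of a binary node such as a join.
abstract class BinaryPlan extends PlanNode {
  final PlanNode left;
  final PlanNode right;
  BinaryPlan(PlanNode left, PlanNode right) { this.left = left; this.right = right; }
  @Override List<PlanNode> children() { return Arrays.asList(left, right); }
}

class Join extends BinaryPlan {
  Join(PlanNode left, PlanNode right) { super(left, right); }
}

// In this model Union holds a list of children and is NOT a BinaryPlan.
class Union extends PlanNode {
  private final List<PlanNode> kids;
  Union(PlanNode... kids) { this.kids = Arrays.asList(kids); }
  @Override List<PlanNode> children() { return kids; }
}

class UnionRuleSketch {
  // Rule written against the old shape: only two-child nodes are handled.
  static int childrenSeenByBinaryOnlyRule(PlanNode plan) {
    return (plan instanceof BinaryPlan) ? 2 : 0;
  }

  // Rule written against children(): works for any number of children.
  static int childrenSeenByGeneralRule(PlanNode plan) {
    return plan.children().size();
  }
}
```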


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata fixUnionIssue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #413


commit a86342453ab969107944182ead36ac4cf80f74ef
Author: QiangCai <qiang...@qq.com>
Date:   2016-12-08T11:06:33Z

fixUnionIssue






[GitHub] incubator-carbondata pull request #339: [CARBONDATA-429][WIP] Remove unneces...

2016-12-13 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/339#discussion_r92332659
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/cache/dictionary/ReverseDictionaryCache.java
 ---
@@ -167,12 +167,9 @@ private Dictionary getDictionary(
   DictionaryColumnUniqueIdentifier dictionaryColumnUniqueIdentifier)
   throws CarbonUtilException {
 Dictionary reverseDictionary = null;
-// create column dictionary info object only if dictionary and its
-// metadata file exists for a given column identifier
-if (!isFileExistsForGivenColumn(dictionaryColumnUniqueIdentifier)) {
-  throw new CarbonUtilException(
-  "Either dictionary or its metadata does not exist for column 
identifier :: "
-  + dictionaryColumnUniqueIdentifier.getColumnIdentifier());
+// create column dictionary info object only if it is primitive type.
+if (dictionaryColumnUniqueIdentifier.getDataType().isComplexType()) {
+  return null;
--- End diff --

We will not invoke the getDictionary() method on a complex type column directly.
For a complex type column, we call getDictionary() on its primitive type
sub-columns.
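
A minimal sketch of that calling pattern, assuming a hypothetical column model (`ColumnSketch` and `DictionaryLookupSketch` are illustrative names, not CarbonData classes): a complex column is never looked up itself; the lookup targets are its primitive leaf sub-columns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical column model: a column is complex when it has sub-columns.
class ColumnSketch {
  final String name;
  final List<ColumnSketch> subColumns;

  ColumnSketch(String name, ColumnSketch... subColumns) {
    this.name = name;
    this.subColumns = Arrays.asList(subColumns);
  }

  boolean isComplexType() { return !subColumns.isEmpty(); }
}

class DictionaryLookupSketch {
  // Returns the names of the primitive columns a dictionary would be requested for.
  static List<String> dictionaryTargets(ColumnSketch column) {
    List<String> targets = new ArrayList<>();
    collect(column, targets);
    return targets;
  }

  // Descends through complex columns and records only primitive leaves.
  private static void collect(ColumnSketch column, List<String> out) {
    if (column.isComplexType()) {
      for (ColumnSketch sub : column.subColumns) {
        collect(sub, out);
      }
    } else {
      out.add(column.name);
    }
  }
}
```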




[GitHub] incubator-carbondata issue #411: [WIP]Support data type: date and char

2016-12-14 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/411
  
@ravipesala 
Please correct CI to support -Pbuild-with-format.




[GitHub] incubator-carbondata issue #413: [CARBONDATA-516][SPARK2]fix union issue in ...

2016-12-15 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/413
  
@jackylk
Added a test case.
Local tests pass for Spark 1.5 and Spark 2.




[GitHub] incubator-carbondata pull request #439: [CARBONDATA-536]initialize updateTab...

2016-12-15 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/439

[CARBONDATA-536]initialize updateTableMetadata method in LoadTable for 
Spark2

For Spark 2, GlobalDictionaryUtil.updateTableMetadataFunc should be
initialized.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixBugInLoadTable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #439


commit 1f28ef9864e2d45807bb7c6bb1cbb51f65f423e3
Author: QiangCai <qiang...@qq.com>
Date:   2016-12-15T16:26:09Z

fixLoadTableForSpark2






[GitHub] incubator-carbondata pull request #345: [CARBONDATA-443]Nosort dataloading

2016-12-15 Thread QiangCai
Github user QiangCai closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/345




[GitHub] incubator-carbondata issue #345: [CARBONDATA-443]Nosort dataloading

2016-12-15 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/345
  
Closing this PR. In the future, I will raise another PR to support tables with
mixed data formats.




[GitHub] incubator-carbondata pull request #518: [WIP]unify file header reader

2017-01-10 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/518

[WIP]unify file header reader



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata fileheader

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #518


commit 5440b9c16799d935f9da1728344564a65a2d6ef2
Author: QiangCai <qiang...@qq.com>
Date:   2017-01-10T13:32:51Z

readfileheader






[GitHub] incubator-carbondata pull request #524: [CARBONDATA-627]fix union test case ...

2017-01-11 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/524#discussion_r95714873
  
--- Diff: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/AllDataTypesTestCaseAggregate.scala
 ---
@@ -59,21 +59,4 @@ class AllDataTypesTestCaseAggregate extends QueryTest with BeforeAndAfterAll {
   Seq(Row(15.8)))
   })
 
-  test("CARBONDATA-60-union-defect")({
--- End diff --

Because the previous build (559) added one test case, build 560 shows two 
deleted test cases.




[GitHub] incubator-carbondata pull request #524: [CARBONDATA-627]fix union test case ...

2017-01-11 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/524

 [CARBONDATA-627]fix union test case for spark2

Analysis:
The union test case failed in spark2. The result of the union query is twice 
the result of the left query.

Root Cause:
CarbonLateDecodeRule only uses the union.children.head plan to build every 
CarbonDictionaryTempDecoder.

Changes:
Use each child's own plan to build its CarbonDictionaryTempDecoder.
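
The root cause and the fix can be sketched in plain Java, outside of Catalyst. This is only an illustration under stated assumptions: `Plan`, `Decoder`, `buildFromHeadOnly` and `buildPerChild` are hypothetical stand-ins and are not CarbonData or Spark classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionDecoderSketch {
  /** Hypothetical stand-in for a logical plan node. */
  static final class Plan {
    final String name;
    Plan(String name) { this.name = name; }
  }

  /** Hypothetical stand-in for a decoder that wraps exactly one child plan. */
  static final class Decoder {
    final Plan child;
    Decoder(Plan child) { this.child = child; }
  }

  /** Buggy shape: every decoder wraps the first child, so that child is read once per branch. */
  static List<Decoder> buildFromHeadOnly(List<Plan> children) {
    List<Decoder> decoders = new ArrayList<>();
    for (int i = 0; i < children.size(); i++) {
      decoders.add(new Decoder(children.get(0)));
    }
    return decoders;
  }

  /** Fixed shape: each decoder wraps its own child. */
  static List<Decoder> buildPerChild(List<Plan> children) {
    List<Decoder> decoders = new ArrayList<>();
    for (Plan child : children) {
      decoders.add(new Decoder(child));
    }
    return decoders;
  }

  public static void main(String[] args) {
    List<Plan> children = Arrays.asList(new Plan("left"), new Plan("right"));
    for (Decoder d : buildFromHeadOnly(children)) {
      System.out.println("head-only decoder reads: " + d.child.name);
    }
    for (Decoder d : buildPerChild(children)) {
      System.out.println("per-child decoder reads: " + d.child.name);
    }
  }
}
```

With two children, the head-only variant reads the left side twice, which matches the symptom described above (the union result being twice the left query's result).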

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata fixUnionTestCase

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/524.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #524


commit 0abc4f8f1fe6cfe0e8fe8842f7b7ba40f1e191a7
Author: QiangCai <qiang...@qq.com>
Date:   2017-01-11T15:47:25Z

fixUnionTestCase






[GitHub] incubator-carbondata issue #520: fix dependency issue for IntelliJ IDEA

2017-01-11 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/520
  
Closing this PR. I could not reproduce this issue.




[GitHub] incubator-carbondata pull request #518: [CARBONDATA-622]unify file header re...

2017-01-10 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/518#discussion_r95518312
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
 ---
@@ -301,4 +304,45 @@ object CommonUtil {
       LOGGER.info(s"mapreduce.input.fileinputformat.split.maxsize: ${ newSplitSize.toString }")
     }
   }
+
+  def getCsvHeaderColumns(carbonLoadModel: CarbonLoadModel): Array[String] = {
+    val delimiter = if (StringUtils.isEmpty(carbonLoadModel.getCsvDelimiter)) {
+      CarbonCommonConstants.COMMA
+    } else {
+      CarbonUtil.delimiterConverter(carbonLoadModel.getCsvDelimiter)
+    }
+    var csvFile: String = null
+    var csvHeader: String = carbonLoadModel.getCsvHeader
+    val csvColumns = if (StringUtils.isBlank(csvHeader)) {
+      // read header from csv file
+      csvFile = carbonLoadModel.getFactFilePath.split(",")(0)
+      csvHeader = CarbonUtil.readHeader(csvFile)
+      if (StringUtils.isBlank(csvHeader)) {
+        throw new CarbonDataLoadingException("First line of the csv is not valid.")
+      }
+      csvHeader.toLowerCase().split(delimiter).map(_.replaceAll("\"", "").trim)
+    } else {
+      csvHeader.toLowerCase.split(CarbonCommonConstants.COMMA).map(_.trim)
+    }
+
+    if (!CarbonDataProcessorUtil.isHeaderValid(carbonLoadModel.getTableName, csvColumns,
+        carbonLoadModel.getCarbonDataLoadSchema)) {
+      if (csvFile == null) {
+        LOGGER.error("CSV header provided in DDL is not proper."
+                     + " Column names in schema and CSV header are not the same.")
+        throw new CarbonDataLoadingException(
+          "CSV header provided in DDL is not proper. Column names in schema and CSV header are "
+          + "not the same.")
+      } else {
+        LOGGER.error(
+          "CSV File provided is not proper. Column names in schema and csv header are not same. "
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #518: [CARBONDATA-622]unify file header re...

2017-01-10 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/518#discussion_r95518311
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
 ---
@@ -373,83 +368,15 @@ private static void addAllComplexTypeChildren(CarbonDimension dimension, StringB
     return complexTypesMap;
   }
 
-  /**
-   * Get the csv file to read if it the path is file otherwise get the first file of directory.
-   *
-   * @param csvFilePath
-   * @return File
-   */
-  public static CarbonFile getCsvFileToRead(String csvFilePath) {
-    CarbonFile csvFile =
-        FileFactory.getCarbonFile(csvFilePath, FileFactory.getFileType(csvFilePath));
-
-    CarbonFile[] listFiles = null;
-    if (csvFile.isDirectory()) {
-      listFiles = csvFile.listFiles(new CarbonFileFilter() {
-        @Override public boolean accept(CarbonFile pathname) {
-          if (!pathname.isDirectory()) {
-            if (pathname.getName().endsWith(CarbonCommonConstants.CSV_FILE_EXTENSION) || pathname
-                .getName().endsWith(CarbonCommonConstants.CSV_FILE_EXTENSION
-                    + CarbonCommonConstants.FILE_INPROGRESS_STATUS)) {
-              return true;
-            }
-          }
-          return false;
-        }
-      });
-    } else {
-      listFiles = new CarbonFile[1];
-      listFiles[0] = csvFile;
-    }
-    return listFiles[0];
-  }
-
-  /**
-   * Get the file header from csv file.
-   */
-  public static String getFileHeader(CarbonFile csvFile)
-      throws DataLoadingException {
-    DataInputStream fileReader = null;
-    BufferedReader bufferedReader = null;
-    String readLine = null;
-
-    FileType fileType = FileFactory.getFileType(csvFile.getAbsolutePath());
-
-    if (!csvFile.exists()) {
-      csvFile = FileFactory
-          .getCarbonFile(csvFile.getAbsolutePath() + CarbonCommonConstants.FILE_INPROGRESS_STATUS,
-              fileType);
-    }
-
-    try {
-      fileReader = FileFactory.getDataInputStream(csvFile.getAbsolutePath(), fileType);
-      bufferedReader =
-          new BufferedReader(new InputStreamReader(fileReader, Charset.defaultCharset()));
-      readLine = bufferedReader.readLine();
-    } catch (FileNotFoundException e) {
-      LOGGER.error(e, "CSV Input File not found  " + e.getMessage());
-      throw new DataLoadingException("CSV Input File not found ", e);
-    } catch (IOException e) {
-      LOGGER.error(e, "Not able to read CSV input File  " + e.getMessage());
-      throw new DataLoadingException("Not able to read CSV input File ", e);
-    } finally {
-      CarbonUtil.closeStreams(fileReader, bufferedReader);
-    }
-
-    return readLine;
-  }
-
-  public static boolean isHeaderValid(String tableName, String header,
-      CarbonDataLoadSchema schema, String delimiter) throws DataLoadingException {
-    delimiter = CarbonUtil.delimiterConverter(delimiter);
+  public static boolean isHeaderValid(String tableName, String[] csvHeader,
+      CarbonDataLoadSchema schema) throws DataLoadingException {
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #518: [CARBONDATA-622]unify file header re...

2017-01-10 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/518#discussion_r95518309
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
 ---
@@ -462,6 +389,13 @@ public static boolean isHeaderValid(String tableName, String header,
     return count == columnNames.length;
   }
 
+  public static boolean isHeaderValid(String tableName, String header,
+      CarbonDataLoadSchema schema, String delimiter) throws DataLoadingException {
+    delimiter = CarbonUtil.delimiterConverter(delimiter);
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #530: fix default profile for spark-common...

2017-01-12 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/530

fix default profile for spark-common-test

The spark-1.6 profile should now be active by default.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixDefaultProfile

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/530.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #530


commit 67415ad64a823f5cf303cd22676b4f9cfc2b78f5
Author: QiangCai <qiang...@qq.com>
Date:   2017-01-13T02:59:15Z

fix default profile






[GitHub] incubator-carbondata issue #529: Fixed testcase issues in spark 1.6 and 2.1 ...

2017-01-12 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/529
  
@ravipesala 
OK, there is no conflict; it just needs a rebase.
PR 528 is for the kettle flow in spark2 and moves the test case to spark-common-test.




[GitHub] incubator-carbondata issue #529: [WIP]Fixed testcase issues in spark 1.6 and...

2017-01-12 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/529
  
@ravipesala 
In PR 528, I fixed the InsertInto issue for the kettle flow and moved the test 
case to spark-common-test.




[GitHub] incubator-carbondata pull request #528: [CARBONDATA-617]Fix InsertInto test ...

2017-01-13 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/528#discussion_r95973885
  
--- Diff: 
integration/spark/src/main/scala/org/apache/spark/sql/optimizer/CarbonOptimizer.scala
 ---
@@ -237,9 +237,15 @@ class ResolveCarbonFunctions(relations: Seq[CarbonDecoderRelation])
   val leftCondAttrs = new util.HashSet[AttributeReferenceWrapper]
   val rightCondAttrs = new util.HashSet[AttributeReferenceWrapper]
   union.left.output.foreach(attr =>
--- End diff --

fixed




[GitHub] incubator-carbondata pull request #528: [CARBONDATA-617]Fix InsertInto test ...

2017-01-13 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/528#discussion_r95973907
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonMetastore.scala
 ---
@@ -782,7 +781,49 @@ case class CarbonRelation(
         nullable = true)())
   }
 
-  override val output = dimensionsAttr ++ measureAttr
+  override val output = {
+    val columns = tableMeta.carbonTable.getCreateOrderColumn(tableMeta.carbonTable.getFactTableName)
+      .asScala
+    columns.filter(!_.isInvisible).map { column =>
+      if (column.isDimesion()) {
+        val output: DataType = column.getDataType.toString.toLowerCase match {
+          case "array" =>
+            CarbonMetastoreTypes.toDataType(s"array<${getArrayChildren(column.getColName)}>")
+          case "struct" =>
+            CarbonMetastoreTypes.toDataType(s"struct<${getStructChildren(column.getColName)}>")
+          case dType =>
+            val dataType = addDecimalScaleAndPrecision(column, dType)
+            CarbonMetastoreTypes.toDataType(dataType)
+        }
+        AttributeReference(column.getColName, output,
+          nullable = true
+        )(qualifier = Option(tableName + "." + column.getColName))
--- End diff --

fixed




[GitHub] incubator-carbondata issue #514: [CARBONDATA-614]Fix issue: Dictionary file ...

2017-01-09 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/514
  
retest this please




[GitHub] incubator-carbondata pull request #514: [CARBONDATA-614]Fix issue: Dictionar...

2017-01-09 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/514

[CARBONDATA-614]Fix issue:  Dictionary file name is locked for updation.

1. Set the carbon property CarbonCommonConstants.STORE_LOCATION for 
CarbonBlockDistinctValuesCombineRDD and CarbonGlobalDictionaryGenerateRDD to 
avoid java.lang.RuntimeException: Dictionary file name is locked for updation.

2. Pass CARBON_TIMESTAMP_FORMAT from the driver side to the executor side during 
dictionary generation.

3. Fix code style for carbonTableSchema.scala.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixDictLockedIssue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/514.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #514


commit 5ab7d80e7bd65f31b462d8687ee7aefd326cbc02
Author: QiangCai <qiang...@qq.com>
Date:   2017-01-10T06:34:20Z

fixDictLockIssue






[GitHub] incubator-carbondata pull request #520: fix dependency issue for IntelliJ ID...

2017-01-11 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/520

fix dependency issue for IntelliJ IDEA

When using the spark-2.1 profile, the test cases of spark-common-test cannot be 
run in IntelliJ IDEA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixIdeaMavenIssue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/520.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #520


commit 74d4bf8933540348525a16fdba361e780fe0f494
Author: QiangCai <qiang...@qq.com>
Date:   2017-01-11T08:37:24Z

fix dependency issue for IntelliJ IDEA






[GitHub] incubator-carbondata pull request #518: [CARBONDATA-622]unify file header re...

2017-01-10 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/518#discussion_r95507937
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
 ---
@@ -301,4 +304,45 @@ object CommonUtil {
       LOGGER.info(s"mapreduce.input.fileinputformat.split.maxsize: ${ newSplitSize.toString }")
     }
   }
+
+  def getCsvHeaderColumns(carbonLoadModel: CarbonLoadModel): Array[String] = {
+    val delimiter = if (StringUtils.isEmpty(carbonLoadModel.getCsvDelimiter)) {
--- End diff --

I think the delimiter may be a blank " ".
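
To illustrate why the emptiness check matters here, the following is a minimal sketch. The two helpers are hand-written and only mirror the documented contracts of commons-lang `StringUtils.isEmpty` and `StringUtils.isBlank`; `DEFAULT_DELIMITER`, `resolveWithIsEmpty` and `resolveWithIsBlank` are hypothetical names used for this example only.

```java
public class DelimiterCheckSketch {
  static final String DEFAULT_DELIMITER = ",";

  /** True only for null or zero-length input. */
  static boolean isEmpty(String s) {
    return s == null || s.length() == 0;
  }

  /** True for null, zero-length, or whitespace-only input. */
  static boolean isBlank(String s) {
    if (s == null) {
      return true;
    }
    for (int i = 0; i < s.length(); i++) {
      if (!Character.isWhitespace(s.charAt(i))) {
        return false;
      }
    }
    return true;
  }

  /** Keeps a single-space delimiter, falls back to the default only when nothing is configured. */
  static String resolveWithIsEmpty(String configured) {
    return isEmpty(configured) ? DEFAULT_DELIMITER : configured;
  }

  /** Would silently replace a single-space delimiter with the default. */
  static String resolveWithIsBlank(String configured) {
    return isBlank(configured) ? DEFAULT_DELIMITER : configured;
  }

  public static void main(String[] args) {
    System.out.println("[" + resolveWithIsEmpty(" ") + "]");
    System.out.println("[" + resolveWithIsBlank(" ") + "]");
  }
}
```

If a space is a legitimate CSV delimiter, the blankness check is the wrong one for the delimiter, even though it is a reasonable check for the header string.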




[GitHub] incubator-carbondata pull request #528: Fix InsertInto test case for spark2

2017-01-12 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/528

Fix InsertInto test case  for spark2

Changes:
1. Move the InsertInto test case from the spark module to the spark-common-test module.
2. Add a test case: insert into a carbon table from a carbon table union query.
3. CarbonDecoderOptimizerHelper supports InsertIntoTable for spark2.
4. CreateTable and CarbonRelation use the original ordinal of columns for spark2.
5. Optimize CSVInput for InsertInto to avoid allocating too much memory at 
once.

Impact:
1. data loading
2. query

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixInsertIntoFromUnionQuery

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/528.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #528


commit 96e94fb967936192520f64dd8404e148c7e5fad2
Author: QiangCai <qiang...@qq.com>
Date:   2017-01-12T17:26:30Z

fix InsertInto issue for spark2






[GitHub] incubator-carbondata pull request #390: [CARBONDATA-492]fix a bug of profile...

2016-12-03 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/390

[CARBONDATA-492]fix a bug of profile spark-2.0 for intellij idea

When the spark-2.0 profile is chosen, CarbonExample shows errors in IntelliJ IDEA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixprofileforidea

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/390.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #390


commit d626af30664948b149218859271af88cc41853e2
Author: QiangCai <qiang...@qq.com>
Date:   2016-12-03T18:16:18Z

fix profile issue for idea






[GitHub] incubator-carbondata pull request #377: [CARBONDATA-478]Spark2 module should...

2016-12-01 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/377

[CARBONDATA-478]Spark2 module should have different SparkRowReadSupportImpl 
with spark1



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/377.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #377


commit 3bc55a38c5d645ca1e07381910692ac0b2bb6297
Author: QiangCai <qiang...@qq.com>
Date:   2016-12-01T11:32:04Z

fixLatedecoderIssueForSpark2






[GitHub] incubator-carbondata pull request #384: [CARBONDATA-488][SPARK2]add InsertIn...

2016-12-02 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/384

[CARBONDATA-488][SPARK2]add InsertInto feature for spark2

1. Add the InsertInto feature for spark2.

2. Optimize CarbonExample to use a relative path and to load data with InsertInto.

Link:
[CARBONDATA-488](https://issues.apache.org/jira/browse/CARBONDATA-488)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
insertinto_for_spark2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/384.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #384


commit a1b3b2962a8e12f09cf5efabc15de071e105c885
Author: QiangCai <qiang...@qq.com>
Date:   2016-12-02T17:53:32Z

insertinto for spark2






[GitHub] incubator-carbondata pull request #382: [CARBONDATA-486]fix bug for reading ...

2016-12-02 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/382

[CARBONDATA-486]fix bug for reading dataframe concurrently

Fix an InsertInto bug when reading from a Hive table concurrently.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixbugforinsertinto2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/382.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #382


commit 9dcdf7de6bde64d1c800fd268f2099d2278e8f33
Author: QiangCai <qiang...@qq.com>
Date:   2016-12-02T09:41:23Z

fix bug for reading dataframe concurrently






[GitHub] incubator-carbondata pull request #403: [CARBONDATA-497][SPARK2]fix datatype...

2016-12-06 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/403

[CARBONDATA-497][SPARK2]fix datatype issue of CarbonLateDecoderRule

1. Fix the data type of dictionary dimensions so that the logical plan can be resolved.

2. Improve the translateFilter method to push down more filters to 
CarbonScanRDD.

3. Add a decimal type field to CarbonExample.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixbugforlatedecoder

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/403.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #403


commit 7159713725ac6bef057e27144021cdd06e4adba0
Author: QiangCai <qiang...@qq.com>
Date:   2016-12-06T09:40:21Z

fixlatedecoder






[GitHub] incubator-carbondata pull request #472: [CARBONDATA-568] clean up code for c...

2017-01-03 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/472#discussion_r94367280
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/carbon/datastore/chunk/reader/dimension/v2/CompressedDimensionChunkFileBasedReaderV2.java
 ---
@@ -135,6 +135,7 @@ public CompressedDimensionChunkFileBasedReaderV2(final BlockletInfo blockletInfo
   dimensionChunk = fileReader.readByteArray(filePath, dimensionChunksOffset.get(blockIndex),
   dimensionChunksLength.get(blockIndex));
   dimensionColumnChunk = CarbonUtil.readDataChunk(dimensionChunk);
+  assert dimensionColumnChunk != null;
--- End diff --

I prefer to throw an exception.
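
One reason to prefer an exception: Java `assert` statements are skipped unless the JVM is started with `-ea`, so the check can vanish in production. A minimal sketch of the explicit alternative follows; `DataChunk`, `readDataChunk` and `readOrThrow` are hypothetical stand-ins and not the CarbonData API, and the choice of `IOException` is only an assumption for the example.

```java
import java.io.IOException;

public class NullChunkCheckSketch {
  /** Hypothetical stand-in for the parsed chunk metadata. */
  static final class DataChunk { }

  /** Hypothetical parser: returns null when there is nothing to decode. */
  static DataChunk readDataChunk(byte[] bytes) {
    return (bytes == null || bytes.length == 0) ? null : new DataChunk();
  }

  /** Explicit check: behaves the same whether or not assertions are enabled. */
  static DataChunk readOrThrow(byte[] bytes) throws IOException {
    DataChunk chunk = readDataChunk(bytes);
    if (chunk == null) {
      throw new IOException("Could not read dimension data chunk");
    }
    return chunk;
  }

  public static void main(String[] args) {
    try {
      readOrThrow(new byte[0]);
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```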




[GitHub] incubator-carbondata pull request #499: [CARBONDATA-218]fix data loading iss...

2017-01-04 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/499

[CARBONDATA-218]fix data loading issue for UT



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata 
fixDataLoadingIssue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/499.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #499


commit 0cfefbb450b596da23f87a9cab65016c94f96a0a
Author: QiangCai <qiang...@qq.com>
Date:   2017-01-05T03:03:25Z

fixDataLoadingIssue






[GitHub] incubator-carbondata issue #450: [CARBONDATA-545]Added support for offheap s...

2017-01-04 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/450
  
@kumarvishal09 
please rebase and fix some known issues.




[GitHub] incubator-carbondata pull request #481: [CARBONDATA-601]reuse test case for ...

2017-01-07 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/481#discussion_r95060180
  
--- Diff: integration/spark-common-test/pom.xml ---
@@ -0,0 +1,232 @@
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+
+  4.0.0
+
+  
+org.apache.carbondata
+carbondata-parent
+1.0.0-incubating-SNAPSHOT
+../../pom.xml
+  
+
+  carbondata-spark-common-test
+  Apache CarbonData :: Spark Common Test
+
+  
+${basedir}/../../dev
+  
+
+  
+
+  org.apache.carbondata
+  carbondata-spark-common
+  ${project.version}
+  test
+  
+
+  org.apache.spark
+  spark-hive-thriftserver_2.10
+
+  
+
+
+  org.apache.spark
+  
spark-hive-thriftserver_${scala.binary.version}
+  test
+
+
+  junit
+  junit
+
+
+  org.scalatest
+  scalatest_${scala.binary.version}
+  2.2.1
+  test
+
+  
+
+  
+src/test/scala
+
+  
+src/resources
+  
+  
+.
+
+  CARBON_SPARK_INTERFACELogResource.properties
+
+  
+
+
+  
+org.scala-tools
+maven-scala-plugin
+2.15.2
+
+  
+compile
+
+  compile
+
+compile
+  
+  
+testCompile
+
+  testCompile
+
+test
+  
+  
+process-resources
+
+  compile
+
+  
+
+  
+  
+maven-compiler-plugin
+
+  1.7
+  1.7
+
+  
+  
+org.apache.maven.plugins
+maven-surefire-plugin
+2.18
+
+
+  
+**/Test*.java
+**/*Test.java
+**/*TestCase.java
+**/*Suite.java
+  
+  
${project.build.directory}/surefire-reports
+  -Xmx3g -XX:MaxPermSize=512m 
-XX:ReservedCodeCacheSize=512m
+  
+true
+  
+  false
+
+  
+  
+org.scalatest
+scalatest-maven-plugin
+1.0
+
+
+  
${project.build.directory}/surefire-reports
+  .
+  CarbonTestSuite.txt
+  -ea -Xmx3g -XX:MaxPermSize=512m 
-XX:ReservedCodeCacheSize=512m 
+  
+  
+  
+  
+  
+true
+${use.kettle}
+  
+
+
+  
+test
+
+  test
+
+  
+
+  
+
+  
+  
+
+  spark-1.5
+  
+true
+  
+  
+
+  org.apache.carbondata
+  carbondata-spark
+  ${project.version}
+  test
+  
+
+  org.apache.spark
+  spark-hive-thriftserver_2.10
--- End diff --

This exclusion will fix the dependency issue in IntelliJ IDEA.




[GitHub] incubator-carbondata pull request #481: [CARBONDATA-601]reuse test case for ...

2017-01-07 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/481#discussion_r95060705
  
--- Diff: 
integration/spark/src/main/scala/org/apache/spark/sql/test/TestQueryExecutorImplV1.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.test
+
+import org.apache.spark.{SparkConf, SparkContext}
+import org.apache.spark.sql.{CarbonContext, DataFrame, SQLContext}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class TestQueryExecutorImplV1 extends TestQueryExecutorRegister {
--- End diff --

fixed




[GitHub] incubator-carbondata issue #508: [CARBONDATA-611] Make the default maven com...

2017-01-09 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/508
  
@ravipesala 
In my opinion, if a profile is active by default and contains only property 
elements, we can remove that profile and move its property elements into the 
properties element.
For assembly/pom.xml, I would prefer to add the three provided property 
elements to the properties element rather than to each profile.




[GitHub] incubator-carbondata issue #508: [CARBONDATA-611] Make the default maven com...

2017-01-09 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/508
  
LGTM




[GitHub] incubator-carbondata pull request #481: [WIP]reuse test case for integration...

2016-12-29 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/481

[WIP]reuse test case for integration module



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata refactoryTestCase

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/481.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #481


commit ea6ac0d143a7af33124bc647c7e6a99dafe012c1
Author: QiangCai <qiang...@qq.com>
Date:   2016-12-29T14:43:29Z

reuse test case for integration module






[GitHub] incubator-carbondata pull request #494: [CARBONDATA-218]Using CSVInputFormat...

2017-01-03 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/494

[CARBONDATA-218]Using CSVInputFormat instead of spark-csv during dictionary 
generation

1. Using CSVInputFormat instead of spark-csv during dictionary generation

2. Remove spark-csv dependency from whole project.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata unifyCsvReader

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/494.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #494


commit 34f55769d74969cff5d9a322ad8ae0cbf14befdc
Author: QiangCai <qiang...@qq.com>
Date:   2017-01-03T08:28:06Z

unify csv reader






[GitHub] incubator-carbondata pull request #494: [CARBONDATA-218]Using CSVInputFormat...

2017-01-04 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/494#discussion_r94545504
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala
 ---
@@ -356,37 +363,49 @@ object GlobalDictionaryUtil {
*/
   def loadDataFrame(sqlContext: SQLContext,
   carbonLoadModel: CarbonLoadModel): DataFrame = {
-val df = sqlContext.read
-  .format("com.databricks.spark.csv.newapi")
-  .option("header", {
-if (StringUtils.isEmpty(carbonLoadModel.getCsvHeader)) {
-  "true"
-} else {
-  "false"
-}
-  })
-  .option("delimiter", {
-if (StringUtils.isEmpty(carbonLoadModel.getCsvDelimiter)) {
-  "" + DEFAULT_SEPARATOR
-} else {
-  carbonLoadModel.getCsvDelimiter
+  val hadoopConfiguration = new Configuration()
+  CommonUtil.configureCSVInputFormat(hadoopConfiguration, 
carbonLoadModel)
+  hadoopConfiguration.set(FileInputFormat.INPUT_DIR, 
carbonLoadModel.getFactFilePath)
--- End diff --

The FileInputFormat.addInputPath method needs a Job-type parameter.
In addition, FactFilePath already contains all the file paths, so we can 
set the input path directly; there is no need to split the paths and add them 
again one by one.
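
The point above can be illustrated with a minimal sketch. It does not use Hadoop itself: a plain Map stands in for Hadoop's Configuration, and the class name, helper names, and file paths are hypothetical. The key string is, to my understanding, the value of the FileInputFormat.INPUT_DIR constant, and the sketch assumes FactFilePath is a comma-separated list of paths.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InputDirSketch {

  // Assumed value of Hadoop's FileInputFormat.INPUT_DIR constant.
  static final String INPUT_DIR = "mapreduce.input.fileinputformat.inputdir";

  // Stand-in for Configuration.set(INPUT_DIR, factFilePath): the whole
  // path list is stored in one call, without creating a Job first.
  static Map<String, String> configure(String factFilePath) {
    Map<String, String> conf = new HashMap<>();
    conf.put(INPUT_DIR, factFilePath);
    return conf;
  }

  // Simplified reader side: split the stored value on commas.
  // (Escaped commas inside a path are ignored in this sketch.)
  static List<String> inputPaths(Map<String, String> conf) {
    return Arrays.asList(conf.get(INPUT_DIR).split(","));
  }

  public static void main(String[] args) {
    // Hypothetical fact file paths, already joined by the caller.
    Map<String, String> conf = configure("/data/a.csv,/data/b.csv");
    System.out.println(inputPaths(conf));
  }
}
```

Since the joined value is stored as is, splitting FactFilePath and adding each path again would only rebuild the same string.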




[GitHub] incubator-carbondata issue #481: [CARBONDATA-601]reuse test case for integra...

2017-01-07 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/481
  
retest this please




[GitHub] incubator-carbondata issue #481: [CARBONDATA-601]reuse test case for integra...

2017-01-06 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/481
  
retest this please




[GitHub] incubator-carbondata pull request #452: [CARBONDATA-546] Extract data manage...

2016-12-22 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/452#discussion_r93733275
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/util/TableAPIUtil.scala ---
@@ -51,4 +56,15 @@ object TableAPIUtil {
 .config(CarbonCommonConstants.STORE_LOCATION, storePath)
--- End diff --

Can you remove this one?




[GitHub] incubator-carbondata pull request #452: [CARBONDATA-546] Extract data manage...

2016-12-22 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/452#discussion_r93733676
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/util/TableAPIUtil.scala ---
@@ -51,4 +56,15 @@ object TableAPIUtil {
 .config(CarbonCommonConstants.STORE_LOCATION, storePath)
--- End diff --

BTW, CarbonEnv now gets carbon.storelocation from the CarbonProperties 
object.
So there is no need to add storePath to the configuration; instead, the 
property should be added to the CarbonProperties object:
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.STORE_LOCATION, storePath)




[GitHub] incubator-carbondata issue #449: [CARBONDATA-540]Support insertInto without ...

2016-12-27 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/449
  
@chenliang613 @jackylk 
Rebased and addressed the review comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107841211
  
--- Diff: dev/java-code-format-template.xml ---
@@ -34,8 +34,8 @@
   
 
   
-  
   
+  
--- End diff --

Yes, the javax package should come after the java package.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107890848
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonOutputFormat.java
 ---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+
+import java.io.IOException;
+import java.util.Properties;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.exec.FileSinkOperator;
+import org.apache.hadoop.hive.ql.io.HiveOutputFormat;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapred.FileOutputFormat;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordWriter;
+import org.apache.hadoop.util.Progressable;
+
+
+public class MapredCarbonOutputFormat extends FileOutputFormat<Void, T>
--- End diff --

Is this the same as CarbonTableOutputFormat?

So we only support reading CarbonData tables in Hive.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107863484
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/CarbonArrayInspector.java
 ---
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableListObjectInspector;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.io.Writable;
+
+/**
+ * The CarbonHiveArrayInspector will inspect an ArrayWritable, considering 
it as an Hive array.
+ * It can also inspect a List if Hive decides to inspect the result of an 
inspection.
+ */
+public class CarbonArrayInspector implements SettableListObjectInspector {
+
+  ObjectInspector arrayElementInspector;
+
+  public CarbonArrayInspector(final ObjectInspector arrayElementInspector) 
{
+this.arrayElementInspector = arrayElementInspector;
+  }
+
+  @Override
+  public String getTypeName() {
+return "array<" + arrayElementInspector.getTypeName() + ">";
+  }
+
+  @Override
+  public Category getCategory() {
+return Category.LIST;
+  }
+
+  @Override
+  public ObjectInspector getListElementObjectInspector() {
+return arrayElementInspector;
+  }
+
+  @Override
+  public Object getListElement(final Object data, final int index) {
+if (data == null) {
+  return null;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return null;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return null;
+  }
+
+  if (index >= 0 && index < ((ArrayWritable) subObj).get().length) {
+return ((ArrayWritable) subObj).get()[index];
+  } else {
+return null;
+  }
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
+  }
+
+  @Override
+  public int getListLength(final Object data) {
+if (data == null) {
+  return -1;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return -1;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return 0;
+  }
+
+  return ((ArrayWritable) subObj).get().length;
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
+  }
+
+  @Override
+  public List getList(final Object data) {
+if (data == null) {
+  return null;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return null;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return null;
+  }
+
+  final Writable[] array = ((ArrayWritable) subObj).get();
+  final List list = new ArrayList();
+
+  for (final Writable obj : array) {
+list.add(obj);
+  }
+
+  return list;
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
  

[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107863410
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/CarbonArrayInspector.java
 ---
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableListObjectInspector;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.io.Writable;
+
+/**
+ * The CarbonHiveArrayInspector will inspect an ArrayWritable, considering 
it as an Hive array.
+ * It can also inspect a List if Hive decides to inspect the result of an 
inspection.
+ */
+public class CarbonArrayInspector implements SettableListObjectInspector {
+
+  ObjectInspector arrayElementInspector;
+
+  public CarbonArrayInspector(final ObjectInspector arrayElementInspector) 
{
+this.arrayElementInspector = arrayElementInspector;
+  }
+
+  @Override
+  public String getTypeName() {
+return "array<" + arrayElementInspector.getTypeName() + ">";
+  }
+
+  @Override
+  public Category getCategory() {
+return Category.LIST;
+  }
+
+  @Override
+  public ObjectInspector getListElementObjectInspector() {
+return arrayElementInspector;
+  }
+
+  @Override
+  public Object getListElement(final Object data, final int index) {
+if (data == null) {
+  return null;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return null;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return null;
+  }
+
+  if (index >= 0 && index < ((ArrayWritable) subObj).get().length) {
+return ((ArrayWritable) subObj).get()[index];
+  } else {
+return null;
+  }
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
+  }
+
+  @Override
+  public int getListLength(final Object data) {
+if (data == null) {
+  return -1;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return -1;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return 0;
+  }
+
+  return ((ArrayWritable) subObj).get().length;
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
+  }
+
+  @Override
+  public List getList(final Object data) {
+if (data == null) {
+  return null;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return null;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return null;
+  }
+
+  final Writable[] array = ((ArrayWritable) subObj).get();
+  final List list = new ArrayList();
+
+  for (final Writable obj : array) {
+list.add(obj);
+  }
+
+  return list;
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
  

[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107881262
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/CarbonHiveRecordReader.java
 ---
@@ -0,0 +1,249 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+
+import java.io.IOException;
+import java.sql.Date;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+
+import org.apache.carbondata.core.datastore.block.TableBlockInfo;
+import 
org.apache.carbondata.core.scan.executor.exception.QueryExecutionException;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator;
+import org.apache.carbondata.hadoop.CarbonRecordReader;
+import org.apache.carbondata.hadoop.readsupport.CarbonReadSupport;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.common.type.HiveDecimal;
+import org.apache.hadoop.hive.serde.serdeConstants;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.DoubleWritable;
+import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
+import org.apache.hadoop.hive.serde2.io.ShortWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.hadoop.hive.serde2.objectinspector.*;
+import org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobConf;
+
+public class CarbonHiveRecordReader extends 
CarbonRecordReader
+implements org.apache.hadoop.mapred.RecordReader<Void, ArrayWritable> {
+
+  ArrayWritable valueObj = null;
+  private CarbonObjectInspector objInspector;
+
+  public CarbonHiveRecordReader(QueryModel queryModel, 
CarbonReadSupport readSupport,
+InputSplit inputSplit, JobConf jobConf) 
throws IOException {
+super(queryModel, readSupport);
+initialize(inputSplit, jobConf);
+  }
+
+  public void initialize(InputSplit inputSplit, Configuration conf) throws 
IOException {
+// The input split can contain single HDFS block or multiple blocks, 
so firstly get all the
+// blocks and then set them in the query model.
+List splitList;
+if (inputSplit instanceof CarbonHiveInputSplit) {
+  splitList = new ArrayList<>(1);
+  splitList.add((CarbonHiveInputSplit) inputSplit);
+} else {
+  throw new RuntimeException("unsupported input split type: " + 
inputSplit);
+}
+List tableBlockInfoList = 
CarbonHiveInputSplit.createBlocks(splitList);
+queryModel.setTableBlockInfos(tableBlockInfoList);
+readSupport.initialize(queryModel.getProjectionColumns(),
+queryModel.getAbsoluteTableIdentifier());
+try {
+  carbonIterator = new 
ChunkRowIterator(queryExecutor.execute(queryModel));
+} catch (QueryExecutionException e) {
+  throw new IOException(e.getMessage(), e.getCause());
+}
+if (valueObj == null) {
+  valueObj = new ArrayWritable(Writable.class,
+  new Writable[queryModel.getProjectionColumns().length]);
+}
+
+final TypeInfo rowTypeInfo;
+final List columnNames;
+List columnTypes;
+// Get column names and sort orde

[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107862824
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/CarbonArrayInspector.java
 ---
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableListObjectInspector;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.io.Writable;
+
+/**
+ * The CarbonHiveArrayInspector will inspect an ArrayWritable, considering 
it as an Hive array.
+ * It can also inspect a List if Hive decides to inspect the result of an 
inspection.
+ */
+public class CarbonArrayInspector implements SettableListObjectInspector {
+
+  ObjectInspector arrayElementInspector;
+
+  public CarbonArrayInspector(final ObjectInspector arrayElementInspector) 
{
+this.arrayElementInspector = arrayElementInspector;
+  }
+
+  @Override
+  public String getTypeName() {
+return "array<" + arrayElementInspector.getTypeName() + ">";
+  }
+
+  @Override
+  public Category getCategory() {
+return Category.LIST;
+  }
+
+  @Override
+  public ObjectInspector getListElementObjectInspector() {
+return arrayElementInspector;
+  }
+
+  @Override
+  public Object getListElement(final Object data, final int index) {
+if (data == null) {
+  return null;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return null;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return null;
+  }
+
+  if (index >= 0 && index < ((ArrayWritable) subObj).get().length) {
+return ((ArrayWritable) subObj).get()[index];
+  } else {
+return null;
+  }
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
+  }
+
+  @Override
+  public int getListLength(final Object data) {
+if (data == null) {
+  return -1;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return -1;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return 0;
+  }
+
+  return ((ArrayWritable) subObj).get().length;
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
+  }
+
+  @Override
+  public List getList(final Object data) {
+if (data == null) {
+  return null;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return null;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return null;
+  }
+
+  final Writable[] array = ((ArrayWritable) subObj).get();
+  final List list = new ArrayList();
+
+  for (final Writable obj : array) {
--- End diff --

Better to use Arrays.asList(array)
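
A minimal sketch of the suggestion, using String elements as a stand-in for Hadoop's Writable so that it runs with the JDK alone; the class and method names are hypothetical. It compares the manual loop from the quoted diff with Arrays.asList, and shows the one caveat: Arrays.asList returns a fixed-size view backed by the array, so a caller that needs to add or remove elements should wrap it in a new ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AsListSketch {

  // Manual copy, as in the quoted diff: one add() call per element.
  static List<String> copyWithLoop(String[] array) {
    List<String> list = new ArrayList<>();
    for (String item : array) {
      list.add(item);
    }
    return list;
  }

  // Fixed-size view backed by the array; nothing is copied.
  static List<String> viewWithAsList(String[] array) {
    return Arrays.asList(array);
  }

  // Independent, resizable copy for callers that add or remove elements.
  static List<String> copyWithAsList(String[] array) {
    return new ArrayList<>(Arrays.asList(array));
  }

  public static void main(String[] args) {
    String[] array = {"a", "b", "c"};
    List<String> copy = copyWithAsList(array);
    List<String> view = viewWithAsList(array);

    // Same elements in the same order as the manual loop.
    System.out.println(copyWithLoop(array).equals(view)); // true

    // Writes to the view go through to the array; the copy is unaffected.
    view.set(0, "z");
    System.out.println(array[0]);    // z
    System.out.println(copy.get(0)); // a
  }
}
```

Which variant fits getList depends on whether Hive mutates the returned list, which the quoted diff does not show.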



[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107873741
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
 ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+import org.apache.carbondata.core.scan.model.CarbonQueryPlan;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import org.apache.carbondata.hadoop.CarbonInputFormat;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.readsupport.CarbonReadSupport;
+import org.apache.carbondata.hadoop.util.CarbonInputFormatUtil;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.mapred.InputFormat;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordReader;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.hadoop.mapreduce.Job;
+
+
+public class MapredCarbonInputFormat extends 
CarbonInputFormat
--- End diff --

CarbonInputFormat is the MRv2 implementation, while MapredCarbonInputFormat is 
an MRv1 implementation.
So I think MapredCarbonInputFormat shouldn't extend CarbonInputFormat.
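
One way to follow this suggestion is composition: the MRv1 class implements only the old-API interface and delegates to an MRv2 format held as a field. The sketch below uses hypothetical stand-in interfaces (`NewApiFormat`, `OldApiFormat`) rather than the real Hadoop types so that it is self-contained; it is not the change made in this PR.

```java
import java.util.List;

// Hypothetical stand-in for an MRv2-style format (returns a List of splits).
interface NewApiFormat {
  List<String> getSplits(String jobContext);
}

// Hypothetical stand-in for an MRv1-style format (returns an array of splits).
interface OldApiFormat {
  String[] getSplits(String jobConf, int numSplits);
}

// MRv1 adapter: implements only the old API and delegates to the new one,
// instead of inheriting from the MRv2 class.
final class OldApiAdapter implements OldApiFormat {
  private final NewApiFormat delegate;

  OldApiAdapter(NewApiFormat delegate) {
    this.delegate = delegate;
  }

  @Override
  public String[] getSplits(String jobConf, int numSplits) {
    // numSplits is only a hint in this sketch; the delegate decides the count.
    List<String> splits = delegate.getSplits(jobConf);
    return splits.toArray(new String[0]);
  }
}
```

With this shape the two API generations stay separate types, and the adapter is the only place that converts between them.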





[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107893281
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
 ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+import org.apache.carbondata.core.scan.model.CarbonQueryPlan;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import org.apache.carbondata.hadoop.CarbonInputFormat;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.readsupport.CarbonReadSupport;
+import org.apache.carbondata.hadoop.util.CarbonInputFormatUtil;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.mapred.InputFormat;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordReader;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.hadoop.mapreduce.Job;
+
+
+public class MapredCarbonInputFormat extends 
CarbonInputFormat
+implements InputFormat<Void, ArrayWritable>, 
CombineHiveInputFormat.AvoidSplitCombination {
+
+  @Override
+  public InputSplit[] getSplits(JobConf jobConf, int numSplits) throws 
IOException {
+org.apache.hadoop.mapreduce.JobContext jobContext = 
Job.getInstance(jobConf);
+List splitList = 
super.getSplits(jobContext);
--- End diff --

For Hive, we need to remove the InputSplits of invalid segments.
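
A rough sketch of such a filter, assuming the set of invalid segment ids is already known. How CarbonData determines that set is not shown in this thread, and `SegmentSplit` and `InvalidSegmentFilter` are hypothetical names, not CarbonData classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified split: only the segment id and path are modeled.
final class SegmentSplit {
  final String segmentId;
  final String path;

  SegmentSplit(String segmentId, String path) {
    this.segmentId = segmentId;
    this.path = path;
  }
}

final class InvalidSegmentFilter {

  // Returns the splits whose segment id is not in invalidSegmentIds,
  // preserving the original order.
  static List<SegmentSplit> removeInvalid(List<SegmentSplit> splits,
      Set<String> invalidSegmentIds) {
    List<SegmentSplit> valid = new ArrayList<>();
    for (SegmentSplit split : splits) {
      if (!invalidSegmentIds.contains(split.segmentId)) {
        valid.add(split);
      }
    }
    return valid;
  }
}
```

In getSplits this filtering would run before the splits are converted to the Hive-specific split type.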




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107875522
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/CarbonHiveRecordReader.java
 ---
@@ -0,0 +1,249 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+
+import java.io.IOException;
+import java.sql.Date;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+
+import org.apache.carbondata.core.datastore.block.TableBlockInfo;
+import 
org.apache.carbondata.core.scan.executor.exception.QueryExecutionException;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator;
+import org.apache.carbondata.hadoop.CarbonRecordReader;
+import org.apache.carbondata.hadoop.readsupport.CarbonReadSupport;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.common.type.HiveDecimal;
+import org.apache.hadoop.hive.serde.serdeConstants;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import org.apache.hadoop.hive.serde2.io.DateWritable;
+import org.apache.hadoop.hive.serde2.io.DoubleWritable;
+import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
+import org.apache.hadoop.hive.serde2.io.ShortWritable;
+import org.apache.hadoop.hive.serde2.io.TimestampWritable;
+import org.apache.hadoop.hive.serde2.objectinspector.*;
+import org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobConf;
+
+public class CarbonHiveRecordReader extends 
CarbonRecordReader
--- End diff --

CarbonRecordReader is for MRv2, CarbonHiveRecordReader is for MRv1.
So CarbonHiveRecordReader shouldn't extend CarbonRecordReader.




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107863523
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/CarbonArrayInspector.java
 ---
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableListObjectInspector;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.io.Writable;
+
+/**
+ * The CarbonHiveArrayInspector will inspect an ArrayWritable, considering 
it as an Hive array.
+ * It can also inspect a List if Hive decides to inspect the result of an 
inspection.
+ */
+public class CarbonArrayInspector implements SettableListObjectInspector {
+
+  ObjectInspector arrayElementInspector;
+
+  public CarbonArrayInspector(final ObjectInspector arrayElementInspector) 
{
+this.arrayElementInspector = arrayElementInspector;
+  }
+
+  @Override
+  public String getTypeName() {
+return "array<" + arrayElementInspector.getTypeName() + ">";
+  }
+
+  @Override
+  public Category getCategory() {
+return Category.LIST;
+  }
+
+  @Override
+  public ObjectInspector getListElementObjectInspector() {
+return arrayElementInspector;
+  }
+
+  @Override
+  public Object getListElement(final Object data, final int index) {
+if (data == null) {
+  return null;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return null;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return null;
+  }
+
+  if (index >= 0 && index < ((ArrayWritable) subObj).get().length) {
+return ((ArrayWritable) subObj).get()[index];
+  } else {
+return null;
+  }
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
+  }
+
+  @Override
+  public int getListLength(final Object data) {
+if (data == null) {
+  return -1;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return -1;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return 0;
+  }
+
+  return ((ArrayWritable) subObj).get().length;
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
+  }
+
+  @Override
+  public List getList(final Object data) {
+if (data == null) {
+  return null;
+}
+
+if (data instanceof ArrayWritable) {
+  final Writable[] listContainer = ((ArrayWritable) data).get();
+
+  if (listContainer == null || listContainer.length == 0) {
+return null;
+  }
+
+  final Writable subObj = listContainer[0];
+
+  if (subObj == null) {
+return null;
+  }
+
+  final Writable[] array = ((ArrayWritable) subObj).get();
+  final List list = new ArrayList();
+
+  for (final Writable obj : array) {
+list.add(obj);
+  }
+
+  return list;
+}
+
+throw new UnsupportedOperationException("Cannot inspect "
+  + data.getClass().getCanonicalName());
  

[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107894484
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java
 ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+import org.apache.carbondata.core.scan.model.CarbonQueryPlan;
+import org.apache.carbondata.core.scan.model.QueryModel;
+import org.apache.carbondata.hadoop.CarbonInputFormat;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.hadoop.readsupport.CarbonReadSupport;
+import org.apache.carbondata.hadoop.util.CarbonInputFormatUtil;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.mapred.InputFormat;
+import org.apache.hadoop.mapred.InputSplit;
+import org.apache.hadoop.mapred.JobConf;
+import org.apache.hadoop.mapred.RecordReader;
+import org.apache.hadoop.mapred.Reporter;
+import org.apache.hadoop.mapreduce.Job;
+
+
+public class MapredCarbonInputFormat extends 
CarbonInputFormat
+implements InputFormat<Void, ArrayWritable>, 
CombineHiveInputFormat.AvoidSplitCombination {
+
+  @Override
+  public InputSplit[] getSplits(JobConf jobConf, int numSplits) throws 
IOException {
+org.apache.hadoop.mapreduce.JobContext jobContext = 
Job.getInstance(jobConf);
+List splitList = 
super.getSplits(jobContext);
+InputSplit[] splits = new InputSplit[splitList.size()];
+CarbonInputSplit split = null;
+for (int i = 0; i < splitList.size(); i++) {
+  split = (CarbonInputSplit) splitList.get(i);
+  splits[i] = new CarbonHiveInputSplit(split.getSegmentId(), 
split.getPath(),
+  split.getStart(), split.getLength(), split.getLocations(),
+  split.getNumberOfBlocklets(), split.getVersion(), 
split.getBlockStorageIdMap());
+}
+return splits;
+  }
+
+  @Override
+  public RecordReader<Void, ArrayWritable> getRecordReader(InputSplit 
inputSplit, JobConf jobConf,
+   Reporter 
reporter) throws IOException {
+QueryModel queryModel = getQueryModel(jobConf);
+CarbonReadSupport readSupport = 
getReadSupportClass(jobConf);
--- End diff --

We need to decode all dictionary columns and direct-dictionary columns.
Better to use SparkRowReadSupportImpl from the spark1 module.
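
To illustrate what decoding means here: a dictionary-encoded column stores integer surrogate keys, and the reader has to map them back to the actual values before Hive sees the row. The sketch below is only an illustration under that assumption. It uses a plain `Map` as the dictionary and hypothetical names, and it does not model direct-dictionary columns, which would need their own decoder.

```java
import java.util.Map;

final class DictionaryDecodeSketch {

  // row holds Integer surrogate keys for dictionary columns and plain values
  // for all other columns. dictionariesByColumn maps a column index to that
  // column's dictionary (surrogate key -> actual value).
  static Object[] decodeRow(Object[] row,
      Map<Integer, Map<Integer, String>> dictionariesByColumn) {
    Object[] decoded = new Object[row.length];
    for (int col = 0; col < row.length; col++) {
      Map<Integer, String> dictionary = dictionariesByColumn.get(col);
      if (dictionary != null && row[col] != null) {
        // Dictionary column: replace the surrogate key with its value.
        decoded[col] = dictionary.get((Integer) row[col]);
      } else {
        // Non-dictionary column or null: pass through unchanged.
        decoded[col] = row[col];
      }
    }
    return decoded;
  }
}
```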




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107862312
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/CarbonArrayInspector.java
 ---
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.SettableListObjectInspector;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.io.Writable;
+
+/**
+ * The CarbonHiveArrayInspector will inspect an ArrayWritable, considering 
it as an Hive array.
+ * It can also inspect a List if Hive decides to inspect the result of an 
inspection.
+ */
+public class CarbonArrayInspector implements SettableListObjectInspector {
+
+  ObjectInspector arrayElementInspector;
--- End diff --

Please make this field private.




[GitHub] incubator-carbondata pull request #672: [CARBONDATA-815] add hive integratio...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/672#discussion_r107886267
  
--- Diff: 
integration/hive/src/main/java/org/apache/carbondata/hive/CarbonHiveSerDe.java 
---
@@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.hive;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Properties;
+import javax.annotation.Nullable;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hive.serde.serdeConstants;
+import org.apache.hadoop.hive.serde2.AbstractSerDe;
+import org.apache.hadoop.hive.serde2.SerDeException;
+import org.apache.hadoop.hive.serde2.SerDeSpec;
+import org.apache.hadoop.hive.serde2.SerDeStats;
+import org.apache.hadoop.hive.serde2.io.DoubleWritable;
+import org.apache.hadoop.hive.serde2.io.ShortWritable;
+import org.apache.hadoop.hive.serde2.objectinspector.ListObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.StructField;
+import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.DateObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.DoubleObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.HiveDecimalObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.IntObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.LongObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.ShortObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.StringObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.TimestampObjectInspector;
+import org.apache.hadoop.hive.serde2.typeinfo.StructTypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;
+import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;
+import org.apache.hadoop.io.ArrayWritable;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.Writable;
+
+
+/**
+ * A serde class for Carbondata.
+ * It transparently passes the object to/from the Carbon file 
reader/writer.
+ */
+@SerDeSpec(schemaProps = {serdeConstants.LIST_COLUMNS, 
serdeConstants.LIST_COLUMN_TYPES})
+public class CarbonHiveSerDe extends AbstractSerDe {
+  private SerDeStats stats;
+  private ObjectInspector objInspector;
+
+  private enum LAST_OPERATION {
+SERIALIZE,
+DESERIALIZE,
+UNKNOWN
+  }
+
+  private LAST_OPERATION status;
+  private long serializedSize;
+  private long deserializedSize;
+
+  public CarbonHiveSerDe() {
+stats = new SerDeStats();
+  }
+
+  @Override
+  public void initialize(@Nullable Configuration configuration, Properties 
tbl)
+  throws SerDeException {
+
+final TypeInfo rowTypeInfo;
+final List columnNames;
+final List columnTypes;
+// Get column names and sort order
+final String columnNameProperty = 
tbl.getProperty(serdeConstants.LIST_COLUMNS);
+final String columnTypeProperty = 
tbl.getProperty(serdeConstants.LIST_COLUMN_TYPES);
+
+if (columnNameProperty.length() == 0) {
+  columnNames = new ArrayList();
+} else {
+  columnNames = Arrays.asList(columnNameProperty.split(","));
+}
+if (columnTypeProperty.length() == 0) {
+  columnTypes = new ArrayList();
+} else {
+  columnTypes = 
TypeInfoUtils.getTypeInfosFro

[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

2017-03-25 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/696
  
LGTM




[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Store one SegmentPr...

2017-03-25 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/659#discussion_r108033699
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentTaskIndex.java
 ---
@@ -16,30 +16,52 @@
  */
 package org.apache.carbondata.core.datastore.block;
 
+import java.util.HashMap;
 import java.util.List;
+import java.util.Map;
 
 import org.apache.carbondata.core.datastore.BTreeBuilderInfo;
 import org.apache.carbondata.core.datastore.BtreeBuilder;
 import org.apache.carbondata.core.datastore.impl.btree.BlockBTreeBuilder;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.blocklet.DataFileFooter;
 
 /**
  * Class which is responsible for loading the b+ tree block. This class 
will
  * persist all the detail of a table segment
  */
 public class SegmentTaskIndex extends AbstractIndex {
+  private static Map<SegmentKey, SegmentProperties> 
segmentPropertiesCached =
--- End diff --

Why not use TableSegmentUniqueIdentifier?




[GitHub] incubator-carbondata pull request #659: [CARBONDATA-781] Store one SegmentPr...

2017-03-27 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/659#discussion_r108146058
  
--- Diff: 
core/src/test/java/org/apache/carbondata/core/datastore/block/SegmentTaskIndexTest.java
 ---
@@ -58,7 +58,9 @@
   @Mock public void build(BTreeBuilderInfo segmentBuilderInfos) {}
 };
 long numberOfRows = 100;
-SegmentTaskIndex segmentTaskIndex = new SegmentTaskIndex();
+SegmentProperties properties = new 
SegmentProperties(footerList.get(0).getColumnInTable(),
--- End diff --

This should come after the initialization of the variable footerList.
Move it to line 72.




[GitHub] incubator-carbondata issue #635: [CARBONDATA-782]support SORT_COLUMNS

2017-03-30 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/635
  
retest this please




[GitHub] incubator-carbondata pull request #715: [CARBONDATA-782]support SORT_COLUMNS...

2017-03-30 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/715

[CARBONDATA-782]support SORT_COLUMNS 

The tasks of SORT_COLUMNS:

Support creating a table with the sort_columns property,
e.g. tblproperties('sort_columns' = 'col7,col3').
A table with the SORT_COLUMNS property will be sorted by SORT_COLUMNS. The 
order of the columns is decided by SORT_COLUMNS.
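
The ordering described above can be sketched as a comparator built from the property value. This assumes that 'col7,col3' means "sort by col7 first, then by col3" and that values are compared as strings; the class name, the `header` parameter and the row layout are hypothetical and not part of this PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SortColumnsSketch {

  // Sorts rows by the columns named in sortColumnsProperty (comma separated),
  // in the order they are listed. header gives the column name for each
  // position in a row. Values are compared as strings and must be non-null.
  static List<String[]> sortRows(List<String[]> rows, String[] header,
      String sortColumnsProperty) {
    List<String> names = Arrays.asList(header);
    Comparator<String[]> comparator = (a, b) -> 0;
    for (String column : sortColumnsProperty.split(",")) {
      final int index = names.indexOf(column.trim());
      if (index < 0) {
        throw new IllegalArgumentException("Unknown sort column: " + column);
      }
      // Each listed column only breaks ties left by the columns before it.
      comparator = comparator.thenComparing(row -> row[index]);
    }
    List<String[]> sorted = new ArrayList<>(rows);
    sorted.sort(comparator);
    return sorted;
  }
}
```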

Change the encoding rule of SORT_COLUMNS.
First, the column encoding rule stays consistent with the previous behavior.
Second, if a column in SORT_COLUMNS was previously a measure, it will now be 
created as a dimension. This dimension is a no-dictionary column (it would be 
better to use another direct-dictionary).
Third, the dimensions in SORT_COLUMNS have RLE and ROWID pages; the other 
dimensions have only RLE (not sorted).

The start/end key should be composed of SORT_COLUMNS.
SORT_COLUMNS is used to build the start/end key during data loading and select 
queries.
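
As a rough illustration of the start/end key idea: once the rows of a block are sorted by the sort columns, the sort-column values of the first and last row bound every row in between, so a query can skip a block whose range cannot contain its filter key. The sketch below assumes string-valued keys compared lexicographically; CarbonData's actual key format is not described in this thread, and all names are hypothetical.

```java
import java.util.List;

final class BlockKeyRangeSketch {
  final String[] startKey;
  final String[] endKey;

  private BlockKeyRangeSketch(String[] startKey, String[] endKey) {
    this.startKey = startKey;
    this.endKey = endKey;
  }

  // Rows must already be sorted by the sort columns. The start key comes from
  // the first row and the end key from the last row.
  static BlockKeyRangeSketch fromSortedRows(List<String[]> sortedRows,
      int[] sortColumnIndexes) {
    if (sortedRows.isEmpty()) {
      throw new IllegalArgumentException("A block needs at least one row");
    }
    String[] first = sortedRows.get(0);
    String[] last = sortedRows.get(sortedRows.size() - 1);
    return new BlockKeyRangeSketch(project(first, sortColumnIndexes),
        project(last, sortColumnIndexes));
  }

  // Extracts the sort-column values of a row, in sort-column order.
  private static String[] project(String[] row, int[] indexes) {
    String[] key = new String[indexes.length];
    for (int i = 0; i < indexes.length; i++) {
      key[i] = row[indexes[i]];
    }
    return key;
  }

  // Lexicographic comparison of two keys of equal length.
  private static int compare(String[] a, String[] b) {
    for (int i = 0; i < a.length; i++) {
      int c = a[i].compareTo(b[i]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  }

  // True if a row with this key could be in the block: startKey <= key <= endKey.
  boolean mayContain(String[] key) {
    return compare(startKey, key) <= 0 && compare(key, endKey) <= 0;
  }
}
```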

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata sort_columns

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/715.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #715


commit 287337170650e6ed19fb8d45e32df953d3a1d166
Author: QiangCai <qiang...@qq.com>
Date:   2017-03-02T09:48:54Z

sort columns






[GitHub] incubator-carbondata issue #635: [CARBONDATA-782]support SORT_COLUMNS

2017-03-30 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/635
  
Closing this PR.
I will raise another PR to merge into the 12-dev branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata pull request #635: [CARBONDATA-782]support SORT_COLUMNS

2017-03-30 Thread QiangCai
Github user QiangCai closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/635




[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Store one SegmentPropertie...

2017-03-27 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/659
  
@kumarvishal09 the dump picture shows the driver tree.
@watermen this PR only implements reuse of segment properties on the driver 
side. Can you try to do it on the executor side? For how the executor-side 
tree is built, please have a look at AbstractQueryExecutor.initQuery and 
BlockIndexStore.getAll.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata issue #715: [CARBONDATA-782]support SORT_COLUMNS

2017-03-31 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/715
  
done


---


[GitHub] incubator-carbondata issue #691: [CARBONDATA-783] Fixed message fails with o...

2017-03-24 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/691
  
Looks good.


---


[GitHub] incubator-carbondata issue #691: [CARBONDATA-783] Fixed message fails with o...

2017-03-24 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/691
  
retest this please


---


[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

2017-03-24 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/696
  
@watermen
It is unnecessary to store the carbondata file path in the carbonindex file.
During B-tree building, just use the carbondata file name to sort the
TableBlockInfo entries.
Please check CarbonUtil.readCarbonIndexFile and TableBlockInfo.compareTo.
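
The suggestion above, ordering blocks by file name rather than by full path, can be illustrated with a minimal standalone sketch. It is not the actual TableBlockInfo.compareTo or CarbonUtil code; the class BlockSortSketch, its methods, and the sample paths are all hypothetical and exist only to show the idea.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: not the real TableBlockInfo or CarbonUtil code.
class BlockSortSketch {

  // Returns the last path segment, i.e. the file name without its directory.
  static String fileName(String path) {
    int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
    return path.substring(slash + 1);
  }

  // Sorts block paths by file name only; the directory part is ignored,
  // so the full path does not need to be stored to get a stable order.
  static List<String> sortByFileName(List<String> paths) {
    List<String> sorted = new ArrayList<>(paths);
    sorted.sort(Comparator.comparing(BlockSortSketch::fileName));
    return sorted;
  }

  public static void main(String[] args) {
    List<String> paths = new ArrayList<>();
    paths.add("/store/b/part-1.carbondata");
    paths.add("/store/a/part-2.carbondata");
    paths.add("/store/c/part-0.carbondata");
    System.out.println(sortByFileName(paths));
  }
}
```

In this sketch the directory component never takes part in the comparison, which is the property the comment relies on.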


---


[GitHub] incubator-carbondata pull request #697: [CARBONDATA-708] Fixed Between and L...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/697#discussion_r107920260
  
--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/CarbonFilters.scala ---
@@ -111,16 +111,24 @@ object CarbonFilters {
 }
 
 def getCarbonLiteralExpression(name: String, value: Any): CarbonExpression = {
-  new CarbonLiteralExpression(value,
-CarbonScalaUtil.convertSparkToCarbonDataType(dataTypeOf(name)))
+  val dataTypeOfAttribute = CarbonScalaUtil.convertSparkToCarbonDataType(dataTypeOf(name))
+  val dataType = if (Option(value).isDefined
+ && dataTypeOfAttribute == DataType.STRING
+ && value.isInstanceOf[Double]) {
+DataType.DOUBLE
--- End diff --

What's the reason for changing the data type?
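
For readers following the question, the branch under review in this PR can be restated as a small standalone sketch. It only mirrors the condition visible in the diff; the enum SketchType and the class LiteralTypeSketch are hypothetical stand-ins, not CarbonData's DataType or CarbonFilters, and the sketch says nothing about why the change was made.

```java
// Hypothetical stand-ins: not CarbonData's DataType or CarbonFilters.
class LiteralTypeSketch {
  enum SketchType { STRING, DOUBLE, INT }

  // Mirrors the branch in the diff: a non-null Double literal compared against
  // a STRING attribute is typed as DOUBLE; otherwise the attribute type is kept.
  static SketchType literalType(SketchType attributeType, Object value) {
    if (value != null && attributeType == SketchType.STRING && value instanceof Double) {
      return SketchType.DOUBLE;
    }
    return attributeType;
  }

  public static void main(String[] args) {
    System.out.println(literalType(SketchType.STRING, 1.5));
    System.out.println(literalType(SketchType.STRING, "abc"));
    System.out.println(literalType(SketchType.INT, 1.5));
    System.out.println(literalType(SketchType.STRING, null));
  }
}
```

Only the combination of a STRING attribute and a Double value changes the resulting type; every other input falls through to the attribute's own type.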


---


[GitHub] incubator-carbondata pull request #697: [CARBONDATA-708] Fixed Between and L...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/697#discussion_r107919139
  
--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/CarbonFilters.scala ---
@@ -111,16 +111,24 @@ object CarbonFilters {
 }
 
 def getCarbonLiteralExpression(name: String, value: Any): CarbonExpression = {
-  new CarbonLiteralExpression(value,
-CarbonScalaUtil.convertSparkToCarbonDataType(dataTypeOf(name)))
+  val dataTypeOfAttribute = CarbonScalaUtil.convertSparkToCarbonDataType(dataTypeOf(name))
+  val dataType = if (Option(value).isDefined
+ && dataTypeOfAttribute == DataType.STRING
+ && value.isInstanceOf[Double]) {
+DataType.DOUBLE
+  }
+  else {
--- End diff --

Please take care of the code style here.


---


[GitHub] incubator-carbondata pull request #697: [CARBONDATA-708] Fixed Between and L...

2017-03-24 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:


https://github.com/apache/incubator-carbondata/pull/697#discussion_r107918771
  
--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/CarbonFilters.scala ---
@@ -111,16 +111,24 @@ object CarbonFilters {
 }
 
 def getCarbonLiteralExpression(name: String, value: Any): CarbonExpression = {
-  new CarbonLiteralExpression(value,
-CarbonScalaUtil.convertSparkToCarbonDataType(dataTypeOf(name)))
+  val dataTypeOfAttribute = CarbonScalaUtil.convertSparkToCarbonDataType(dataTypeOf(name))
+  val dataType = if (Option(value).isDefined
+ && dataTypeOfAttribute == DataType.STRING
+ && value.isInstanceOf[Double]) {
+DataType.DOUBLE
+  }
+  else {
+dataTypeOfAttribute
+  }
+  new CarbonLiteralExpression(value, dataType)
 }
 
 createFilter(predicate)
   }
 
 
   // Check out which filters can be pushed down to carbon, remaining can be handled in spark layer.
-  // Mostly dimension filters are only pushed down since it is faster in carbon.
+  // Mostly dimension filters are only pushed down since it is faster in carbo  n.
--- End diff --

redundant blank


---


[GitHub] incubator-carbondata issue #694: [CARBONDATA-814] bad record log file writin...

2017-03-24 Thread QiangCai
Github user QiangCai commented on the issue:

https://github.com/apache/incubator-carbondata/pull/694
  
retest this please


---

