[GitHub] incubator-carbondata issue #194: [CARBONDATA-270] Double data type value com...

2016-12-01 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/incubator-carbondata/pull/194
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata issue #194: [CARBONDATA-270] Double data type value com...

2016-12-01 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/incubator-carbondata/pull/194
  
PR build status
http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/733




[GitHub] incubator-carbondata pull request #336: [CARBONDATA-426] replace if else wit...

2016-12-01 Thread PallaviSingh1992
Github user PallaviSingh1992 commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/336#discussion_r90590085
  
--- Diff: 
integration/spark-common/src/main/java/org/apache/carbondata/spark/merger/CarbonCompactionUtil.java
 ---
@@ -142,20 +142,14 @@ private static void 
groupCorrespodingInfoBasedOnTask(TableBlockInfo info,
* @return
*/
   public static boolean isCompactionRequiredForTable(String 
metaFolderPath) {
-String minorCompactionStatusFile = metaFolderPath + 
CarbonCommonConstants.FILE_SEPARATOR
-+ CarbonCommonConstants.minorCompactionRequiredFile;
-
-String majorCompactionStatusFile = metaFolderPath + 
CarbonCommonConstants.FILE_SEPARATOR
-+ CarbonCommonConstants.majorCompactionRequiredFile;
+String statusFile = metaFolderPath + 
CarbonCommonConstants.FILE_SEPARATOR;
 try {
-  if (FileFactory.isFileExist(minorCompactionStatusFile,
-  FileFactory.getFileType(minorCompactionStatusFile)) || 
FileFactory
-  .isFileExist(majorCompactionStatusFile,
-  FileFactory.getFileType(majorCompactionStatusFile))) {
-return true;
-  }
+  return (FileFactory.isFileExist(statusFile + 
CarbonCommonConstants.minorCompactionRequiredFile,
--- End diff --

I will revert to the original code.




[GitHub] incubator-carbondata pull request #333: [CARBONDATA-471]Optimized no kettle ...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/333#discussion_r90588455
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/InputProcessorStepImpl.java
 ---
@@ -122,24 +139,52 @@ private boolean internalHasNext() {
   if (!hasNext) {
 // Check next iterator is available in the list.
 if (counter < inputIterators.size()) {
+  // close the old iterator
+  currentIterator.close();
   // Get the next iterator from the list.
   currentIterator = inputIterators.get(counter++);
+  // Initialize the new iterator
+  currentIterator.initialize();
   hasNext = internalHasNext();
 }
   }
   return hasNext;
 }
 
-@Override
-public CarbonRowBatch next() {
-  // Create batch and fill it.
-  CarbonRowBatch carbonRowBatch = new CarbonRowBatch();
-  int count = 0;
-  while (internalHasNext() && count < batchSize) {
-carbonRowBatch.addRow(new 
CarbonRow(rowParser.parseRow(currentIterator.next())));
-count++;
+@Override public CarbonRowBatch next() {
+  CarbonRowBatch result = null;
+  try {
+if (future == null) {
+  future = getCarbonRowBatch();
+}
+result = future.get();
+nextBatch = false;
+if (hasNext()) {
+  nextBatch = true;
+  future = getCarbonRowBatch();
+} else {
+  currentIterator.close();
+}
+  } catch (Exception e) {
--- End diff --

catch InterruptedException and ExecutionException only
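
A minimal sketch of this suggestion, not the PR's actual code: `Future.get()` declares exactly two checked exceptions, so the handler can name them instead of catching `Exception`. The class name `FutureGetSketch`, the helper `await`, and the choice to rethrow as `IllegalStateException` are all assumptions made for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; not part of CarbonData.
final class FutureGetSketch {

  /** Waits for the result, handling only the checked exceptions Future.get() declares. */
  static <T> T await(Future<T> future) {
    try {
      return future.get();
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can still observe the interruption.
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for batch", e);
    } catch (ExecutionException e) {
      // The task itself failed; surface its cause (the wrapper type is an assumption).
      throw new IllegalStateException("batch task failed", e.getCause());
    }
  }

  /** Small usage example: runs one task on a single-thread pool and awaits it. */
  static String demo() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      return await(pool.submit(() -> "batch-1"));
    } finally {
      pool.shutdown();
    }
  }
}
```

Unchecked exceptions, such as a programming error in the surrounding code, then propagate unchanged instead of being swallowed by a broad handler.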




[GitHub] incubator-carbondata pull request #333: [CARBONDATA-471]Optimized no kettle ...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/333#discussion_r90588425
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/InputProcessorStepImpl.java
 ---
@@ -122,24 +139,52 @@ private boolean internalHasNext() {
   if (!hasNext) {
 // Check next iterator is available in the list.
 if (counter < inputIterators.size()) {
+  // close the old iterator
+  currentIterator.close();
   // Get the next iterator from the list.
   currentIterator = inputIterators.get(counter++);
+  // Initialize the new iterator
+  currentIterator.initialize();
   hasNext = internalHasNext();
 }
   }
   return hasNext;
 }
 
-@Override
-public CarbonRowBatch next() {
-  // Create batch and fill it.
-  CarbonRowBatch carbonRowBatch = new CarbonRowBatch();
-  int count = 0;
-  while (internalHasNext() && count < batchSize) {
-carbonRowBatch.addRow(new 
CarbonRow(rowParser.parseRow(currentIterator.next())));
-count++;
+@Override public CarbonRowBatch next() {
+  CarbonRowBatch result = null;
+  try {
--- End diff --

limit the try scope to `future.get` only
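
One way to read this comment, sketched under assumptions: only `future.get()` sits inside the `try`, and scheduling the next batch happens after it. `PrefetchSketch`, its `Integer` rows, and `readBatch` are hypothetical stand-ins for the iterator in the diff, not the PR's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the batch iterator; rows are plain Integers here.
final class PrefetchSketch {
  private final Iterator<Integer> rows;
  private final int batchSize;
  private final ExecutorService pool;
  private Future<List<Integer>> future;

  PrefetchSketch(Iterator<Integer> rows, int batchSize, ExecutorService pool) {
    this.rows = rows;
    this.batchSize = batchSize;
    this.pool = pool;
    this.future = null;
  }

  /** Reads up to batchSize rows; runs on the pool thread. */
  private List<Integer> readBatch() {
    List<Integer> batch = new ArrayList<>();
    while (rows.hasNext() && batch.size() < batchSize) {
      batch.add(rows.next());
    }
    return batch;
  }

  private Future<List<Integer>> submitBatch() {
    Callable<List<Integer>> task = this::readBatch;
    return pool.submit(task);
  }

  boolean hasNext() {
    // While a batch is in flight, do not touch the underlying iterator.
    return future != null || rows.hasNext();
  }

  List<Integer> next() {
    if (future == null) {
      future = submitBatch();
    }
    List<Integer> result;
    try {
      // The only statement that can throw the two checked exceptions.
      result = future.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    }
    // Outside the try: prefetch the next batch while the caller processes this one.
    future = rows.hasNext() ? submitBatch() : null;
    return result;
  }

  static List<List<Integer>> demo() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      PrefetchSketch it = new PrefetchSketch(Arrays.asList(1, 2, 3, 4, 5).iterator(), 2, pool);
      List<List<Integer>> out = new ArrayList<>();
      while (it.hasNext()) {
        out.add(it.next());
      }
      return out;
    } finally {
      pool.shutdown();
    }
  }
}
```

Keeping the `try` this narrow makes it clear which call the handlers belong to, and failures in the follow-up logic are no longer caught by accident.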




[GitHub] incubator-carbondata pull request #333: [CARBONDATA-471]Optimized no kettle ...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/333#discussion_r90588408
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/InputProcessorStepImpl.java
 ---
@@ -122,24 +139,52 @@ private boolean internalHasNext() {
   if (!hasNext) {
 // Check next iterator is available in the list.
 if (counter < inputIterators.size()) {
+  // close the old iterator
+  currentIterator.close();
   // Get the next iterator from the list.
   currentIterator = inputIterators.get(counter++);
+  // Initialize the new iterator
+  currentIterator.initialize();
   hasNext = internalHasNext();
 }
   }
   return hasNext;
 }
 
-@Override
-public CarbonRowBatch next() {
-  // Create batch and fill it.
-  CarbonRowBatch carbonRowBatch = new CarbonRowBatch();
-  int count = 0;
-  while (internalHasNext() && count < batchSize) {
-carbonRowBatch.addRow(new 
CarbonRow(rowParser.parseRow(currentIterator.next())));
-count++;
+@Override public CarbonRowBatch next() {
--- End diff --

put override to previous line




[GitHub] incubator-carbondata pull request #333: [CARBONDATA-471]Optimized no kettle ...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/333#discussion_r90588348
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/InputProcessorStepImpl.java
 ---
@@ -122,24 +139,52 @@ private boolean internalHasNext() {
   if (!hasNext) {
 // Check next iterator is available in the list.
 if (counter < inputIterators.size()) {
+  // close the old iterator
+  currentIterator.close();
   // Get the next iterator from the list.
   currentIterator = inputIterators.get(counter++);
+  // Initialize the new iterator
+  currentIterator.initialize();
   hasNext = internalHasNext();
 }
   }
   return hasNext;
 }
 
-@Override
-public CarbonRowBatch next() {
-  // Create batch and fill it.
-  CarbonRowBatch carbonRowBatch = new CarbonRowBatch();
-  int count = 0;
-  while (internalHasNext() && count < batchSize) {
-carbonRowBatch.addRow(new 
CarbonRow(rowParser.parseRow(currentIterator.next())));
-count++;
+@Override public CarbonRowBatch next() {
+  CarbonRowBatch result = null;
+  try {
--- End diff --

put override to previous line




[GitHub] incubator-carbondata pull request #333: [CARBONDATA-471]Optimized no kettle ...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/333#discussion_r90588344
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/InputProcessorStepImpl.java
 ---
@@ -80,40 +87,50 @@ public void initialize() throws 
CarbonDataLoadingException {
 return iterators;
   }
 
-  @Override
-  protected CarbonRow processRow(CarbonRow row) {
+  @Override protected CarbonRow processRow(CarbonRow row) {
 return null;
   }
 
+  @Override public void close() {
+executorService.shutdown();
+  }
+
   /**
* This iterator wraps the list of iterators and it starts iterating the 
each
* iterator of the list one by one. It also parse the data while 
iterating it.
*/
   private static class InputProcessorIterator extends 
CarbonIterator {
 
-private List> inputIterators;
+private List> inputIterators;
 
-private Iterator currentIterator;
+private InputIterator currentIterator;
 
 private int counter;
 
 private int batchSize;
 
 private RowParser rowParser;
 
-public InputProcessorIterator(List> inputIterators,
-RowParser rowParser, int batchSize) {
+private Future future;
+
+private ExecutorService executorService;
+
+private boolean nextBatch = false;
+
+public InputProcessorIterator(List> 
inputIterators,
+RowParser rowParser, int batchSize, ExecutorService 
executorService) {
   this.inputIterators = inputIterators;
   this.batchSize = batchSize;
   this.rowParser = rowParser;
   this.counter = 0;
   // Get the first iterator from the list.
   currentIterator = inputIterators.get(counter++);
+  currentIterator.initialize();
+  this.executorService = executorService;
 }
 
-@Override
-public boolean hasNext() {
-  return internalHasNext();
+@Override public boolean hasNext() {
--- End diff --

put override to previous line




[GitHub] incubator-carbondata pull request #333: [CARBONDATA-471]Optimized no kettle ...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/333#discussion_r90588312
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/InputProcessorStepImpl.java
 ---
@@ -80,40 +87,50 @@ public void initialize() throws 
CarbonDataLoadingException {
 return iterators;
   }
 
-  @Override
-  protected CarbonRow processRow(CarbonRow row) {
+  @Override protected CarbonRow processRow(CarbonRow row) {
 return null;
   }
 
+  @Override public void close() {
+executorService.shutdown();
+  }
+
   /**
* This iterator wraps the list of iterators and it starts iterating the 
each
* iterator of the list one by one. It also parse the data while 
iterating it.
*/
   private static class InputProcessorIterator extends 
CarbonIterator {
 
-private List> inputIterators;
+private List> inputIterators;
 
-private Iterator currentIterator;
+private InputIterator currentIterator;
 
 private int counter;
 
 private int batchSize;
 
 private RowParser rowParser;
 
-public InputProcessorIterator(List> inputIterators,
-RowParser rowParser, int batchSize) {
+private Future future;
+
+private ExecutorService executorService;
+
+private boolean nextBatch = false;
--- End diff --

initialize in constructor, like counter




[GitHub] incubator-carbondata pull request #333: [CARBONDATA-471]Optimized no kettle ...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/333#discussion_r90587816
  
--- Diff: 
hadoop/src/main/java/org/apache/carbondata/hadoop/csv/CSVInputFormat.java ---
@@ -138,6 +140,17 @@ public static void setQuoteCharacter(String 
quoteCharacter, Configuration config
   }
 
   /**
+   * Sets the read buffer size to configuration.
+   * @param bufferSize
+   * @param configuration
+   */
+  public static void setReadBufferSize(String bufferSize, Configuration 
configuration) {
--- End diff --

why is bufferSize a String and not an int?
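
A hedged sketch of what an `int`-typed setter could look like. A plain `Map` stands in for Hadoop's `Configuration` so the example stays self-contained (the real class offers `setInt` and `getInt` for this purpose). The key name, the default value, and the class name are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a Map replaces Hadoop's Configuration here.
final class ReadBufferConfigSketch {
  // Assumed key and default, chosen only for illustration.
  static final String READ_BUFFER_SIZE = "sketch.csv.read.buffer.size";
  static final int DEFAULT_READ_BUFFER_SIZE = 65536;

  /** Stores the buffer size; taking an int rejects non-numeric input at compile time. */
  static void setReadBufferSize(Map<String, String> conf, int bufferSize) {
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive: " + bufferSize);
    }
    conf.put(READ_BUFFER_SIZE, Integer.toString(bufferSize));
  }

  /** Reads the buffer size back, falling back to the default when unset. */
  static int getReadBufferSize(Map<String, String> conf) {
    String raw = conf.get(READ_BUFFER_SIZE);
    return raw == null ? DEFAULT_READ_BUFFER_SIZE : Integer.parseInt(raw);
  }

  static int demo() {
    Map<String, String> conf = new HashMap<>();
    setReadBufferSize(conf, 1024);
    return getReadBufferSize(conf);
  }
}
```

With an `int` parameter, validation happens once at the setter rather than wherever the value is later parsed.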




[jira] [Resolved] (CARBONDATA-481) [SPARK2]fix late decoder and support whole stage code gen

2016-12-01 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-481.
-
Resolution: Fixed
  Assignee: QiangCai

> [SPARK2]fix late decoder and support whole stage code gen
> -
>
> Key: CARBONDATA-481
> URL: https://issues.apache.org/jira/browse/CARBONDATA-481
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 0.2.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 0.3.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-carbondata pull request #379: [CARBONDATA-481][SPARK2]fix late dec...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/379




[GitHub] incubator-carbondata issue #379: [CARBONDATA-481][SPARK2]fix late decoder an...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/379
  
LGTM




[GitHub] incubator-carbondata issue #379: [CARBONDATA-481][SPARK2]fix late decoder an...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/379
  
CI
http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/732/




[GitHub] incubator-carbondata pull request #380: Fix compatibility

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/380




[GitHub] incubator-carbondata pull request #380: Fix compatibility

2016-12-01 Thread jackylk
GitHub user jackylk opened a pull request:

https://github.com/apache/incubator-carbondata/pull/380

Fix compatibility



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata comp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/380.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #380


commit 7904716b9396b9ba660e4cb08ef0cba1821f3166
Author: jackylk 
Date:   2016-12-02T04:10:58Z

fix compatibility






[GitHub] incubator-carbondata issue #336: [CARBONDATA-426] replace if else with condi...

2016-12-01 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/incubator-carbondata/pull/336
  
Please rebase the code




[GitHub] incubator-carbondata pull request #194: [CARBONDATA-270] Double data type va...

2016-12-01 Thread sujith71955
Github user sujith71955 commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/194#discussion_r90505720
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ---
@@ -411,4 +411,24 @@ private static String parseStringToBigDecimal(String 
value, CarbonDimension dime
 }
 return null;
   }
+  /**
+   * This method will compare double values it will preserve
+   * the -0.0 and 0.0 equality as per == ,also preserve NaN equality check 
as per
+   * java.lang.Double.equals()
+   *
+   * @param d1 double value for equality check
+   * @param d2 double value for equality check
+   * @return boolean after comparing two double values.
+   */
+  public static int compareDoubleWithNan(Double d1, Double d2) {
+if ((d1.doubleValue() == d2.doubleValue()) || (Double.isNaN(d1) && 
Double.isNaN(d2))) {
+  return 0;
+}
+else if (d1 < d2) {
+  return -1;
+}
+else  {
--- End diff --

Yes, I think we can remove the unnecessary else statement itself.
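
For illustration, the method from the diff with the `else` branches dropped, as discussed. The diff is cut off before the last branch, so the final `return 1` is an assumption, and `DoubleCompareSketch` is a hypothetical wrapper class.

```java
// Hypothetical wrapper class; the method body follows the diff above.
final class DoubleCompareSketch {

  /**
   * Compares two doubles, treating -0.0 and 0.0 as equal (as == does)
   * and NaN as equal to NaN (as Double.equals() does).
   */
  static int compareDoubleWithNan(Double d1, Double d2) {
    if (d1.doubleValue() == d2.doubleValue() || (Double.isNaN(d1) && Double.isNaN(d2))) {
      return 0;
    }
    if (d1 < d2) {
      return -1;
    }
    // Assumed final branch: the diff is truncated before this point.
    return 1;
  }
}
```

Each branch returns, so no `else` is needed. Note that when exactly one argument is NaN, `d1 < d2` is false in either order, so this version returns 1 both ways. Callers that sort with it may want to handle that case separately.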




[GitHub] incubator-carbondata pull request #378: [CARBONDATA-480] Add file format ver...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/378




[jira] [Resolved] (CARBONDATA-480) Add file format version enum

2016-12-01 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-480.

Resolution: Fixed
  Assignee: Jacky Li

> Add file format version enum
> 
>
> Key: CARBONDATA-480
> URL: https://issues.apache.org/jira/browse/CARBONDATA-480
> Project: CarbonData
>  Issue Type: Improvement
>Affects Versions: 0.2.0-incubating
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 0.3.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Add file format version enum instead of using short value





[GitHub] incubator-carbondata issue #378: [CARBONDATA-480] Add file format version en...

2016-12-01 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/incubator-carbondata/pull/378
  
LGTM




[GitHub] incubator-carbondata issue #378: [CARBONDATA-480] Add file format version en...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/378
  
CI passed
http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/730/




[GitHub] incubator-carbondata pull request #366: [CARBONDATA-368]Insert into carbon t...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/366




[jira] [Created] (CARBONDATA-480) Add file format version enum

2016-12-01 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-480:
---

 Summary: Add file format version enum
 Key: CARBONDATA-480
 URL: https://issues.apache.org/jira/browse/CARBONDATA-480
 Project: CarbonData
  Issue Type: Improvement
Affects Versions: 0.2.0-incubating
Reporter: Jacky Li
 Fix For: 0.3.0-incubating


Add file format version enum instead of using short value





[GitHub] incubator-carbondata pull request #378: [CARBONDATA-480] Add file format ver...

2016-12-01 Thread jackylk
GitHub user jackylk opened a pull request:

https://github.com/apache/incubator-carbondata/pull/378

[CARBONDATA-480] Add file format version enum

Add file format version enum instead of using short value
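
A rough sketch of the idea, since the thread does not show the PR's code: an enum that wraps the short value and rejects unknown numbers. The name `FileFormatVersionSketch`, the constants `V1` and `V2`, and the accessor names are hypothetical.

```java
// Hypothetical enum; names and version numbers are assumptions.
enum FileFormatVersionSketch {
  V1((short) 1),
  V2((short) 2);

  private final short number;

  FileFormatVersionSketch(short number) {
    this.number = number;
  }

  /** The short value as it would be written to a file header. */
  short number() {
    return number;
  }

  /** Maps a stored short back to the enum, failing loudly on unknown versions. */
  static FileFormatVersionSketch fromNumber(short number) {
    for (FileFormatVersionSketch v : values()) {
      if (v.number == number) {
        return v;
      }
    }
    throw new IllegalArgumentException("unknown file format version: " + number);
  }
}
```

Compared with a bare `short`, the enum keeps the set of valid versions in one place and lets the compiler check `switch` statements over them.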

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata fixversion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/378.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #378


commit d9413c7e13a5e9e15c1d96b99598a96ab7da5979
Author: jackylk 
Date:   2016-12-01T15:02:16Z

add file format version enum






[jira] [Created] (CARBONDATA-479) Guarantee consistency for keyword LOCAL and file path in data loading command

2016-12-01 Thread Lionx (JIRA)
Lionx created CARBONDATA-479:


 Summary: Guarantee consistency for keyword LOCAL and file path in 
data loading command
 Key: CARBONDATA-479
 URL: https://issues.apache.org/jira/browse/CARBONDATA-479
 Project: CarbonData
  Issue Type: Bug
Reporter: Lionx
Priority: Minor


In CarbonSqlParser.scala,
protected lazy val loadDataNew: Parser[LogicalPlan] =
LOAD ~> DATA ~> opt(LOCAL) ~> INPATH ~> stringLit ~ opt(OVERWRITE) ~
(INTO ~> TABLE ~> (ident <~ ".").? ~ ident) ~
(OPTIONS ~> "(" ~> repsep(loadOptions, ",") <~ ")").? <~ opt(";") ^^ {
  case filePath ~ isOverwrite ~ table ~ optionsList =>
val (databaseNameOp, tableName) = table match {
  case databaseName ~ tableName => (databaseName, 
tableName.toLowerCase())
}
if (optionsList.isDefined) {
  validateOptions(optionsList)
}
val optionsMap = optionsList.getOrElse(List.empty[(String, 
String)]).toMap
LoadTable(databaseNameOp, tableName, filePath, Seq(), optionsMap,
  isOverwrite.isDefined)
}

It seems that the keyword LOCAL has no effect: whether data is loaded from HDFS or 
from a local file depends only on the path.
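
One possible consistency check, sketched in Java under assumptions; the issue does not say how the fix should work. The rule here (LOCAL must agree with an explicit URI scheme, and scheme-less paths are accepted either way) and the names `LoadPathCheckSketch` and `isConsistent` are hypothetical.

```java
import java.net.URI;

// Hypothetical check; not taken from CarbonSqlParser.
final class LoadPathCheckSketch {

  /**
   * Returns false when the LOCAL keyword contradicts an explicit scheme,
   * e.g. LOCAL together with an hdfs:// path. Paths without a scheme are
   * treated as ambiguous and always accepted (an assumption of this sketch).
   */
  static boolean isConsistent(boolean localKeyword, String path) {
    // URI.create throws IllegalArgumentException for malformed input such as unescaped spaces.
    String scheme = URI.create(path).getScheme();
    if (scheme == null) {
      return true;
    }
    return localKeyword == scheme.equalsIgnoreCase("file");
  }
}
```

For example, `isConsistent(true, "hdfs://nn:9000/data.csv")` is false, while the same path without LOCAL is accepted.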







[GitHub] incubator-carbondata issue #366: [CARBONDATA-368]Insert into carbon table fe...

2016-12-01 Thread ashokblend
Github user ashokblend commented on the issue:

https://github.com/apache/incubator-carbondata/pull/366
  
rebase done. please merge it.




[GitHub] incubator-carbondata pull request #377: [CARBONDATA-478][SPARK2]Spark2 modul...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/377




[GitHub] incubator-carbondata issue #377: [CARBONDATA-478]Spark2 module should have d...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/377
  
LGTM





[GitHub] incubator-carbondata pull request #377: [CARBONDATA-478]Spark2 module should...

2016-12-01 Thread QiangCai
GitHub user QiangCai opened a pull request:

https://github.com/apache/incubator-carbondata/pull/377

[CARBONDATA-478]Spark2 module should have different SparkRowReadSupportImpl 
with spark1



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/QiangCai/incubator-carbondata master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/377.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #377


commit 3bc55a38c5d645ca1e07381910692ac0b2bb6297
Author: QiangCai 
Date:   2016-12-01T11:32:04Z

fixLatedecoderIssueForSpark2






[GitHub] incubator-carbondata issue #304: [WIP]Fixed issue with coalasce function.

2016-12-01 Thread ashokblend
Github user ashokblend commented on the issue:

https://github.com/apache/incubator-carbondata/pull/304
  
Closing this, as it's not required here.




[GitHub] incubator-carbondata pull request #304: [WIP]Fixed issue with coalasce funct...

2016-12-01 Thread ashokblend
Github user ashokblend closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/304




[jira] [Resolved] (CARBONDATA-458) Improving carbon first time query performance

2016-12-01 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-458.
-
   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

>  Improving carbon first time query performance
> --
>
> Key: CARBONDATA-458
> URL: https://issues.apache.org/jira/browse/CARBONDATA-458
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core, data-load, data-query
>Reporter: kumar vishal
>Assignee: kumar vishal
> Fix For: 0.3.0-incubating
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Improving carbon first time query performance
> Reason:
> 1. When the file system cache has been cleared, files must be read from disk 
> and cached again, which makes reads slower
> 2. On a first-time query, carbon has to read the footer from the data file 
> to build the btree
> 3. Carbon reads more footer data than required (data chunk)
> 4. Carbon performs many random seeks because column data (data page, rle, 
> inverted index) is not stored together.
> Solution: 
> 1. Improve block loading time. This can be done by removing the data chunk from 
> blockletInfo and storing only the offset and length of the data chunk
> 2. Compress the presence meta bitset stored for null values of measure columns 
> using snappy 
> 3. Store the metadata and data of a column together and read them together; this 
> reduces random seeks and improves IO
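
To make solution item 1 concrete, here is a small sketch of keeping only an offset and a length in metadata and reading the chunk on demand. It is an illustration under assumptions, not CarbonData's format: `ChunkRefSketch` is a hypothetical name, and an in-memory `ByteBuffer` stands in for the data file.

```java
import java.nio.ByteBuffer;

// Hypothetical reference to a data chunk: just where it starts and how long it is.
final class ChunkRefSketch {
  final long offset;
  final int length;

  ChunkRefSketch(long offset, int length) {
    this.offset = offset;
    this.length = length;
  }

  /** Reads the referenced bytes only when they are actually needed. */
  byte[] read(ByteBuffer file) {
    // duplicate() gives an independent position, so the caller's buffer is untouched.
    ByteBuffer view = file.duplicate();
    view.position(Math.toIntExact(offset));
    byte[] out = new byte[length];
    view.get(out);
    return out;
  }
}
```

The metadata then costs a fixed 12 bytes per chunk (a long and an int) no matter how large the chunk is, which is the kind of saving the issue is aiming for when loading block metadata.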



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CARBONDATA-100) BigInt compression

2016-12-01 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-100.
-
   Resolution: Fixed
 Assignee: Ashok Kumar
Fix Version/s: 0.3.0-incubating

> BigInt compression
> --
>
> Key: CARBONDATA-100
> URL: https://issues.apache.org/jira/browse/CARBONDATA-100
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ashok Kumar
>Assignee: Ashok Kumar
> Fix For: 0.3.0-incubating
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In Carbon, bigint is stored as long, and no compression is applied to the data.
> A change is required to compress bigint data as we do for double
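
One common way to compress long values is to subtract the column minimum and store the offsets in the narrowest integer type that can hold them. The sketch below illustrates that idea only. It is an assumption about the general approach, not the logic of `ValueCompressionUtil`, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch, not CarbonData's ValueCompressionUtil: each long is
// stored as an offset from the column minimum, in the narrowest type that fits.
final class BigIntNarrowingSketch {

  /** Bytes needed per value (1, 2, 4 or 8) once the minimum is subtracted. */
  static int bytesPerValue(long[] values) {
    long min = Arrays.stream(values).min().orElse(0L);
    long max = Arrays.stream(values).max().orElse(0L);
    long range = max - min;
    if (range < 0) {
      return 8; // max - min overflowed a long, so no narrowing is possible
    }
    if (range <= Byte.MAX_VALUE) {
      return 1;
    }
    if (range <= Short.MAX_VALUE) {
      return 2;
    }
    if (range <= Integer.MAX_VALUE) {
      return 4;
    }
    return 8;
  }

  /** Packs values as byte offsets from min; the caller must ensure bytesPerValue is 1. */
  static byte[] packAsBytes(long[] values, long min) {
    byte[] packed = new byte[values.length];
    for (int i = 0; i < values.length; i++) {
      packed[i] = (byte) (values[i] - min);
    }
    return packed;
  }

  /** Restores the original longs from byte offsets. */
  static long[] unpackBytes(byte[] packed, long min) {
    long[] values = new long[packed.length];
    for (int i = 0; i < packed.length; i++) {
      values[i] = min + packed[i];
    }
    return values;
  }
}
```

For a column whose values all lie within 127 of each other, this stores one byte per value instead of eight, plus the minimum once.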





[GitHub] incubator-carbondata issue #338: [CARBONDATA-100]Implement BigInt value comp...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/338
  
LGTM




[GitHub] incubator-carbondata pull request #338: [CARBONDATA-100]Implement BigInt val...

2016-12-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/338




[GitHub] incubator-carbondata issue #338: [CARBONDATA-100]Implement BigInt value comp...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/338
  
LGTM




[GitHub] incubator-carbondata issue #338: [CARBONDATA-100]Implement BigInt value comp...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/338
  
Thanks for working on this




[GitHub] incubator-carbondata issue #338: [CARBONDATA-100]Implement BigInt value comp...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/338
  
LGTM




[jira] [Commented] (CARBONDATA-458) Improving carbon first time query performance

2016-12-01 Thread Jacky Li (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711514#comment-15711514
 ] 

Jacky Li commented on CARBONDATA-458:
-

This work is not about merging footers into a central file; it is about 
reorganizing the internal structure of the carbon file to make the first-time 
query faster. I think the biggest bottlenecks are the 3rd and 4th of the 
reasons Vishal has pointed out.

3. Carbon reads more footer data than is required (data chunk)
4. Many random seeks happen in Carbon because column data (data 
page, rle, inverted index) is not stored together.

>  Improving carbon first time query performance
> --
>
> Key: CARBONDATA-458
> URL: https://issues.apache.org/jira/browse/CARBONDATA-458
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core, data-load, data-query
>Reporter: kumar vishal
>Assignee: kumar vishal
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Improving carbon first time query performance
> Reason:
> 1. Because the file system cache is cleared, reading the file and caching 
> it is slower
> 2. On a first-time query, Carbon has to read the footer from the data 
> file to build the btree
> 3. Carbon reads more footer data than is required (data chunk)
> 4. Many random seeks happen in Carbon because column data (data 
> page, rle, inverted index) is not stored together.
> Solution: 
> 1. Improve block loading time. This can be done by removing the data chunk from 
> blockletInfo and storing only the offset and length of the data chunk
> 2. Compress the presence meta bitset stored for null values of measure columns 
> using snappy 
> 3. Store the metadata and data of a column together and read them together; this 
> reduces random seeks and improves IO
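
Solution item 2 in the quoted description compresses the bitset that marks null rows. A minimal sketch of that round trip follows. Snappy is not part of the JDK, so this sketch swaps in `java.util.zip.Deflater` and `Inflater` as a stand-in codec; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.BitSet;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: DEFLATE stands in for snappy, which the issue proposes.
final class NullBitsetCompressionSketch {

  /** Serializes the null-row bitset and compresses the bytes. */
  static byte[] compress(BitSet nullRows) {
    Deflater deflater = new Deflater();
    deflater.setInput(nullRows.toByteArray());
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[256];
    while (!deflater.finished()) {
      int n = deflater.deflate(buffer);
      out.write(buffer, 0, n);
    }
    deflater.end();
    return out.toByteArray();
  }

  /** Decompresses the bytes and rebuilds the bitset. */
  static BitSet decompress(byte[] compressed) {
    Inflater inflater = new Inflater();
    inflater.setInput(compressed);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[256];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(buffer);
        if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
          throw new IllegalArgumentException("truncated or invalid input");
        }
        out.write(buffer, 0, n);
      }
    } catch (DataFormatException e) {
      throw new IllegalArgumentException(e);
    } finally {
      inflater.end();
    }
    return BitSet.valueOf(out.toByteArray());
  }
}
```

A measure column with few nulls produces a bitset that is mostly zero bytes, which is why a general-purpose codec shrinks it so much.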





[GitHub] incubator-carbondata issue #265: [CARBONDATA-458]Improving First time query ...

2016-12-01 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/incubator-carbondata/pull/265
  
LGTM




[GitHub] incubator-carbondata pull request #376: [WIP]TO support insert 1 line into c...

2016-12-01 Thread Zhangshunyu
Github user Zhangshunyu closed the pull request at:

https://github.com/apache/incubator-carbondata/pull/376




[GitHub] incubator-carbondata issue #265: [CARBONDATA-458]Improving First time query ...

2016-12-01 Thread kumarvishal09
Github user kumarvishal09 commented on the issue:

https://github.com/apache/incubator-carbondata/pull/265
  

http://136.243.101.176:8080/job/ApacheCarbonManualPRBuilder/org.apache.carbondata$carbondata-spark/719/testReport/




[GitHub] incubator-carbondata pull request #376: [WIP]TO support insert 1 line into c...

2016-12-01 Thread Zhangshunyu
GitHub user Zhangshunyu opened a pull request:

https://github.com/apache/incubator-carbondata/pull/376

[WIP]TO support insert 1 line into carbon table.

WIP

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Zhangshunyu/incubator-carbondata insert1line

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-carbondata/pull/376.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #376


commit 43c363315ff980f02fe060ad0c25a4d028d463f7
Author: Zhangshunyu 
Date:   2016-12-01T08:20:04Z

To support insert into one line

commit 600fc29e24c9766a63f239e543ca23ead53c235e
Author: Zhangshunyu 
Date:   2016-12-01T08:20:59Z

To support insert into one line

commit 3b0b0bc16ff1ffe0e59d338ec94d36a3317e4a1e
Author: Zhangshunyu 
Date:   2016-12-01T08:24:18Z

To support insert into one line






[GitHub] incubator-carbondata pull request #265: [CARBONDATA-458]Improving First time...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/265#discussion_r90400739
  
--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
@@ -340,17 +341,17 @@ private Expression getFilterPredicates(Configuration configuration) {
   }
   resultFilterredBlocks.addAll(filterredBlocks);
 }
-statistic.addStatistics(QueryStatisticsConstants.LOAD_BLOCKS_DRIVER,
-System.currentTimeMillis());
+statistic
+.addStatistics(QueryStatisticsConstants.LOAD_BLOCKS_DRIVER, System.currentTimeMillis());
--- End diff --

no need to 




[GitHub] incubator-carbondata pull request #265: [CARBONDATA-458]Improving First time...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/265#discussion_r90400632
  
--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonInputFormat.java ---
@@ -193,8 +203,7 @@ public static void setSegmentsToAccess(Configuration configuration, List
* @return List list of CarbonInputSplit
* @throws IOException
*/
-  @Override
-  public List getSplits(JobContext job) throws IOException {
+  @Override public List getSplits(JobContext job) throws IOException {
--- End diff --

move `Override` to previous line




[GitHub] incubator-carbondata pull request #265: [CARBONDATA-458]Improving First time...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/265#discussion_r90399963
  
--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -814,85 +810,86 @@
* Rocord size in case of compaction.
*/
   public static final int COMPACTION_INMEMORY_RECORD_SIZE = 12;
-
-  /**
-   * If the level 2 compaction is done in minor then new compacted segment will end with .2
-   */
-  public static String LEVEL2_COMPACTION_INDEX = ".2";
-
-  /**
-   * Indicates compaction
-   */
-  public static String COMPACTION_KEY_WORD = "COMPACTION";
-
   /**
* hdfs temporary directory key
*/
   public static final String HDFS_TEMP_LOCATION = "hadoop.tmp.dir";
-
   /**
* zookeeper url key
*/
   public static final String ZOOKEEPER_URL = "spark.deploy.zookeeper.url";
-
   /**
* configure the minimum blocklet size eligible for blocklet distribution
*/
   public static final String CARBON_BLOCKLETDISTRIBUTION_MIN_REQUIRED_SIZE =
   "carbon.blockletdistribution.min.blocklet.size";
-
   /**
* default blocklet size eligible for blocklet distribution
*/
   public static final int DEFAULT_CARBON_BLOCKLETDISTRIBUTION_MIN_REQUIRED_SIZE = 2;
-
+  /**
+   * This batch size is used to send rows from load step to another step in batches.
+   */
+  public static final String DATA_LOAD_BATCH_SIZE = "DATA_LOAD_BATCH_SIZE";
+  /**
+   * Default size of data load batch size.
+   */
+  public static final String DATA_LOAD_BATCH_SIZE_DEFAULT = "1000";
+  /**
+   * carbon data file version property
+   */
+  public static final String CARBON_DATA_FILE_VERSION = "carbon.data.file.version";
+  /**
+   * current data file version
+   */
+  public static final short CARBON_DATA_FILE_CURRENT_VERSION = 2;
--- End diff --

change 2 to enum also
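
The review comment above asks for the bare `2` to become an enum. A minimal sketch of what that could look like follows. `DataFileVersion`, `number()`, `current()` and `fromNumber()` are hypothetical names chosen for illustration, not identifiers from the pull request.

```java
// Hypothetical sketch: an enum replacing the short constant
// CARBON_DATA_FILE_CURRENT_VERSION = 2 shown in the diff.
enum DataFileVersion {
  V1((short) 1),
  V2((short) 2);

  private final short number;

  DataFileVersion(short number) {
    this.number = number;
  }

  /** The numeric value that would be written to, or read from, the file. */
  short number() {
    return number;
  }

  /** The version used when writing new files (2 in the diff above). */
  static DataFileVersion current() {
    return V2;
  }

  /** Maps a stored number back to a version, rejecting unknown values. */
  static DataFileVersion fromNumber(short number) {
    for (DataFileVersion version : values()) {
      if (version.number == number) {
        return version;
      }
    }
    throw new IllegalArgumentException("Unsupported data file version: " + number);
  }
}
```

An enum lets callers switch on the version and makes an unknown value fail at the point where it is parsed, instead of flowing through the code as an unchecked short.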




[GitHub] incubator-carbondata pull request #338: [CARBONDATA-100]Implement BigInt val...

2016-12-01 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/incubator-carbondata/pull/338#discussion_r90398660
  
--- Diff: core/src/main/java/org/apache/carbondata/core/util/ValueCompressionUtil.java ---
@@ -243,6 +261,20 @@ public static Object getCompressedValues(COMPRESSION_TYPE compType, long[] value
 }
   }
 
+  /**
+   *
--- End diff --

please describe this function

