[GitHub] carbondata pull request #2836: [CARBONDATA-3027] Increase unsafe working mem...

2018-10-18 Thread xubo245
Github user xubo245 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2836#discussion_r226548201
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1234,7 +1234,7 @@
 
   @CarbonProperty
   public static final String UNSAFE_WORKING_MEMORY_IN_MB = 
"carbon.unsafe.working.memory.in.mb";
-  public static final String UNSAFE_WORKING_MEMORY_IN_MB_DEFAULT = "512";
+  public static final String UNSAFE_WORKING_MEMORY_IN_MB_DEFAULT = "2048";
--- End diff --

I don't know.


---


[GitHub] carbondata pull request #2829: [CARBONDATA-3025]add more metadata in carbon ...

2018-10-18 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2829#discussion_r226546096
  
--- Diff: format/src/main/thrift/carbondata.thrift ---
@@ -206,6 +206,7 @@ struct FileFooter3{
 4: optional list<BlockletInfo3> blocklet_info_list3;   // Information about blocklets of all columns in this file for V3 format
 5: optional dictionary.ColumnDictionaryChunk dictionary; // Blocklet local dictionary
 6: optional bool is_sort; // True if the data is sorted in this file, it is used for compaction to decide whether to use merge sort or not
+7: optional map<string, string> extra_info; // written_by records who wrote the file; it can be the application name or the SDK, plus the version in which this carbondata file was written
--- End diff --

Since this is optional and we will put a lot of extra information in the 
footer, I think we can provide one general interface to set and get this info, 
which means we do not need separate 'writtenBy' and 'setVersion' interfaces. 
If we follow that pattern, the number of interfaces will keep growing.

In my opinion, we only need a single setExtraInfo/getExtraInfo interface 
that accepts/returns a map.
Moreover, since this extraInfo is optional, you do not need to set it in all 
the test cases; you can keep it to the test cases that need it and avoid too many 
changes.
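
For illustration, a minimal sketch of what a single map-based extra-info 
interface could look like; the class, method, and key names here are hypothetical, 
not the actual CarbonWriterBuilder API:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: one generic extra-info interface instead of
// separate writtenBy/setVersion style setters.
public class FooterExtraInfoExample {

  private final Map<String, String> extraInfo = new HashMap<>();

  // One setter accepting a map; callers decide which keys to put in.
  public FooterExtraInfoExample setExtraInfo(Map<String, String> info) {
    this.extraInfo.putAll(info);
    return this;
  }

  // One getter returning the map; readers look up whatever keys they need.
  public Map<String, String> getExtraInfo() {
    return extraInfo;
  }

  public static void main(String[] args) {
    Map<String, String> info = new HashMap<>();
    info.put("written_by", "MyApplication");   // hypothetical key names
    info.put("version", "1.5.0");
    FooterExtraInfoExample footer = new FooterExtraInfoExample().setExtraInfo(info);
    System.out.println(footer.getExtraInfo());
  }
}
```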


---


[GitHub] carbondata pull request #2836: [CARBONDATA-3027] Increase unsafe working mem...

2018-10-18 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2836#discussion_r226544482
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1234,7 +1234,7 @@
 
   @CarbonProperty
   public static final String UNSAFE_WORKING_MEMORY_IN_MB = 
"carbon.unsafe.working.memory.in.mb";
-  public static final String UNSAFE_WORKING_MEMORY_IN_MB_DEFAULT = "512";
+  public static final String UNSAFE_WORKING_MEMORY_IN_MB_DEFAULT = "2048";
--- End diff --

What will happen if the user does not have that much memory?
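
For context, a user on a memory-constrained machine can still override the 
default back down; a minimal sketch, assuming the usual CarbonProperties entry 
point (the same key can typically also be set in carbon.properties):

```java
import org.apache.carbondata.core.constants.CarbonCommonConstants;
import org.apache.carbondata.core.util.CarbonProperties;

public class UnsafeMemoryConfigExample {
  public static void main(String[] args) {
    // Override the unsafe working memory back down to 512 MB for
    // memory-constrained environments (value is in MB, passed as a string).
    CarbonProperties.getInstance()
        .addProperty(CarbonCommonConstants.UNSAFE_WORKING_MEMORY_IN_MB, "512");
  }
}
```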


---


[GitHub] carbondata issue #2835: [CARBONDATA-3029][Test] Fix errors in spark datasour...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2835
  
Build Success with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9141/



---


[GitHub] carbondata issue #2836: [CARBONDATA-3027] Increase unsafe working memory def...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2836
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1075/



---


[GitHub] carbondata issue #2836: [CARBONDATA-3027] Increase unsafe working memory def...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2836
  
Build Success with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9142/



---


[GitHub] carbondata issue #2835: [CARBONDATA-3029][Test] Fix errors in spark datasour...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2835
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1074/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1073/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Success with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9140/



---


[GitHub] carbondata issue #2836: [CARBONDATA-3027] Increase unsafe working memory def...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2836
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/877/



---


[GitHub] carbondata issue #2834: [CARBONDATA-3028][32k] Fix bugs in spark file format...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2834
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1072/



---


[GitHub] carbondata issue #2835: [CARBONDATA-3029][Test] Fix errors in spark datasour...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2835
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/876/



---


[GitHub] carbondata issue #2834: [CARBONDATA-3028][32k] Fix bugs in spark file format...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2834
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9139/



---


[GitHub] carbondata pull request #2836: [CARBONDATA-3027] Increase unsafe working mem...

2018-10-18 Thread xubo245
GitHub user xubo245 opened a pull request:

https://github.com/apache/carbondata/pull/2836

[CARBONDATA-3027] Increase unsafe working memory default size and add log 
file for SDK

[CARBONDATA-3027] Increase unsafe working memory default size and add log 
file for SDK

1. Increase the unsafe working memory default size from 512 MB to 2048 MB
2. Add a log file for the SDK

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 No
 - [ ] Any backward compatibility impacted?
 No
 - [ ] Document update required?
No
 - [ ] Testing done
 No
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
No


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xubo245/carbondata 
CARBONDATA-3027MemoryException

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2836.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2836


commit 90ea4ff1f73ac04cb035618a70b1458049fa0e86
Author: xubo245 
Date:   2018-10-19T03:50:40Z

[CARBONDATA-3027] Increase unsafe working memory default size and add log 
file for SDK
1.Increase unsafe working memory default size from 512m to 2048m
2.add log file for SDK




---


[GitHub] carbondata pull request #2835: [CARBONDATA-3029][Test] Fix errors in spark d...

2018-10-18 Thread xuchuanyin
GitHub user xuchuanyin opened a pull request:

https://github.com/apache/carbondata/pull/2835

[CARBONDATA-3029][Test] Fix errors in spark datasource tests in windows env

In the current SparkCarbonDataSourceTest, the path specified when creating
the table and the writer in a Windows env looks like '\D:\xx\xx', which causes
test failures such as "java.lang.IllegalArgumentException: Can not create
a Path from an empty string".

In this commit, we fix this problem by normalizing the path and
converting the separators to Unix style using carbon's FileFactory.
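
As a rough illustration of the normalization idea (plain Java, not the actual 
FileFactory call):

```java
public class WindowsPathNormalizeExample {

  // Sketch only: turn a path such as "\D:\tmp\carbon" into "D:/tmp/carbon"
  // so that Hadoop's Path can parse it.
  static String toUnixStyle(String path) {
    String p = path.replace("\\", "/");
    // drop a stray leading slash in front of a Windows drive letter, e.g. "/D:/..."
    if (p.matches("^/[a-zA-Z]:/.*")) {
      p = p.substring(1);
    }
    return p;
  }

  public static void main(String[] args) {
    System.out.println(toUnixStyle("\\D:\\tmp\\carbon")); // D:/tmp/carbon
  }
}
```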

Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [x] Any interfaces changed?
 `NO`
 - [x] Any backward compatibility impacted?
  `NO`
 - [x] Document update required?
 `NO`
 - [x] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
`Updated tests`
- How it is tested? Please attach test report.
`Tested in local machine`
- Is it a performance related change? Please attach the performance 
test report.
`NA`
- Any additional information to help reviewers in testing this 
change.
`NA`
   
 - [x] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
`NA`


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuchuanyin/carbondata 
181019_bug_spark_ds_windows_error

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2835.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2835


commit 4ad528ef7fa289584d653c4dbb16c5c5a2b2f217
Author: xuchuanyin 
Date:   2018-10-19T03:42:30Z

Fix errors in spark datasource tests in windows env

In current SparkCarbonDataSourceTest, the path specified in creating
table and writer in windows env looks like '\D:\xx\xx', this will cause
test failure such as "java.lang.IllegalArgumentException: Can not create
a Path from an empty string".

Here in this commit, we fixed this problem by normalizing the path and
convert the separator in path to unix style using carbon's FileFactory.




---


[jira] [Created] (CARBONDATA-3029) Failed to run spark data source test cases in windows env

2018-10-18 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3029:
--

 Summary: Failed to run spark data source test cases in windows env
 Key: CARBONDATA-3029
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3029
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/875/



---


[GitHub] carbondata issue #2834: [CARBONDATA-3028][32k] Fix bugs in spark file format...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2834
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/874/



---


[GitHub] carbondata pull request #2834: [CARBONDATA-3028][32k] Fix bugs in spark file...

2018-10-18 Thread xuchuanyin
GitHub user xuchuanyin opened a pull request:

https://github.com/apache/carbondata/pull/2834

[CARBONDATA-3028][32k] Fix bugs in spark file format table with multiple 
longstringcolumns

If we create a spark file format table with multiple long string columns
and the long_string_columns option contains blank characters, queries on
that table will fail because the correct varchar columns are not recognized.
The root cause is that carbondata did not trim the blanks
in long_string_columns while recognizing the varchar columns.
Be sure to do all of the following checklist to help us incorporate 
your contribution quickly and easily:

 - [ ] Any interfaces changed?
 
 - [ ] Any backward compatibility impacted?
 
 - [ ] Document update required?

 - [ ] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
   
 - [ ] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xuchuanyin/carbondata 
181019_bug_write_longstring_sparkfileformat

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2834.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2834


commit de9768d22a7f019e8019ba4e27d813e44795811f
Author: xuchuanyin 
Date:   2018-10-19T02:37:23Z

Fix bugs in spark file format table with blanks in longstringcolumns

If we create a spark file format table with multiple longstringcolumns
and the option long_string_columns contains blank characters, the
query on that table will fail, cause it didn't recognize the correct
varchar columns. The root cause is that carbondata didn't trim the blank
in long_string_columns while it recognizing the varchar columns.




---


[GitHub] carbondata issue #2832: [CARBONDATA-3021][Streaming] Fix unsupported data ty...

2018-10-18 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2832
  
LGTM


---


[jira] [Created] (CARBONDATA-3028) failed query spark file format table when there are blanks in long_string_columns

2018-10-18 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3028:
--

 Summary: failed query spark file format table when there are 
blanks in long_string_columns
 Key: CARBONDATA-3028
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3028
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] carbondata pull request #2824: [CARBONDATA-3008] Optimize default value for ...

2018-10-18 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2824#discussion_r226508384
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1371,16 +1371,15 @@
   public static final String CARBON_SECURE_DICTIONARY_SERVER_DEFAULT = 
"true";
 
   /**
-   * whether to use multi directories when loading data,
-   * the main purpose is to avoid single-disk-hot-spot
+   * whether to use yarn's local dir the main purpose is to avoid single 
disk hot spot
*/
   @CarbonProperty
-  public static final String CARBON_USE_MULTI_TEMP_DIR = 
"carbon.use.multiple.temp.dir";
+  public static final String CARBON_USE_YARN_LOCAL_DIR = 
"carbon.use.local.dir";
--- End diff --

@jackylk This parameter is not new. If we change the parameter name, then in 
an upgrade scenario users have to change their configured parameters as well.
So I think we should keep the old parameter name here...
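
One hedged sketch of how the old key could keep working during an upgrade, 
assuming the usual CarbonProperties lookup methods; the fallback logic and the 
default value below are illustrative, not the actual carbondata code:

```java
import org.apache.carbondata.core.util.CarbonProperties;

public class LegacyPropertyFallbackExample {

  // Prefer the old key if a user upgrading from an earlier version has
  // configured it, otherwise fall back to the new key and a default.
  static String getUseLocalDir() {
    CarbonProperties props = CarbonProperties.getInstance();
    String legacy = props.getProperty("carbon.use.multiple.temp.dir");
    if (legacy != null) {
      return legacy;
    }
    return props.getProperty("carbon.use.local.dir", "false"); // default assumed here
  }

  public static void main(String[] args) {
    System.out.println(getUseLocalDir());
  }
}
```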


---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1071/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9138/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/873/



---


[GitHub] carbondata pull request #2820: [CARBONDATA-3013] Added support for pruning p...

2018-10-18 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2820#discussion_r226395410
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelRangeGrtThanFiterExecuterImpl.java
 ---
@@ -148,6 +148,61 @@ private void ifDefaultValueMatchesFilter() {
 return bitSet;
   }
 
+  @Override
+  public BitSet prunePages(RawBlockletColumnChunks rawBlockletColumnChunks)
+  throws FilterUnsupportedException, IOException {
--- End diff --

For all the RowLevelRangeFilters, can we move some of this code to the 
super class to remove code duplication?


---


[GitHub] carbondata pull request #2820: [CARBONDATA-3013] Added support for pruning p...

2018-10-18 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2820#discussion_r226394853
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/scanner/impl/BlockletFilterScanner.java
 ---
@@ -316,4 +320,167 @@ private BlockletScannedResult 
executeFilter(RawBlockletColumnChunks rawBlockletC
 readTime.getCount() + dimensionReadTime);
 return scannedResult;
   }
+
+  /**
+   * This method will process the data in below order
+   * 1. first apply min max on the filter tree and check whether any of 
the filter
+   * is fall on the range of min max, if not then return empty result
+   * 2. If filter falls on min max range then apply filter on actual
+   * data and get the pruned pages.
+   * 3. if pruned pages are not empty then read only those blocks(measure 
or dimension)
+   * which was present in the query but not present in the filter, as 
while applying filter
+   * some of the blocks where already read and present in chunk holder so 
not need to
+   * read those blocks again, this is to avoid reading of same blocks 
which was already read
+   * 4. Set the blocks and filter pages to scanned result
+   *
+   * @param rawBlockletColumnChunks blocklet raw chunk of all columns
+   * @throws FilterUnsupportedException
+   */
+  private BlockletScannedResult executeFilterForPages(
+  RawBlockletColumnChunks rawBlockletColumnChunks)
+  throws FilterUnsupportedException, IOException {
+long startTime = System.currentTimeMillis();
+QueryStatistic totalBlockletStatistic = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+.get(QueryStatisticsConstants.TOTAL_BLOCKLET_NUM);
+
totalBlockletStatistic.addCountStatistic(QueryStatisticsConstants.TOTAL_BLOCKLET_NUM,
+totalBlockletStatistic.getCount() + 1);
+// apply filter on actual data, for each page
+BitSet pages = this.filterExecuter.prunePages(rawBlockletColumnChunks);
+// if filter result is empty then return with empty result
+if (pages.isEmpty()) {
+  
CarbonUtil.freeMemory(rawBlockletColumnChunks.getDimensionRawColumnChunks(),
+  rawBlockletColumnChunks.getMeasureRawColumnChunks());
+
+  QueryStatistic scanTime = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+  .get(QueryStatisticsConstants.SCAN_BLOCKlET_TIME);
+  
scanTime.addCountStatistic(QueryStatisticsConstants.SCAN_BLOCKlET_TIME,
+  scanTime.getCount() + (System.currentTimeMillis() - startTime));
+
+  QueryStatistic scannedPages = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+  .get(QueryStatisticsConstants.PAGE_SCANNED);
+  scannedPages.addCountStatistic(QueryStatisticsConstants.PAGE_SCANNED,
+  scannedPages.getCount());
+  return createEmptyResult();
+}
+
+BlockletScannedResult scannedResult =
+new FilterQueryScannedResult(blockExecutionInfo, 
queryStatisticsModel);
+
+// valid scanned blocklet
+QueryStatistic validScannedBlockletStatistic = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+.get(QueryStatisticsConstants.VALID_SCAN_BLOCKLET_NUM);
+validScannedBlockletStatistic
+
.addCountStatistic(QueryStatisticsConstants.VALID_SCAN_BLOCKLET_NUM,
+validScannedBlockletStatistic.getCount() + 1);
+// adding statistics for valid number of pages
+QueryStatistic validPages = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+.get(QueryStatisticsConstants.VALID_PAGE_SCANNED);
+
validPages.addCountStatistic(QueryStatisticsConstants.VALID_PAGE_SCANNED,
+validPages.getCount() + pages.cardinality());
+QueryStatistic scannedPages = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+.get(QueryStatisticsConstants.PAGE_SCANNED);
+scannedPages.addCountStatistic(QueryStatisticsConstants.PAGE_SCANNED,
+scannedPages.getCount() + pages.cardinality());
+// get the row indexes from bit set for each page
+int[] pageFilteredPages = new int[pages.cardinality()];
+int index = 0;
+for (int i = pages.nextSetBit(0); i >= 0; i = pages.nextSetBit(i + 1)) 
{
+  pageFilteredPages[index++] = i;
+}
+// count(*)  case there would not be any dimensions are measures 
selected.
+int[] numberOfRows = new int[pages.cardinality()];
+for (int i = 0; i < numberOfRows.length; i++) {
+  numberOfRows[i] = 
rawBlockletColumnChunks.getDataBlock().getPageRowCount(i);
+}
+long dimensionReadTime = System.currentTimeMillis();
+dimensionReadTime = System.currentTimeMillis() - dimensionReadTime;
+
--- End diff --

Please remove empty lines


---


[GitHub] carbondata pull request #2820: [CARBONDATA-3013] Added support for pruning p...

2018-10-18 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2820#discussion_r226392086
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
 ---
@@ -179,6 +167,75 @@ public BitSetGroup applyFilter(RawBlockletColumnChunks 
rawBlockletColumnChunks,
 return null;
   }
 
+  private boolean isScanRequired(DimensionRawColumnChunk 
dimensionRawColumnChunk, int i) {
+boolean scanRequired;
+// for no dictionary measure column comparison can be done
+// on the original data as like measure column
+if 
(DataTypeUtil.isPrimitiveColumn(dimColumnEvaluatorInfo.getDimension().getDataType())
+&& 
!dimColumnEvaluatorInfo.getDimension().hasEncoding(Encoding.DICTIONARY)) {
+  scanRequired = 
isScanRequired(dimensionRawColumnChunk.getMaxValues()[i],
+  dimensionRawColumnChunk.getMinValues()[i], 
dimColumnExecuterInfo.getFilterKeys(),
+  dimColumnEvaluatorInfo.getDimension().getDataType());
+} else {
+  scanRequired = 
isScanRequired(dimensionRawColumnChunk.getMaxValues()[i],
+dimensionRawColumnChunk.getMinValues()[i], 
dimColumnExecuterInfo.getFilterKeys(),
+dimensionRawColumnChunk.getMinMaxFlagArray()[i]);
+}
+return scanRequired;
+  }
+
+  @Override
+  public BitSet prunePages(RawBlockletColumnChunks rawBlockletColumnChunks)
+  throws FilterUnsupportedException, IOException {
+if (isDimensionPresentInCurrentBlock) {
+  int chunkIndex = 
segmentProperties.getDimensionOrdinalToChunkMapping()
+  .get(dimColumnEvaluatorInfo.getColumnIndex());
+  if (null == 
rawBlockletColumnChunks.getDimensionRawColumnChunks()[chunkIndex]) {
+rawBlockletColumnChunks.getDimensionRawColumnChunks()[chunkIndex] =
+rawBlockletColumnChunks.getDataBlock()
+
.readDimensionChunk(rawBlockletColumnChunks.getFileReader(), chunkIndex);
+  }
+  DimensionRawColumnChunk dimensionRawColumnChunk =
+  
rawBlockletColumnChunks.getDimensionRawColumnChunks()[chunkIndex];
+  filterValues = dimColumnExecuterInfo.getFilterKeys();
+  BitSet bitSet = new BitSet(dimensionRawColumnChunk.getPagesCount());
+  for (int i = 0; i < dimensionRawColumnChunk.getPagesCount(); i++) {
+if (dimensionRawColumnChunk.getMaxValues() != null) {
+  if (isScanRequired(dimensionRawColumnChunk, i)) {
+bitSet.set(i);
+  }
+} else {
+  bitSet.set(i);
+}
+  }
+  return bitSet;
+} else if (isMeasurePresentInCurrentBlock) {
+  int chunkIndex = segmentProperties.getMeasuresOrdinalToChunkMapping()
+  .get(msrColumnEvaluatorInfo.getColumnIndex());
+  if (null == 
rawBlockletColumnChunks.getMeasureRawColumnChunks()[chunkIndex]) {
+rawBlockletColumnChunks.getMeasureRawColumnChunks()[chunkIndex] =
+rawBlockletColumnChunks.getDataBlock()
+.readMeasureChunk(rawBlockletColumnChunks.getFileReader(), 
chunkIndex);
+  }
+  MeasureRawColumnChunk measureRawColumnChunk =
+  rawBlockletColumnChunks.getMeasureRawColumnChunks()[chunkIndex];
+  BitSet bitSet = new BitSet(measureRawColumnChunk.getPagesCount());
+  for (int i = 0; i < measureRawColumnChunk.getPagesCount(); i++) {
+if (measureRawColumnChunk.getMaxValues() != null) {
+  if (isScanRequired(measureRawColumnChunk.getMaxValues()[i],
+  measureRawColumnChunk.getMinValues()[i], 
msrColumnExecutorInfo.getFilterKeys(),
+  msrColumnEvaluatorInfo.getType())) {
+bitSet.set(i);
+  }
+} else {
+  bitSet.set(i);
+}
+  }
+  return bitSet;
+}
+return null;
--- End diff --

For a dimension/measure column which is not present in the current block, 
is returning null OK?



---


[GitHub] carbondata pull request #2820: [CARBONDATA-3013] Added support for pruning p...

2018-10-18 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2820#discussion_r226390252
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RangeValueFilterExecuterImpl.java
 ---
@@ -146,6 +146,44 @@ public BitSetGroup applyFilter(RawBlockletColumnChunks 
rawBlockletColumnChunks,
 return applyNoAndDirectFilter(rawBlockletColumnChunks, 
useBitsetPipeLine);
   }
 
+  @Override
+  public BitSet prunePages(RawBlockletColumnChunks blockChunkHolder)
+  throws FilterUnsupportedException, IOException {
+// In case of Alter Table Add and Delete Columns the 
isDimensionPresentInCurrentBlock can be
+// false, in that scenario the default values of the column should be 
shown.
+// select all rows if dimension does not exists in the current block
+if (!isDimensionPresentInCurrentBlock) {
+  int i = blockChunkHolder.getDataBlock().numberOfPages();
--- End diff --

change i to numberOfPages


---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1070/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9137/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/872/



---


[GitHub] carbondata pull request #2820: [CARBONDATA-3013] Added support for pruning p...

2018-10-18 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2820#discussion_r226344587
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
 ---
@@ -179,6 +167,75 @@ public BitSetGroup applyFilter(RawBlockletColumnChunks 
rawBlockletColumnChunks,
 return null;
   }
 
+  private boolean isScanRequired(DimensionRawColumnChunk 
dimensionRawColumnChunk, int i) {
--- End diff --

please change i to columnIndex


---


[GitHub] carbondata pull request #2820: [CARBONDATA-3013] Added support for pruning p...

2018-10-18 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2820#discussion_r226342877
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/filter/executer/ExcludeFilterExecuterImpl.java
 ---
@@ -143,6 +144,40 @@ public BitSetGroup applyFilter(RawBlockletColumnChunks 
rawBlockletColumnChunks,
 return null;
   }
 
+  @Override
+  public BitSet prunePages(RawBlockletColumnChunks rawBlockletColumnChunks)
+  throws FilterUnsupportedException, IOException {
+if (isDimensionPresentInCurrentBlock) {
+  int chunkIndex = 
segmentProperties.getDimensionOrdinalToChunkMapping()
+  .get(dimColEvaluatorInfo.getColumnIndex());
+  if (null == 
rawBlockletColumnChunks.getDimensionRawColumnChunks()[chunkIndex]) {
--- End diff --

For the exclude filter case there is no need to read the blocklet column data, 
as we return true every time.
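
A minimal sketch of that idea (illustrative, not the actual 
ExcludeFilterExecuterImpl code): build the page bitset from the page count 
alone and skip the chunk read:

```java
import java.util.BitSet;

public class ExcludePrunePagesSketch {

  // Since the exclude filter keeps every page at this pruning stage, prunePages
  // can set every page bit from the page count alone, without first reading the
  // raw dimension/measure column chunks.
  static BitSet pruneAllPages(int numberOfPages) {
    BitSet bitSet = new BitSet(numberOfPages);
    bitSet.set(0, numberOfPages);
    return bitSet;
  }

  public static void main(String[] args) {
    System.out.println(pruneAllPages(4)); // {0, 1, 2, 3}
  }
}
```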


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226223364
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/LazyColumnPage.java
 ---
@@ -91,6 +108,8 @@ public double getDouble(int rowId) {
 }
   }
 
+
+
--- End diff --

Remove the extra lines


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226188701
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/SafeFixedLengthDimensionDataChunkStore.java
 ---
@@ -30,9 +35,52 @@
*/
   private int columnValueSize;
 
-  public SafeFixedLengthDimensionDataChunkStore(boolean isInvertedIndex, 
int columnValueSize) {
+  private int numOfRows;
+
+  public SafeFixedLengthDimensionDataChunkStore(boolean isInvertedIndex, 
int columnValueSize,
+  int numOfRows) {
 super(isInvertedIndex);
 this.columnValueSize = columnValueSize;
+this.numOfRows = numOfRows;
+  }
+
+  @Override
+  public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, 
byte[] data,
+  ColumnVectorInfo vectorInfo) {
+CarbonColumnVector vector = vectorInfo.vector;
+fillVector(data, vectorInfo, vector);
+  }
+
+  private void fillVector(byte[] data, ColumnVectorInfo vectorInfo, 
CarbonColumnVector vector) {
+DataType dataType = vectorInfo.vector.getBlockDataType();
+if (dataType == DataTypes.DATE) {
+  for (int i = 0; i < numOfRows; i++) {
+int surrogateInternal =
+CarbonUtil.getSurrogateInternal(data, i * columnValueSize, 
columnValueSize);
+if (surrogateInternal == 1) {
+  vector.putNull(i);
+} else {
+  vector.putInt(i, surrogateInternal - 
DateDirectDictionaryGenerator.cutOffDate);
+}
+  }
+} else if (dataType == DataTypes.TIMESTAMP) {
+  for (int i = 0; i < numOfRows; i++) {
+int surrogateInternal =
+CarbonUtil.getSurrogateInternal(data, i * columnValueSize, 
columnValueSize);
+if (surrogateInternal == 1) {
--- End diff --

Replace 1 with `CarbonCommonConstants.MEMBER_DEFAULT_VAL_SURROGATE_KEY`


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226223232
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/LazyColumnPage.java
 ---
@@ -42,10 +43,26 @@ private LazyColumnPage(ColumnPage columnPage, 
ColumnPageValueConverter converter
 this.converter = converter;
   }
 
+  private LazyColumnPage(ColumnPage columnPage, ColumnPageValueConverter 
converter,
+  ColumnVectorInfo vectorInfo) {
+super(columnPage.getColumnPageEncoderMeta(), columnPage.getPageSize());
+this.columnPage = columnPage;
+this.converter = converter;
+if (columnPage instanceof DecimalColumnPage) {
+  vectorInfo.decimalConverter = ((DecimalColumnPage) 
columnPage).getDecimalConverter();
+}
+converter.decodeAndFillVector(columnPage, vectorInfo);
+  }
+
   public static ColumnPage newPage(ColumnPage columnPage, 
ColumnPageValueConverter codec) {
 return new LazyColumnPage(columnPage, codec);
   }
 
+  public static ColumnPage newPage(ColumnPage columnPage, 
ColumnPageValueConverter codec,
+  ColumnVectorInfo vectorInfo) {
+return new LazyColumnPage(columnPage, codec, vectorInfo);
+  }
--- End diff --

I am not sure of the significance of having a static factory method that just 
creates an object. As we are not doing anything extra in this method, we can 
make the constructor itself public and remove the static method.


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226188657
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/SafeFixedLengthDimensionDataChunkStore.java
 ---
@@ -30,9 +35,52 @@
*/
   private int columnValueSize;
 
-  public SafeFixedLengthDimensionDataChunkStore(boolean isInvertedIndex, 
int columnValueSize) {
+  private int numOfRows;
+
+  public SafeFixedLengthDimensionDataChunkStore(boolean isInvertedIndex, 
int columnValueSize,
+  int numOfRows) {
 super(isInvertedIndex);
 this.columnValueSize = columnValueSize;
+this.numOfRows = numOfRows;
+  }
+
+  @Override
+  public void fillVector(int[] invertedIndex, int[] invertedIndexReverse, 
byte[] data,
+  ColumnVectorInfo vectorInfo) {
+CarbonColumnVector vector = vectorInfo.vector;
+fillVector(data, vectorInfo, vector);
+  }
+
+  private void fillVector(byte[] data, ColumnVectorInfo vectorInfo, 
CarbonColumnVector vector) {
+DataType dataType = vectorInfo.vector.getBlockDataType();
+if (dataType == DataTypes.DATE) {
+  for (int i = 0; i < numOfRows; i++) {
+int surrogateInternal =
+CarbonUtil.getSurrogateInternal(data, i * columnValueSize, 
columnValueSize);
+if (surrogateInternal == 1) {
--- End diff --

Replace 1 with `CarbonCommonConstants.MEMBER_DEFAULT_VAL_SURROGATE_KEY`


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226299668
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveDeltaIntegralCodec.java
 ---
@@ -272,5 +293,164 @@ public double decodeDouble(double value) {
   // this codec is for integer type only
   throw new RuntimeException("internal error");
 }
+
+@Override
+public void decodeAndFillVector(ColumnPage columnPage, 
ColumnVectorInfo vectorInfo) {
+  CarbonColumnVector vector = vectorInfo.vector;
+  BitSet nullBits = columnPage.getNullBits();
+  DataType dataType = vector.getType();
+  DataType type = columnPage.getDataType();
+  int pageSize = columnPage.getPageSize();
+  BitSet deletedRows = vectorInfo.deletedRows;
+  fillVector(columnPage, vector, dataType, type, pageSize, vectorInfo);
+  if (deletedRows == null || deletedRows.isEmpty()) {
+for (int i = nullBits.nextSetBit(0); i >= 0; i = 
nullBits.nextSetBit(i + 1)) {
+  vector.putNull(i);
+}
+  }
+}
+
+private void fillVector(ColumnPage columnPage, CarbonColumnVector 
vector, DataType dataType,
+DataType type, int pageSize, ColumnVectorInfo vectorInfo) {
+  if (type == DataTypes.BOOLEAN || type == DataTypes.BYTE) {
+byte[] byteData = columnPage.getByteData();
+if (dataType == DataTypes.SHORT) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putShort(i, (short) (max - byteData[i]));
+  }
+} else if (dataType == DataTypes.INT) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putInt(i, (int) (max - byteData[i]));
+  }
+} else if (dataType == DataTypes.LONG) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putLong(i, (max - byteData[i]));
+  }
+} else if (dataType == DataTypes.TIMESTAMP) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putLong(i, (max - byteData[i]) * 1000);
+  }
+} else if (dataType == DataTypes.BOOLEAN) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putByte(i, (byte) (max - byteData[i]));
+  }
+} else if (DataTypes.isDecimal(dataType)) {
+  DecimalConverterFactory.DecimalConverter decimalConverter = 
vectorInfo.decimalConverter;
+  int precision = vectorInfo.measure.getMeasure().getPrecision();
+  for (int i = 0; i < pageSize; i++) {
+BigDecimal decimal = decimalConverter.getDecimal(max - 
byteData[i]);
+vector.putDecimal(i, decimal, precision);
+  }
+} else {
+  for (int i = 0; i < pageSize; i++) {
+vector.putDouble(i, (max - byteData[i]));
+  }
+}
+  } else if (type == DataTypes.SHORT) {
+short[] shortData = columnPage.getShortData();
+if (dataType == DataTypes.SHORT) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putShort(i, (short) (max - shortData[i]));
+  }
+} else if (dataType == DataTypes.INT) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putInt(i, (int) (max - shortData[i]));
+  }
+} else if (dataType == DataTypes.LONG) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putLong(i, (max - shortData[i]));
+  }
+}  else if (dataType == DataTypes.TIMESTAMP) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putLong(i, (max - shortData[i]) * 1000);
+  }
+} else if (DataTypes.isDecimal(dataType)) {
+  DecimalConverterFactory.DecimalConverter decimalConverter = 
vectorInfo.decimalConverter;
+  int precision = vectorInfo.measure.getMeasure().getPrecision();
+  for (int i = 0; i < pageSize; i++) {
+BigDecimal decimal = decimalConverter.getDecimal(max - 
shortData[i]);
+vector.putDecimal(i, decimal, precision);
+  }
+} else {
+  for (int i = 0; i < pageSize; i++) {
+vector.putDouble(i, (max - shortData[i]));
+  }
+}
+
+  } else if (type == DataTypes.SHORT_INT) {
+int[] shortIntData = columnPage.getShortIntData();
+if (dataType == DataTypes.INT) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putInt(i, (int) (max - shortIntData[i]));
+  }
+} else if (dataType == DataTypes.LONG) {
+  for (int i = 0; i < pageSize; i++) {
+vector.putLong(i, (ma

[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226317816
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/metadata/datatype/DecimalConverterFactory.java
 ---
@@ -173,6 +221,23 @@ public int getSizeInBytes() {
   return new BigDecimal(bigInteger, scale);
 }
 
+@Override public void fillVector(Object valuesToBeConverted, int size, 
ColumnVectorInfo info,
+BitSet nullBitset) {
+  CarbonColumnVector vector = info.vector;
+  int precision = info.measure.getMeasure().getPrecision();
+  if (valuesToBeConverted instanceof byte[][]) {
+byte[][] data = (byte[][]) valuesToBeConverted;
+for (int i = 0; i < size; i++) {
+  if (nullBitset.get(i)) {
+vector.putNull(i);
+  } else {
+BigInteger bigInteger = new BigInteger(data[i]);
+vector.putDecimal(i, new BigDecimal(bigInteger, scale), 
precision);
--- End diff --

Can we use `vector.putDecimal(i, DataTypeUtil.byteToBigDecimal(data[i]), 
precision)` here? This is the same as what is used in the fillVector method 
below. If it can be used, then we can refactor the code and keep only one 
method.


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226321251
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/collector/impl/DirectPageWiseVectorFillResultCollector.java
 ---
@@ -0,0 +1,181 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.collector.impl;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.BitSet;
+import java.util.List;
+
+import 
org.apache.carbondata.core.keygenerator.directdictionary.DirectDictionaryKeyGeneratorFactory;
+import org.apache.carbondata.core.metadata.encoder.Encoding;
+import org.apache.carbondata.core.mutate.DeleteDeltaVo;
+import org.apache.carbondata.core.scan.executor.infos.BlockExecutionInfo;
+import org.apache.carbondata.core.scan.model.ProjectionDimension;
+import org.apache.carbondata.core.scan.model.ProjectionMeasure;
+import org.apache.carbondata.core.scan.result.BlockletScannedResult;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnarBatch;
+import org.apache.carbondata.core.scan.result.vector.ColumnVectorInfo;
+import 
org.apache.carbondata.core.scan.result.vector.MeasureDataVectorProcessor;
+
+/**
+ * It delegates the vector to fill the data directly from decoded pages.
+ */
+public class DirectPageWiseVectorFillResultCollector extends 
AbstractScannedResultCollector {
+
+  protected ProjectionDimension[] queryDimensions;
+
+  protected ProjectionMeasure[] queryMeasures;
+
+  private ColumnVectorInfo[] dictionaryInfo;
+
+  private ColumnVectorInfo[] noDictionaryInfo;
+
+  private ColumnVectorInfo[] complexInfo;
+
+  private ColumnVectorInfo[] measureColumnInfo;
+
+  ColumnVectorInfo[] allColumnInfo;
+
+  public DirectPageWiseVectorFillResultCollector(BlockExecutionInfo 
blockExecutionInfos) {
+super(blockExecutionInfos);
+// initialize only if the current block is not a restructured block 
else the initialization
+// will be taken care by RestructureBasedVectorResultCollector
+if (!blockExecutionInfos.isRestructuredBlock()) {
--- End diff --

`RestructureBasedVectorResultCollector` extends 
`DictionaryBasedVectorResultCollector`, so this check will not work here. 
Please check the restructured block scenario for 
`DirectPageWiseVectorFillResultCollector`. For the restructure case I think the 
flow will not come here, but you can decide whether to use the directVectorFill 
flow for the restructure case as well.


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread manishgupta88
Github user manishgupta88 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226329882
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/collector/impl/DirectPageWiseVectorFillResultCollector.java
 ---
@@ -0,0 +1,181 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.core.scan.collector.impl;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.BitSet;
+import java.util.List;
+
+import 
org.apache.carbondata.core.keygenerator.directdictionary.DirectDictionaryKeyGeneratorFactory;
+import org.apache.carbondata.core.metadata.encoder.Encoding;
+import org.apache.carbondata.core.mutate.DeleteDeltaVo;
+import org.apache.carbondata.core.scan.executor.infos.BlockExecutionInfo;
+import org.apache.carbondata.core.scan.model.ProjectionDimension;
+import org.apache.carbondata.core.scan.model.ProjectionMeasure;
+import org.apache.carbondata.core.scan.result.BlockletScannedResult;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnarBatch;
+import org.apache.carbondata.core.scan.result.vector.ColumnVectorInfo;
+import 
org.apache.carbondata.core.scan.result.vector.MeasureDataVectorProcessor;
+
+/**
+ * It delegates the vector to fill the data directly from decoded pages.
+ */
+public class DirectPageWiseVectorFillResultCollector extends 
AbstractScannedResultCollector {
--- End diff --

If feasible, can this new class be avoided by introducing one more method in 
`DictionaryBasedVectorResultCollector` and calling the direct conversion and 
vector filling from there? This would also help in handling other cases like 
restructure scenarios, which otherwise cannot be achieved directly with the 
new class.


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226334271
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/AbstractNonDictionaryVectorFiller.java
 ---
@@ -0,0 +1,274 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.chunk.store.impl.safe;
+
+import java.nio.ByteBuffer;
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.util.ByteUtil;
+import org.apache.carbondata.core.util.DataTypeUtil;
+
+public abstract class AbstractNonDictionaryVectorFiller {
+
+  protected int lengthSize;
+  protected int numberOfRows;
+
+  public AbstractNonDictionaryVectorFiller(int lengthSize, int 
numberOfRows) {
+this.lengthSize = lengthSize;
+this.numberOfRows = numberOfRows;
+  }
+
+  public abstract void fillVector(byte[] data, CarbonColumnVector vector, 
ByteBuffer buffer);
+
+  public int getLengthFromBuffer(ByteBuffer buffer) {
+return buffer.getShort();
+  }
+}
+
+class NonDictionaryVectorFillerFactory {
+
+  public static AbstractNonDictionaryVectorFiller getVectorFiller(DataType 
type, int lengthSize,
+  int numberOfRows) {
+if (type == DataTypes.STRING || type == DataTypes.VARCHAR) {
+  if (lengthSize == 2) {
+return new StringVectorFiller(lengthSize, numberOfRows);
+  } else {
+return new LongStringVectorFiller(lengthSize, numberOfRows);
+  }
+} else if (type == DataTypes.TIMESTAMP) {
+  return new TimeStampVectorFiller(lengthSize, numberOfRows);
+} else if (type == DataTypes.BOOLEAN) {
+  return new BooleanVectorFiller(lengthSize, numberOfRows);
+} else if (type == DataTypes.SHORT) {
+  return new ShortVectorFiller(lengthSize, numberOfRows);
+} else if (type == DataTypes.INT) {
+  return new IntVectorFiller(lengthSize, numberOfRows);
+} else if (type == DataTypes.LONG) {
+  return new LongStringVectorFiller(lengthSize, numberOfRows);
+}
+return new StringVectorFiller(lengthSize, numberOfRows);
+  }
+
+}
+
+class StringVectorFiller extends AbstractNonDictionaryVectorFiller {
+
+  public StringVectorFiller(int lengthSize, int numberOfRows) {
+super(lengthSize, numberOfRows);
+  }
+
+  @Override
+  public void fillVector(byte[] data, CarbonColumnVector vector, 
ByteBuffer buffer) {
+// start position will be used to store the current data position
+int startOffset = 0;
+int currentOffset = lengthSize;
+ByteUtil.UnsafeComparer comparer = ByteUtil.UnsafeComparer.INSTANCE;
+for (int i = 0; i < numberOfRows - 1; i++) {
+  buffer.position(startOffset);
--- End diff --

```suggestion
(((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF));
```
Based on the above comment, we can update this logic in all the inner classes.


---


[GitHub] carbondata pull request #2819: [CARBONDATA-3012] Added support for full scan...

2018-10-18 Thread kumarvishal09
Github user kumarvishal09 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2819#discussion_r226333633
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/AbstractNonDictionaryVectorFiller.java
 ---
@@ -0,0 +1,274 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.chunk.store.impl.safe;
+
+import java.nio.ByteBuffer;
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.util.ByteUtil;
+import org.apache.carbondata.core.util.DataTypeUtil;
+
+public abstract class AbstractNonDictionaryVectorFiller {
+
+  protected int lengthSize;
+  protected int numberOfRows;
+
+  public AbstractNonDictionaryVectorFiller(int lengthSize, int 
numberOfRows) {
+this.lengthSize = lengthSize;
+this.numberOfRows = numberOfRows;
+  }
+
+  public abstract void fillVector(byte[] data, CarbonColumnVector vector, 
ByteBuffer buffer);
--- End diff --

Instead of a ByteBuffer we can directly pass a byte[] and get the length from 
the byte array based on the length data type: 4 bytes for varchar and 2 bytes 
for the others.
```suggestion
  public abstract void fillVector(byte[] data, CarbonColumnVector vector, 
byte[] buffer);
```
(((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF));


---


[GitHub] carbondata pull request #2820: [CARBONDATA-3013] Added support for pruning p...

2018-10-18 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2820#discussion_r226327006
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/MeasureRawColumnChunk.java
 ---
@@ -94,7 +95,7 @@ public ColumnPage decodeColumnPage(int pageNumber) {
   public ColumnPage convertToColumnPageWithOutCache(int index) {
 assert index < pagesCount;
 // in case of filter query filter columns blocklet pages will 
uncompressed
-// so no need to decode again
+// so no need to decodeAndFillVector again
--- End diff --

It seems there is no need to modify this comment.


---


[GitHub] carbondata pull request #2820: [CARBONDATA-3013] Added support for pruning p...

2018-10-18 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2820#discussion_r226326803
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/DimensionRawColumnChunk.java
 ---
@@ -121,6 +122,22 @@ public DimensionColumnPage 
convertToDimColDataChunkWithOutCache(int index) {
 }
   }
 
+  /**
+   * Convert raw data with specified page number processed to 
DimensionColumnDataChunk and fill
+   * the vector
+   *
+   * @param pageNumber page number to decode and fill the vector
+   * @param vectorInfo vector to be filled with column page
+   */
+  public void convertToDimColDataChunkAndFillVector(int pageNumber, 
ColumnVectorInfo vectorInfo) {
+assert pageNumber < pagesCount;
+try {
+  chunkReader.decodeColumnPageAndFillVector(this, pageNumber, 
vectorInfo);
+} catch (Exception e) {
+  throw new RuntimeException(e);
--- End diff --

Now the underlying exception is not thrown directly; it gets wrapped in a 
RuntimeException.


---


[GitHub] carbondata pull request #2824: [CARBONDATA-3008] Optimize default value for ...

2018-10-18 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2824#discussion_r226325939
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1371,16 +1371,15 @@
   public static final String CARBON_SECURE_DICTIONARY_SERVER_DEFAULT = 
"true";
 
   /**
-   * whether to use multi directories when loading data,
-   * the main purpose is to avoid single-disk-hot-spot
+   * whether to use yarn's local dir the main purpose is to avoid single 
disk hot spot
*/
   @CarbonProperty
-  public static final String CARBON_USE_MULTI_TEMP_DIR = 
"carbon.use.multiple.temp.dir";
+  public static final String CARBON_USE_YARN_LOCAL_DIR = 
"carbon.use.local.dir";
--- End diff --

```suggestion
  public static final String CARBON_USE_LOCAL_DIR = "carbon.use.local.dir";
```


---


[GitHub] carbondata pull request #2824: [CARBONDATA-3008] Optimize default value for ...

2018-10-18 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2824#discussion_r226325529
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1371,16 +1371,15 @@
   public static final String CARBON_SECURE_DICTIONARY_SERVER_DEFAULT = 
"true";
 
   /**
-   * whether to use multi directories when loading data,
-   * the main purpose is to avoid single-disk-hot-spot
+   * whether to use yarn's local dir the main purpose is to avoid single 
disk hot spot
*/
   @CarbonProperty
-  public static final String CARBON_USE_MULTI_TEMP_DIR = 
"carbon.use.multiple.temp.dir";
+  public static final String CARBON_USE_YARN_LOCAL_DIR = 
"carbon.use.local.dir";
--- End diff --

```suggestion
  public static final String CARBON_USE_YARN_LOCAL_DIR = 
"carbon.use.yarn.local.dir";
```


---


[GitHub] carbondata pull request #2818: [CARBONDATA-3011] Add carbon property to conf...

2018-10-18 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2818#discussion_r226324001
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/CarbonLateDecodeStrategy.scala
 ---
@@ -337,19 +340,35 @@ private[sql] class CarbonLateDecodeStrategy extends 
SparkStrategy {
 metadata,
 needDecoder,
 updateRequestedColumns.asInstanceOf[Seq[Attribute]])
-  filterCondition.map(execution.FilterExec(_, scan)).getOrElse(scan)
+  // Check whether spark should handle row filters in case of vector 
flow.
+  if (!vectorPushRowFilters && scan.isInstanceOf[CarbonDataSourceScan]
+  && !hasDictionaryFilterCols) {
+// Here carbon only do page pruning and row level pruning will be 
done by spark.
+scan.inputRDDs().head match {
+  case rdd: CarbonScanRDD[InternalRow] =>
+rdd.setDirectScanSupport(true)
+  case _ =>
+}
+
filterPredicates.reduceLeftOption(expressions.And).map(execution.FilterExec(_, 
scan))
+  .getOrElse(scan)
+  } else {
+filterCondition.map(execution.FilterExec(_, scan)).getOrElse(scan)
+  }
 } else {
 
   var newProjectList: Seq[Attribute] = Seq.empty
+  var implictsExisted = false
--- End diff --

Why is this required? Can you add a comment explaining it?


---


[GitHub] carbondata pull request #2818: [CARBONDATA-3011] Add carbon property to conf...

2018-10-18 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2818#discussion_r226323717
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonScanRDD.scala
 ---
@@ -228,9 +230,12 @@ class CarbonScanRDD[T: ClassTag](
   statistic.addStatistics(QueryStatisticsConstants.BLOCK_ALLOCATION, 
System.currentTimeMillis)
   statisticRecorder.recordStatisticsForDriver(statistic, queryId)
   statistic = new QueryStatistic()
-  val carbonDistribution = CarbonProperties.getInstance().getProperty(
+  var carbonDistribution = CarbonProperties.getInstance().getProperty(
 CarbonCommonConstants.CARBON_TASK_DISTRIBUTION,
 CarbonCommonConstants.CARBON_TASK_DISTRIBUTION_DEFAULT)
+  if (directScan) {
--- End diff --

I think there are too many code paths. If directScan always saves memory and 
copies less, I suggest we always use it; then there is no need to add one more 
configuration.


---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9136/



---


[GitHub] carbondata issue #2832: [CARBONDATA-3021][Streaming] Fix unsupported data ty...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2832
  
Build Success with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9135/



---


[GitHub] carbondata issue #2832: [CARBONDATA-3021][Streaming] Fix unsupported data ty...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2832
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1068/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1069/



---


[GitHub] carbondata pull request #2822: [CARBONDATA-3014] Added support for inverted ...

2018-10-18 Thread kunal642
Github user kunal642 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2822#discussion_r226301860
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/scanner/impl/BlockletFilterScanner.java
 ---
@@ -316,4 +320,167 @@ private BlockletScannedResult 
executeFilter(RawBlockletColumnChunks rawBlockletC
 readTime.getCount() + dimensionReadTime);
 return scannedResult;
   }
+
+  /**
+   * This method will process the data in below order
+   * 1. first apply min max on the filter tree and check whether any of 
the filter
+   * is fall on the range of min max, if not then return empty result
+   * 2. If filter falls on min max range then apply filter on actual
+   * data and get the pruned pages.
+   * 3. if pruned pages are not empty then read only those blocks(measure 
or dimension)
+   * which was present in the query but not present in the filter, as 
while applying filter
+   * some of the blocks where already read and present in chunk holder so 
not need to
+   * read those blocks again, this is to avoid reading of same blocks 
which was already read
+   * 4. Set the blocks and filter pages to scanned result
+   *
+   * @param rawBlockletColumnChunks blocklet raw chunk of all columns
+   * @throws FilterUnsupportedException
+   */
+  private BlockletScannedResult executeFilterForPages(
+  RawBlockletColumnChunks rawBlockletColumnChunks)
+  throws FilterUnsupportedException, IOException {
+long startTime = System.currentTimeMillis();
+QueryStatistic totalBlockletStatistic = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+.get(QueryStatisticsConstants.TOTAL_BLOCKLET_NUM);
+
totalBlockletStatistic.addCountStatistic(QueryStatisticsConstants.TOTAL_BLOCKLET_NUM,
+totalBlockletStatistic.getCount() + 1);
+// apply filter on actual data, for each page
+BitSet pages = this.filterExecuter.prunePages(rawBlockletColumnChunks);
+// if filter result is empty then return with empty result
+if (pages.isEmpty()) {
+  
CarbonUtil.freeMemory(rawBlockletColumnChunks.getDimensionRawColumnChunks(),
+  rawBlockletColumnChunks.getMeasureRawColumnChunks());
+
+  QueryStatistic scanTime = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+  .get(QueryStatisticsConstants.SCAN_BLOCKlET_TIME);
+  
scanTime.addCountStatistic(QueryStatisticsConstants.SCAN_BLOCKlET_TIME,
+  scanTime.getCount() + (System.currentTimeMillis() - startTime));
+
+  QueryStatistic scannedPages = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+  .get(QueryStatisticsConstants.PAGE_SCANNED);
+  scannedPages.addCountStatistic(QueryStatisticsConstants.PAGE_SCANNED,
+  scannedPages.getCount());
+  return createEmptyResult();
+}
+
+BlockletScannedResult scannedResult =
+new FilterQueryScannedResult(blockExecutionInfo, 
queryStatisticsModel);
+
+// valid scanned blocklet
+QueryStatistic validScannedBlockletStatistic = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+.get(QueryStatisticsConstants.VALID_SCAN_BLOCKLET_NUM);
+validScannedBlockletStatistic
+
.addCountStatistic(QueryStatisticsConstants.VALID_SCAN_BLOCKLET_NUM,
+validScannedBlockletStatistic.getCount() + 1);
+// adding statistics for valid number of pages
+QueryStatistic validPages = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+.get(QueryStatisticsConstants.VALID_PAGE_SCANNED);
+
validPages.addCountStatistic(QueryStatisticsConstants.VALID_PAGE_SCANNED,
+validPages.getCount() + pages.cardinality());
+QueryStatistic scannedPages = 
queryStatisticsModel.getStatisticsTypeAndObjMap()
+.get(QueryStatisticsConstants.PAGE_SCANNED);
+scannedPages.addCountStatistic(QueryStatisticsConstants.PAGE_SCANNED,
+scannedPages.getCount() + pages.cardinality());
+// get the row indexes from bit set for each page
+int[] pageFilteredPages = new int[pages.cardinality()];
+int index = 0;
+for (int i = pages.nextSetBit(0); i >= 0; i = pages.nextSetBit(i + 1)) 
{
+  pageFilteredPages[index++] = i;
+}
+// count(*)  case there would not be any dimensions are measures 
selected.
+int[] numberOfRows = new int[pages.cardinality()];
+for (int i = 0; i < numberOfRows.length; i++) {
+  numberOfRows[i] = 
rawBlockletColumnChunks.getDataBlock().getPageRowCount(i);
--- End diff --

This will fill numberOfRows for the pages incorrectly. I think it should be 
for (int i = pages.nextSetBit(0); i >= 0; i = pages.nextSetBit(i + 1)) {
  pageFilteredPages[index] = i;
  numberOfRows[i


---
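
The suggestion above is to fill the page ids and the per-page row counts in the same loop over the surviving (set) pages, so that row counts are looked up with the real page id rather than the dense index 0..cardinality-1. Below is a minimal, self-contained Java sketch of one reading of that suggestion; the names and the pageRowCount function are illustrative stand-ins, not the actual CarbonData API.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.function.IntUnaryOperator;

// Sketch of filling page ids and per-page row counts in one pass over the
// pruned pages; the pageRowCount function stands in for
// rawBlockletColumnChunks.getDataBlock().getPageRowCount(i).
public class PrunedPagesSketch {

  static int[][] collect(BitSet pages, IntUnaryOperator pageRowCount) {
    int[] pageFilteredPages = new int[pages.cardinality()];
    int[] numberOfRows = new int[pages.cardinality()];
    int index = 0;
    for (int i = pages.nextSetBit(0); i >= 0; i = pages.nextSetBit(i + 1)) {
      pageFilteredPages[index] = i;                      // keep the real page id
      numberOfRows[index] = pageRowCount.applyAsInt(i);  // row count of that page
      index++;
    }
    return new int[][] { pageFilteredPages, numberOfRows };
  }

  public static void main(String[] args) {
    BitSet pages = new BitSet();
    pages.set(2);
    pages.set(5);
    // Pretend every page holds 100 rows except page 5, which holds 40.
    int[][] result = collect(pages, i -> i == 5 ? 40 : 100);
    System.out.println(Arrays.toString(result[0])); // [2, 5]
    System.out.println(Arrays.toString(result[1])); // [100, 40]
  }
}
```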
[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/871/



---


[GitHub] carbondata pull request #2831: [WIP] Remove unused declaration

2018-10-18 Thread jackylk
Github user jackylk closed the pull request at:

https://github.com/apache/carbondata/pull/2831


---


[GitHub] carbondata issue #2824: [CARBONDATA-3008] Optimize default value for multipl...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2824
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1064/



---


[GitHub] carbondata issue #2821: [CARBONDATA-3017] Map DDL Support

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2821
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9134/



---


[GitHub] carbondata issue #2821: [CARBONDATA-3017] Map DDL Support

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2821
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1066/



---


[jira] [Resolved] (CARBONDATA-3026) clear expired property that may cause GC problem

2018-10-18 Thread Jacky Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-3026.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> clear expired property that may cause GC problem
> 
>
> Key: CARBONDATA-3026
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3026
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> During data loading, we write some temp files (sort temp
> files and temp fact data files) in certain locations. In the current
> implementation, we add those locations to CarbonProperties and
> associate them with a special key that refers to the data load.
> After data loading, the temp locations are cleared, but the added
> property remains in CarbonProperties and is never cleared.
> This makes the CarbonProperties object grow bigger and bigger
> and leads to OOM problems if the thrift-server is a long-running
> service. A local test shows that after adding different properties
> 11 billion times, OOM occurs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
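
A minimal sketch of the cleanup pattern this issue calls for: a property added for a single data load is removed again once the load finishes, so entries do not accumulate in a long-running service. A plain ConcurrentHashMap stands in for CarbonProperties here, and the key pattern is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the cleanup pattern: the per-load property is removed in a finally
// block so entries never accumulate. A plain map stands in for CarbonProperties
// and the key pattern is hypothetical.
public class TempDirPropertySketch {

  static final ConcurrentHashMap<String, String> PROPERTIES = new ConcurrentHashMap<>();

  static void runDataLoad(String loadId, String tempDirs) {
    String key = "carbon.tempstore.location." + loadId;  // illustrative key pattern
    PROPERTIES.put(key, tempDirs);
    try {
      // ... write sort temp files / temp fact data files under PROPERTIES.get(key) ...
    } finally {
      PROPERTIES.remove(key);  // without this, one stale entry is left per load
    }
  }

  public static void main(String[] args) {
    runDataLoad("segment_0_task_1", "/tmp/dir1,/tmp/dir2");
    System.out.println("entries left after load: " + PROPERTIES.size()); // 0
  }
}
```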


[GitHub] carbondata issue #2832: [CARBONDATA-3021][Streaming] Fix unsupported data ty...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2832
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/869/



---


[GitHub] carbondata issue #2833: [CARBONDATA-3026] clear expired property that may ca...

2018-10-18 Thread jackylk
Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2833
  
LGTM


---


[GitHub] carbondata issue #2833: [CARBONDATA-3026] clear expired property that may ca...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2833
  
Build Success with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9130/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9129/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/870/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1061/



---


[GitHub] carbondata issue #2833: [CARBONDATA-3026] clear expired property that may ca...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2833
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1062/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1067/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9133/



---


[GitHub] carbondata issue #2821: [CARBONDATA-3017] Map DDL Support

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2821
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/868/



---


[GitHub] carbondata issue #2832: [CARBONDATA-3021][Streaming] Fix unsupported data ty...

2018-10-18 Thread zzcclp
Github user zzcclp commented on the issue:

https://github.com/apache/carbondata/pull/2832
  
retest this please


---


[GitHub] carbondata pull request #2734: [CARBONDATA-2946] Unify conversion while writ...

2018-10-18 Thread dhatchayani
Github user dhatchayani closed the pull request at:

https://github.com/apache/carbondata/pull/2734


---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/867/



---


[GitHub] carbondata issue #2824: [CARBONDATA-3008] Optimize default value for multipl...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2824
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/866/



---


[GitHub] carbondata issue #2833: [CARBONDATA-3026] clear expired property that may ca...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2833
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/865/



---


[GitHub] carbondata pull request #2715: [CARBONDATA-2930] Support customize column co...

2018-10-18 Thread ravipesala
Github user ravipesala commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2715#discussion_r226274512
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataWithCompression.scala
 ---
@@ -42,6 +44,112 @@ case class Rcd(booleanField: Boolean, shortField: 
Short, intField: Int, bigintFi
 dateField: String, charField: String, floatField: Float, 
stringDictField: String,
 stringSortField: String, stringLocalDictField: String, 
longStringField: String)
 
+/**
+ * This compressor actually will not compress or decompress anything.
+ * It is used for test case of specifying customized compressor.
+ */
+class CustomizeCompressor extends Compressor {
+  override def getName: String = 
"org.apache.carbondata.integration.spark.testsuite.dataload.CustomizeCompressor"
--- End diff --

We do not need to maintain the relationship between the short name and the class 
name. The user specifies the short name in the implementing class and keeps the 
jar in the classpath. CarbonData loads all classes that implement the Compressor 
interface and gets the short name from each class.
So the user should only give the short name in the table properties, and we store 
only the short name in the thrift as well. This is just like Spark data sources; 
please go through that once. It would be simpler for the user if we support short 
names.


---
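
A minimal sketch of the short-name convention being discussed: the compressor reports a short name through getName(), and implementations are resolved by that short name rather than by the fully qualified class name. The Compressor interface below is reduced to the one method relevant here and is not the real CarbonData interface; the registry and table-property key are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of resolving compressors by short name; the Compressor interface is
// reduced to the one method relevant here and is not the real CarbonData one.
public class CompressorShortNameSketch {

  interface Compressor {
    String getName(); // short name, e.g. "fake", not the fully qualified class name
  }

  static class FakeCompressor implements Compressor {
    @Override
    public String getName() {
      return "fake";
    }
  }

  // Registry keyed by the short name reported by each discovered implementation.
  static final Map<String, Compressor> REGISTRY = new HashMap<>();

  static void register(Compressor compressor) {
    REGISTRY.put(compressor.getName(), compressor);
  }

  public static void main(String[] args) {
    register(new FakeCompressor());
    // The table property would then carry only the short name, e.g.
    // 'carbon.column.compressor'='fake'.
    System.out.println(REGISTRY.containsKey("fake")); // true
  }
}
```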


[GitHub] carbondata issue #2812: [CARBONDATA-3004][32k] Fix bugs in writing dataframe...

2018-10-18 Thread ajantha-bhat
Github user ajantha-bhat commented on the issue:

https://github.com/apache/carbondata/pull/2812
  
LGTM


---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1065/



---


[GitHub] carbondata issue #2832: [CARBONDATA-3021][Streaming] Fix unsupported data ty...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2832
  
Build Success with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9123/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread ravipesala
Github user ravipesala commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
retest this please


---


[GitHub] carbondata issue #2824: [CARBONDATA-3008] Optimize default value for multipl...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2824
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/864/



---


[GitHub] carbondata issue #2821: [CARBONDATA-3017] Map DDL Support

2018-10-18 Thread manishnalla1994
Github user manishnalla1994 commented on the issue:

https://github.com/apache/carbondata/pull/2821
  
retest this please


---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread kunal642
Github user kunal642 commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
retest this please


---


[GitHub] carbondata issue #2821: [CARBONDATA-3017] Map DDL Support

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2821
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9127/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/863/



---


[GitHub] carbondata issue #2821: [CARBONDATA-3017] Map DDL Support

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2821
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1057/



---


[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2829
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1058/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1060/



---


[GitHub] carbondata issue #2832: [CARBONDATA-3021][Streaming] Fix unsupported data ty...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2832
  
Build Failed with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1056/



---


[GitHub] carbondata issue #2833: [CARBONDATA-3026] clear expired property that may ca...

2018-10-18 Thread xuchuanyin
Github user xuchuanyin commented on the issue:

https://github.com/apache/carbondata/pull/2833
  
retest this please


---


[GitHub] carbondata issue #2791: [WIP][HOTFIX]correct the exception handling in looku...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2791
  
Build Success with Spark 2.2.1, Please check CI 
http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1055/



---


[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2829
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9124/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/862/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed  with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/861/



---


[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2829
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/860/



---


[GitHub] carbondata issue #2814: [WIP][CARBONDATA-3001] configurable page size in MB

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2814
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9128/



---


[GitHub] carbondata issue #2822: [CARBONDATA-3014] Added support for inverted index a...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2822
  
Build Failed  with Spark 2.3.1, Please check CI 
http://136.243.101.176:8080/job/carbondataprbuilder2.3/9125/



---


[GitHub] carbondata issue #2821: [CARBONDATA-3017] Map DDL Support

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2821
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/859/



---


[GitHub] carbondata issue #2829: [CARBONDATA-3025]add more metadata in carbon file fo...

2018-10-18 Thread CarbonDataQA
Github user CarbonDataQA commented on the issue:

https://github.com/apache/carbondata/pull/2829
  
Build Success with Spark 2.1.0, Please check CI 
http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/858/



---

