[GitHub] carbondata issue #2974: [CARBONDATA-2563][CATALYST] Explain query with Order...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2974 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1717/ ---
[jira] [Created] (CARBONDATA-3165) Query of BloomFilter java.lang.NullPointerException
Chenjian Qiu created CARBONDATA-3165: Summary: Query of BloomFilter java.lang.NullPointerException Key: CARBONDATA-3165 URL: https://issues.apache.org/jira/browse/CARBONDATA-3165 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 1.5.1 Reporter: Chenjian Qiu

With carbon.enable.distributed.datamap set to true, after the cluster has been running for a long time, a query that uses the BloomFilter datamap fails with:

24274.0 (TID 664711) | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
java.lang.NullPointerException
	at java.util.ArrayList.<init>(ArrayList.java:177)
	at org.apache.carbondata.datamap.bloom.BloomCoarseGrainDataMap.prune(BloomCoarseGrainDataMap.java:230)
	at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:379)
	at org.apache.carbondata.core.datamap.DistributableDataMapFormat$1.initialize(DistributableDataMapFormat.java:108)
	at org.apache.carbondata.spark.rdd.DataMapPruneRDD.internalCompute(SparkDataMapJob.scala:77)
	at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:82)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:99)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
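The frame at `ArrayList.java:177` points at ArrayList's copy constructor, which calls `c.toArray()` on its argument, so a null collection reaching `BloomCoarseGrainDataMap.prune` would produce exactly this trace. A minimal sketch of that failure mode (the `copyHits` helper and the null source are hypothetical stand-ins, not the actual prune code):

```java
import java.util.ArrayList;
import java.util.List;

public class ArrayListNpeSketch {
    // Hypothetical helper mirroring the suspected pattern: copying a
    // result list that can be null. ArrayList's copy constructor calls
    // c.toArray(), which throws NullPointerException for a null source.
    static List<String> copyHits(List<String> hits) {
        return new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        try {
            copyHits(null);
            System.out.println("no exception");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from ArrayList copy constructor");
        }
    }
}
```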
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9975/ ---
[GitHub] carbondata issue #2974: [CARBONDATA-2563][CATALYST] Explain query with Order...
Github user kumarvishal09 commented on the issue: https://github.com/apache/carbondata/pull/2974 retest this please ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1928/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9976/ ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900720 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateDDLForComplexMapType.scala --- @@ -0,0 +1,452 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.testsuite.createTable.TestCreateDDLForComplexMapType
+
+import java.io.File
+import java.util
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.spark.sql.{AnalysisException, Row}
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk
+
+class TestCreateDDLForComplexMapType extends QueryTest with BeforeAndAfterAll {
+  private val conf: Configuration = new Configuration(false)
+
+  val rootPath = new File(this.getClass.getResource("/").getPath
+                          + "../../../..").getCanonicalPath
+
+  val path = s"$rootPath/examples/spark2/src/main/resources/maptest2.csv"
+
+  private def checkForLocalDictionary(dimensionRawColumnChunks: util
+  .List[DimensionRawColumnChunk]): Boolean = {
+    var isLocalDictionaryGenerated = false
+    import scala.collection.JavaConversions._
+    for (dimensionRawColumnChunk <- dimensionRawColumnChunks) {
+      if (dimensionRawColumnChunk.getDataChunkV3
+        .isSetLocal_dictionary) {
+        isLocalDictionaryGenerated = true
+      }
--- End diff -- done ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900751 --- Diff: examples/spark2/src/main/resources/maptest2.csv --- @@ -0,0 +1,2 @@ +1\002Nalla\0012\002Singh\0011\002Gupta\0014\002Kumar +10\002Nallaa\00120\002Sissngh\001100\002Gusspta\00140\002Kumar --- End diff -- done ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900767 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java --- @@ -338,12 +338,15 @@ public static CarbonLoadModel getLoadModel(Configuration conf) throws IOExceptio SKIP_EMPTY_LINE, carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SKIP_EMPTY_LINE))); -String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":"); +String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":" + "," + "003"); --- End diff -- done ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900728 --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java --- @@ -338,12 +338,15 @@ public static CarbonLoadModel getLoadModel(Configuration conf) throws IOException
         SKIP_EMPTY_LINE,
         carbonProperty.getProperty(CarbonLoadOptionConstants.CARBON_OPTIONS_SKIP_EMPTY_LINE)));
-    String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":");
+    String complexDelim = conf.get(COMPLEX_DELIMITERS, "$" + "," + ":" + "," + "003");
     String[] split = complexDelim.split(",");
     model.setComplexDelimiterLevel1(split[0]);
     if (split.length > 1) {
       model.setComplexDelimiterLevel2(split[1]);
     }
+    if (split.length > 2) {
+      model.setComplexDelimiterLevel3(split[2]);
+    }
--- End diff -- done ---
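The default in the diff, `"$" + "," + ":" + "," + "003"`, relies on splitting by comma to recover the per-level delimiters. A standalone sketch of that parsing (class and variable names are illustrative):

```java
public class ComplexDelimSketch {
    // Recover the per-level complex delimiters from the comma-joined default.
    static String[] parse(String complexDelim) {
        return complexDelim.split(",");
    }

    public static void main(String[] args) {
        // Same shape as the default value in the diff.
        String[] split = parse("$" + "," + ":" + "," + "003");
        System.out.println(split[0]); // "$"   -> level 1
        System.out.println(split[1]); // ":"   -> level 2
        System.out.println(split[2]); // "003" -> level 3
    }
}
```

Note that joining with `,` only works because none of the delimiters themselves may contain a comma.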
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900706 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala --- @@ -188,11 +188,13 @@ case class CarbonLoadDataCommand( val carbonLoadModel = new CarbonLoadModel() val tableProperties = table.getTableInfo.getFactTable.getTableProperties val optionsFinal = LoadOption.fillOptionWithDefaultValue(options.asJava) +// These two delimiters are non configurable and hardcoded for map type +optionsFinal.put("complex_delimiter_level_3", "\003") +optionsFinal.put("complex_delimiter_level_4", "\004") --- End diff -- done ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900568 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/RowParserImpl.java --- @@ -34,8 +37,12 @@
   private int numberOfColumns;

   public RowParserImpl(DataField[] output, CarbonDataLoadConfiguration configuration) {
-    String[] complexDelimiters =
+    String[] tempComplexDelimiters =
         (String[]) configuration.getDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS);
+    Queue<String> complexDelimiters = new LinkedList<>();
+    for (int i = 0; i < 4; i++) {
--- End diff -- done ---
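The change above turns the flat delimiter array into a FIFO queue so each nesting level of a complex type can consume the next delimiter. A minimal sketch of that conversion (delimiter values here are illustrative, not the actual load defaults):

```java
import java.util.LinkedList;
import java.util.Queue;

public class DelimiterQueueSketch {
    // Wrap the flat delimiter array in a queue; nested parsers then take
    // their delimiter level by level with poll().
    static Queue<String> toQueue(String[] tempComplexDelimiters) {
        Queue<String> complexDelimiters = new LinkedList<>();
        for (int i = 0; i < tempComplexDelimiters.length; i++) {
            complexDelimiters.add(tempComplexDelimiters[i]);
        }
        return complexDelimiters;
    }

    public static void main(String[] args) {
        Queue<String> q = toQueue(new String[] {"$", ":", "\003", "\004"});
        System.out.println(q.poll()); // "$" for the outermost level
        System.out.println(q.poll()); // ":" for the next level
    }
}
```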
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900612 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/parser/CarbonParserFactory.java --- @@ -51,23 +54,37 @@ public static GenericParser createParser(CarbonColumn carbonColumn, String[] com * delimiters * @return GenericParser */ - private static GenericParser createParser(CarbonColumn carbonColumn, String[] complexDelimiters, + private static GenericParser createParser(CarbonColumn carbonColumn, + Queue complexDelimiters, String nullFormat, int depth) { +if (depth > 2) { + return null; --- End diff -- done ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900628 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/LoadOption.java --- @@ -119,6 +119,10 @@ "complex_delimiter_level_2", Maps.getOrDefault(options, "complex_delimiter_level_2", ":")); +optionsFinal.put( +"complex_delimiter_level_3", +Maps.getOrDefault(options, "complex_delimiter_level_3", "003")); + --- End diff -- done ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900688 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/DataLoadProcessBuilder.java --- @@ -222,8 +222,8 @@ public static CarbonDataLoadConfiguration createConfiguration(CarbonLoadModel lo configuration.setSegmentId(loadModel.getSegmentId()); configuration.setTaskNo(loadModel.getTaskNo()); configuration.setDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS, -new String[] { loadModel.getComplexDelimiterLevel1(), -loadModel.getComplexDelimiterLevel2() }); +new String[] { loadModel.getComplexDelimiterLevel1(), loadModel.getComplexDelimiterLevel2(), +loadModel.getComplexDelimiterLevel3(), loadModel.getComplexDelimiterLevel4() }); --- End diff -- done ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900666 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModel.java --- @@ -631,7 +651,7 @@ public void setFactTimeStamp(long factTimeStamp) { } public String[] getDelimiters() { -return new String[] { complexDelimiterLevel1, complexDelimiterLevel2 }; +return new String[] { complexDelimiterLevel1, complexDelimiterLevel2, complexDelimiterLevel3 }; --- End diff -- done ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240900542 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModel.java --- @@ -65,8 +65,7 @@ private String csvHeader; private String[] csvHeaderColumns; private String csvDelimiter; - private String complexDelimiterLevel1; - private String complexDelimiterLevel2; + private ArrayList complexDelimiters = new ArrayList<>(); --- End diff -- done ---
[GitHub] carbondata pull request #2949: [CARBONDATA-3118] support parallel block prun...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2949#discussion_r240900313 --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java --- @@ -205,26 +195,53 @@ public BlockletDetailsFetcher getBlockletDetailsFetcher() { final FilterResolverIntf filterExp, final List partitions, List blocklets, final Map> dataMaps, int totalFiles) { +/* + * + * Below is the example of how this part of code works. + * consider a scenario of having 5 segments, 10 datamaps in each segment, --- End diff -- BlockDataMap and BlockletDataMap can store information for multiple files; each file is one row in that datamap. Non-default datamaps are not like that, so for default datamaps the multi-thread distribution is based on the number of entries in the datamaps, while for non-default datamaps it is based on the number of datamaps (one datamap is counted as one record). Also, 10 datamaps in a segment means one merge index file has the info of 10 index files ---
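The distribution rule described above can be sketched as a per-datamap work weight (the method and names are hypothetical, not the actual TableDataMap code):

```java
public class PruneWeightSketch {
    // Default (block/blocklet) datamaps hold one row per file, so they
    // contribute one unit of prune work per entry; any non-default
    // datamap is counted as a single unit regardless of its contents.
    static int pruneWeight(boolean isDefaultDataMap, int entryCount) {
        return isDefaultDataMap ? entryCount : 1;
    }

    public static void main(String[] args) {
        System.out.println(pruneWeight(true, 10));  // default datamap with 10 files
        System.out.println(pruneWeight(false, 10)); // non-default: counted once
    }
}
```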
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1716/ ---
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1924/ ---
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9973/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1715/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2963 retest this please ---
[GitHub] carbondata issue #2979: [CARBONDATA-3153] Complex delimiters change
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2979 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1925/ ---
[GitHub] carbondata issue #2979: [CARBONDATA-3153] Complex delimiters change
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2979 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9974/ ---
[GitHub] carbondata issue #2979: [CARBONDATA-3153] Complex delimiters change
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2979 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1714/ ---
[GitHub] carbondata issue #2983: [CARBONDATA-3119] Fixed SDK Write for Complex Array ...
Github user ajantha-bhat commented on the issue: https://github.com/apache/carbondata/pull/2983 retest this please ---
[GitHub] carbondata issue #2983: [CARBONDATA-3119] Fixed SDK Write for Complex Array ...
Github user ajantha-bhat commented on the issue: https://github.com/apache/carbondata/pull/2983 @ravipesala , @chenliang613 : please add @shivamasn to whitelist ---
[GitHub] carbondata pull request #2983: [CARBONDATA-3119] Fixed SDK Write for Complex...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2983#discussion_r240889253 --- Diff: store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonReaderTest.java --- @@ -2072,4 +1797,35 @@ public void testReadingNullValues() { } } + @Test public void testSdkWriteWhenArrayOfStringIsEmpty() + throws IOException, InvalidLoadOptionException { + +CarbonProperties.getInstance() +.addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FAIL"); --- End diff -- same as above, take a backup of CARBON_BAD_RECORDS_ACTION and set it at the end of the test case ---
[GitHub] carbondata pull request #2983: [CARBONDATA-3119] Fixed SDK Write for Complex...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2983#discussion_r240889185 --- Diff: store/sdk/src/test/java/org/apache/carbondata/sdk/file/CarbonReaderTest.java --- @@ -23,8 +23,11 @@ import java.util.*; import org.apache.avro.generic.GenericData; + --- End diff -- please revert unwanted changes, only your changes should be there in diff ---
[GitHub] carbondata pull request #2983: [CARBONDATA-3119] Fixed SDK Write for Complex...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2983#discussion_r240889074 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/complexType/TestComplexDataType.scala --- @@ -39,6 +39,27 @@ class TestComplexDataType extends QueryTest with BeforeAndAfterAll { .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, badRecordAction) } + test("test Projection PushDown for Array - String type when Array is Empty") { +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FAIL") --- End diff -- Take a previous CARBON_BAD_RECORDS_ACTION property as backup and set it back after this test case, else it affects other test suites ---
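The backup-and-restore pattern the reviewer asks for can be sketched generically (a plain Map stands in for CarbonProperties here; the key and values are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class PropertyBackupSketch {
    // Override a property for the duration of a test body, then restore
    // the previous value so later test suites are unaffected.
    static Map<String, String> demo() {
        Map<String, String> props = new HashMap<>();
        props.put("carbon.bad.records.action", "FORCE"); // pre-existing value

        String backup = props.get("carbon.bad.records.action");
        props.put("carbon.bad.records.action", "FAIL");
        try {
            // ... test body runs with the overridden property ...
        } finally {
            props.put("carbon.bad.records.action", backup); // restore
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(demo().get("carbon.bad.records.action")); // FORCE
    }
}
```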
[GitHub] carbondata pull request #2983: [CARBONDATA-3119] Fixed SDK Write for Complex...
Github user ajantha-bhat commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2983#discussion_r240889108 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/complexType/TestComplexDataType.scala --- @@ -39,6 +39,27 @@ class TestComplexDataType extends QueryTest with BeforeAndAfterAll {
       .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, badRecordAction)
   }

+  test("test Projection PushDown for Array - String type when Array is Empty") {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FAIL")
+    sql("drop table if exists table1")
+    sql("create table table1 (detail array) stored by 'carbondata'")
+    sql("insert into table1 values('')")
+    checkAnswer(sql("select detail[0] from table1"), Seq(Row("")))
+    sql("drop table if exists table1")
+  }
+
+  test("test Projection PushDown for Struct - Array type when Array is Empty") {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FAIL")
--- End diff -- same as above ---
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1713/ ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240886853 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModel.java --- @@ -631,7 +620,9 @@ public void setFactTimeStamp(long factTimeStamp) { }
   public String[] getDelimiters() {
-    return new String[] { complexDelimiterLevel1, complexDelimiterLevel2 };
+    String[] delimiters = new String[complexDelimiters.size()];
+    delimiters = complexDelimiters.toArray(delimiters);
+    return delimiters;
--- End diff -- This method is not required. Conversion can be done wherever required ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishgupta88 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240886665 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModel.java --- @@ -65,8 +65,7 @@ private String csvHeader; private String[] csvHeaderColumns; private String csvDelimiter; - private String complexDelimiterLevel1; - private String complexDelimiterLevel2; + private ArrayList complexDelimiters = new ArrayList<>(); --- End diff -- We can do lazy initialization in the setter method. This will avoid extra memory being consumed for non complex type schemas ---
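The lazy initialization the reviewer suggests could look like this (a sketch only; the real CarbonLoadModel setter may differ):

```java
import java.util.ArrayList;
import java.util.List;

public class LazyDelimitersSketch {
    // Stays null for non-complex schemas, avoiding the list allocation.
    private List<String> complexDelimiters;

    public void addComplexDelimiter(String delimiter) {
        if (complexDelimiters == null) {
            complexDelimiters = new ArrayList<>(); // allocated on first use
        }
        complexDelimiters.add(delimiter);
    }

    public int delimiterCount() {
        return complexDelimiters == null ? 0 : complexDelimiters.size();
    }

    public static void main(String[] args) {
        LazyDelimitersSketch model = new LazyDelimitersSketch();
        System.out.println(model.delimiterCount()); // 0, nothing allocated yet
        model.addComplexDelimiter("$");
        System.out.println(model.delimiterCount()); // 1
    }
}
```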
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1923/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9972/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1712/ ---
[jira] [Resolved] (CARBONDATA-3157) Integrate carbon lazy loading to presto carbon integration
[ https://issues.apache.org/jira/browse/CARBONDATA-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-3157. -- Resolution: Fixed Fix Version/s: 1.5.2 > Integrate carbon lazy loading to presto carbon integration > --- > > Key: CARBONDATA-3157 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3157 > Project: CarbonData > Issue Type: Improvement >Reporter: Ravindra Pesala >Priority: Major > Fix For: 1.5.2 > > Time Spent: 8h > Remaining Estimate: 0h > > Integrate carbon lazy loading to presto carbon integration -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2978 ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2978 @chenliang613 the column vector code (CarbonColumnVector interface) and a base implementation (CarbonColumnVectorImpl class) are in the carbon-core module, but every engine integration layer still needs to adapt them to its compute engine. ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2978 LGTM ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1922/ ---
[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2161 @chandrasaripaka OK. I will review this PR in detail this week, and plan to merge it soon. ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9971/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1711/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2963 retest this please ---
[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...
Github user chandrasaripaka commented on the issue: https://github.com/apache/carbondata/pull/2161 @xubo245 , please merge the PR, after a review ---
[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2161 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1921/ ---
[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2161 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9970/ ---
[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2161 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1710/ ---
[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...
Github user chandrasaripaka commented on the issue: https://github.com/apache/carbondata/pull/2161 > @chandrasaripaka Can you fix the CI error? @xubo245, just committed; please review and let me know. ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9969/ ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1920/ ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/2978 For 1.5.2: can we consider merging the vector code (for example CarbonVectorBatch) from the presto integration module into the core module? ---
[jira] [Resolved] (CARBONDATA-3116) set carbon.query.directQueryOnDataMap.enabled=true not working
[ https://issues.apache.org/jira/browse/CARBONDATA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-3116. -- Resolution: Fixed Fix Version/s: 1.5.2 > set carbon.query.directQueryOnDataMap.enabled=true not working > -- > > Key: CARBONDATA-3116 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3116 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.5.1 >Reporter: xubo245 >Assignee: xubo245 >Priority: Major > Fix For: 1.5.2 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > When I run: > {code:java} > spark.sql("drop table if exists mainTable") > spark.sql( > """CREATE TABLE mainTable > (id Int, > name String, > city String, > age Int) > STORED BY 'org.apache.carbondata.format'""".stripMargin); > spark.sql("LOAD DATA LOCAL INPATH > '/Users/xubo/Desktop/xubo/git/carbondata2/integration/spark-common-test/src/test/resources/sample.csv' > into table mainTable"); > spark.sql("create datamap preagg_sum on table mainTable using > 'preaggregate' as select id,sum(age) from mainTable group by id"); > spark.sql("show datamap on table mainTable"); > spark.sql("set carbon.query.directQueryOnDataMap.enabled=true"); > spark.sql("set carbon.query.directQueryOnDataMap.enabled"); > spark.sql("select count(*) from maintable_preagg_sum").show(); > spark.sql("select count(*) from maintable_preagg_sum").show(); > {code} > it will throw Exception > {code:java} > 2018-11-22 00:06:01 AUDIT audit:93 - {"time":"November 22, 2018 12:06:01 AM > CST","username":"xubo","opName":"SET","opId":"344656521959523","opStatus":"SUCCESS","opTime":"1 > ms","table":"NA","extraInfo":{}} > Exception in thread "main" org.apache.spark.sql.AnalysisException: Query On > DataMap not supported; > at > org.apache.spark.sql.optimizer.CarbonLateDecodeRule.validateQueryDirectlyOnDataMap(CarbonLateDecodeRule.scala:131) > at > org.apache.spark.sql.optimizer.CarbonLateDecodeRule.checkIfRuleNeedToBeApplied(CarbonLateDecodeRule.scala:79) > at > 
org.apache.spark.sql.optimizer.CarbonLateDecodeRule.apply(CarbonLateDecodeRule.scala:53) > at > org.apache.spark.sql.optimizer.CarbonLateDecodeRule.apply(CarbonLateDecodeRule.scala:47) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:85) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:82) > at > scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124) > at scala.collection.immutable.List.foldLeft(List.scala:84) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:82) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:74) > at scala.collection.immutable.List.foreach(List.scala:381) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:74) > at > org.apache.spark.sql.hive.CarbonOptimizer.execute(CarbonOptimizer.scala:35) > at > org.apache.spark.sql.hive.CarbonOptimizer.execute(CarbonOptimizer.scala:27) > at > org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:78) > at > org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:78) > at > org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:84) > at > org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:80) > at > org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:89) > at > org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:89) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:2837) > at org.apache.spark.sql.Dataset.head(Dataset.scala:2150) > at org.apache.spark.sql.Dataset.take(Dataset.scala:2363) > at org.apache.spark.sql.Dataset.showString(Dataset.scala:241) > at org.apache.spark.sql.Dataset.show(Dataset.scala:637) > at 
org.apache.spark.sql.Dataset.show(Dataset.scala:596) > at org.apache.spark.sql.Dataset.show(Dataset.scala:605) > at > org.apache.carbondata.examples.PreAggregateDataMapExample$.exampleBody(PreAggregateDataMapExample.scala:63) > at > org.apache.carbondata.examples.PreAggregateDataMapExample$.main(PreAggregateDataMapExample.scala:34) > at > org.apache.carbondata.examples.PreAggregateDataMapExample.main(PreAggregateDataMapExample.scala) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] carbondata pull request #2940: [CARBONDATA-3116] Support set carbon.query.di...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2940 ---
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9968/ ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1709/ ---
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1919/ ---
[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/2161 @chandrasaripaka Can you fix the CI error? ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1918/ ---
[GitHub] carbondata issue #2969: [CARBONDATA-3127]Fix the TestCarbonSerde exception
Github user zzcclp commented on the issue: https://github.com/apache/carbondata/pull/2969 LGTM ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9967/ ---
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1708/ ---
[GitHub] carbondata issue #2976: [CARBONDATA-2755][Complex DataType Enhancements] Com...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/2976 LGTM ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1707/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1917/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9964/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9966/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1706/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1916/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1914/ ---
[jira] [Created] (CARBONDATA-3163) If tables use different time formats, data in no_sort columns of the second table goes as bad records (null) when it is loaded after the first table.
Ajantha Bhat created CARBONDATA-3163: Summary: If tables use different time formats, data in no_sort columns of the second table goes as bad records (null) when it is loaded after the first table. Key: CARBONDATA-3163 URL: https://issues.apache.org/jira/browse/CARBONDATA-3163 Project: CarbonData Issue Type: Bug Reporter: Ajantha Bhat Assignee: Ajantha Bhat When two tables use different time formats, data in no_sort columns of the second table goes as bad records (null) when it is loaded after the first table. Reproducible via FilterProcessorTestCase.test("Between filter")
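As background for the report above: a load's timestamp format must match the table being loaded, and reusing a format carried over from an earlier table turns otherwise valid rows into bad records. A minimal sketch of that failure mode, using plain `SimpleDateFormat` and hypothetical formats rather than CarbonData's actual converter:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class TimestampFormatDemo {
    // Parse a value with the given format; return null (a "bad record") on mismatch.
    static Date parseOrNull(String value, SimpleDateFormat format) {
        try {
            return format.parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical per-table formats.
        SimpleDateFormat table1Format = new SimpleDateFormat("yyyy/MM/dd");
        SimpleDateFormat table2Format = new SimpleDateFormat("yyyy-MM-dd");
        table1Format.setLenient(false);
        table2Format.setLenient(false);

        String table2Row = "2018-10-01";

        // Correct per-table format: the row parses fine.
        System.out.println(parseOrNull(table2Row, table2Format) != null); // true
        // Stale format from the first table: the row becomes a bad record (null).
        System.out.println(parseOrNull(table2Row, table1Format) == null); // true
    }
}
```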
[jira] [Created] (CARBONDATA-3164) During a no_sort load, exceptions thrown at the converter step do not reach the user; same problem in the SDK and Spark file format flows.
Ajantha Bhat created CARBONDATA-3164: Summary: During a no_sort load, exceptions thrown at the converter step do not reach the user; same problem in the SDK and Spark file format flows. Key: CARBONDATA-3164 URL: https://issues.apache.org/jira/browse/CARBONDATA-3164 Project: CarbonData Issue Type: Bug Reporter: Ajantha Bhat Assignee: Ajantha Bhat During a no_sort load, an exception thrown at the converter step is not propagated to the user. The same problem exists in the SDK and Spark file format flows. Reproducible via TestLoadDataGeneral.test("test load / insert / update with data more than 32000 bytes - dictionary_exclude")
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2978 LGTM ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1915/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9965/ ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1911/ ---
[GitHub] carbondata issue #2980: [CARBONDATA-3017] Map DDL Support
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2980 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1705/ ---
[GitHub] carbondata pull request #2980: [CARBONDATA-3017] Map DDL Support
Github user manishnalla1994 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2980#discussion_r240591596 --- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/parser/impl/RowParserImpl.java --- @@ -34,8 +37,12 @@ private int numberOfColumns; public RowParserImpl(DataField[] output, CarbonDataLoadConfiguration configuration) { -String[] complexDelimiters = +String[] tempComplexDelimiters = (String[]) configuration.getDataLoadProperty(DataLoadProcessorConstants.COMPLEX_DELIMITERS); +Queue<String> complexDelimiters = new LinkedList<>(); +for (int i = 0; i < 4; i++) { --- End diff -- Done. ---
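For context, the change under review replaces a delimiter array with a FIFO queue, so that each nesting level of a complex type consumes the next delimiter in order. A minimal sketch of that conversion (the delimiter values here are illustrative, not necessarily CarbonData's defaults):

```java
import java.util.LinkedList;
import java.util.Queue;

public class DelimiterQueueDemo {
    // Build a FIFO queue of complex-type delimiters so each nesting level
    // polls the next delimiter in declaration order.
    static Queue<String> toDelimiterQueue(String[] delimiters) {
        Queue<String> queue = new LinkedList<>();
        for (String d : delimiters) {
            queue.offer(d);
        }
        return queue;
    }

    public static void main(String[] args) {
        Queue<String> q = toDelimiterQueue(new String[] {"$", ":", "@", "#"});
        System.out.println(q.poll()); // outermost level: $
        System.out.println(q.poll()); // next nesting level: :
    }
}
```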
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9962/ ---
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/1912/ ---
[jira] [Created] (CARBONDATA-3162) Range filters do not remove null values for no_sort direct dictionary dimension columns.
Ajantha Bhat created CARBONDATA-3162: Summary: Range filters do not remove null values for no_sort direct dictionary dimension columns. Key: CARBONDATA-3162 URL: https://issues.apache.org/jira/browse/CARBONDATA-3162 Project: CarbonData Issue Type: Bug Reporter: Ajantha Bhat Assignee: Ajantha Bhat Range filters do not remove null values for no_sort direct dictionary dimension columns. Reproducible via TimestampDataTypeDirectDictionaryTest.test("test timestamp with dictionary include and no_inverted index")
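For reference, a SQL range predicate is never satisfied by NULL, so a correct filter has to drop null values explicitly. A minimal sketch of the expected semantics, using plain Java lists rather than CarbonData's filter executors:

```java
import java.util.ArrayList;
import java.util.List;

public class RangeFilterNullDemo {
    // Apply "value > low AND value < high" with SQL semantics:
    // a NULL value never satisfies a range predicate, so it is excluded.
    static List<Long> rangeFilter(List<Long> values, long low, long high) {
        List<Long> result = new ArrayList<>();
        for (Long v : values) {
            if (v != null && v > low && v < high) {
                result.add(v);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> values = new ArrayList<>();
        values.add(5L);
        values.add(null); // must NOT survive the range filter
        values.add(15L);
        values.add(25L);
        System.out.println(rangeFilter(values, 0, 20)); // [5, 15]
    }
}
```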
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1704/ ---
[GitHub] carbondata issue #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap example
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2963 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1703/ ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/9960/ ---
[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2963#discussion_r240581684 --- Diff: integration/spark2/src/test/scala/org/apache/carbondata/datamap/minmax/MinMaxDataMapFunctionSuite.scala --- @@ -0,0 +1,415 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.carbondata.datamap.minmax + +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.BeforeAndAfterAll + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties + +class MinMaxDataMapFunctionSuite extends QueryTest with BeforeAndAfterAll { + private val minmaxDataMapFactoryName = "org.apache.carbondata.datamap.minmax.MinMaxDataMapFactory" + var originalStatEnabled = CarbonProperties.getInstance().getProperty( +CarbonCommonConstants.ENABLE_QUERY_STATISTICS, +CarbonCommonConstants.ENABLE_QUERY_STATISTICS_DEFAULT) + + override protected def beforeAll(): Unit = { +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "true") + CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, + "yyyy-MM-dd") + CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, + "yyyy-MM-dd HH:mm:ss") --- End diff -- I think this modification is OK. We explicitly specify the format here to indicate that this is just the format of our input data. (I'm afraid the default behavior will change later) ---
[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2963#discussion_r240580992 --- Diff: datamap/example/src/main/java/org/apache/carbondata/datamap/minmax/MinMaxDataMapFactory.java --- @@ -0,0 +1,365 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.carbondata.datamap.minmax; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.concurrent.ConcurrentHashMap; + +import org.apache.carbondata.common.annotations.InterfaceAudience; +import org.apache.carbondata.common.exceptions.sql.MalformedDataMapCommandException; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.cache.Cache; +import org.apache.carbondata.core.cache.CacheProvider; +import org.apache.carbondata.core.cache.CacheType; +import org.apache.carbondata.core.datamap.DataMapDistributable; +import org.apache.carbondata.core.datamap.DataMapLevel; +import org.apache.carbondata.core.datamap.DataMapMeta; +import org.apache.carbondata.core.datamap.DataMapStoreManager; +import org.apache.carbondata.core.datamap.Segment; +import org.apache.carbondata.core.datamap.TableDataMap; +import org.apache.carbondata.core.datamap.dev.DataMapBuilder; +import org.apache.carbondata.core.datamap.dev.DataMapWriter; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap; +import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMapFactory; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.datastore.filesystem.CarbonFile; +import org.apache.carbondata.core.datastore.filesystem.CarbonFileFilter; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.features.TableOperation; +import org.apache.carbondata.core.metadata.schema.table.CarbonTable; +import org.apache.carbondata.core.metadata.schema.table.DataMapSchema; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn; +import org.apache.carbondata.core.scan.filter.intf.ExpressionType; +import org.apache.carbondata.core.statusmanager.SegmentStatusManager; +import 
org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.path.CarbonTablePath; +import org.apache.carbondata.events.Event; + +import org.apache.log4j.Logger; + +/** + * Min Max DataMap Factory + */ +@InterfaceAudience.Internal +public class MinMaxDataMapFactory extends CoarseGrainDataMapFactory { + private static final Logger LOGGER = + LogServiceFactory.getLogService(MinMaxDataMapFactory.class.getName()); + private DataMapMeta dataMapMeta; + private String dataMapName; + // segmentId -> list of index files + private Map> segmentMap = new ConcurrentHashMap<>(); + private Cache cache; + + public MinMaxDataMapFactory(CarbonTable carbonTable, DataMapSchema dataMapSchema) + throws MalformedDataMapCommandException { +super(carbonTable, dataMapSchema); + +// this is an example for datamap, we can choose the columns and operations that +// will be supported by this datamap. Furthermore, we can add cache-support for this datamap. + +this.dataMapName = dataMapSchema.getDataMapName(); +List indexedColumns = carbonTable.getIndexedColumns(dataMapSchema); + +// operations that will be supported on the indexed columns +List optOperations = new ArrayList<>(); +optOperations.add(ExpressionType.NOT); +optOperations.add(ExpressionType.EQUALS); +optOperations.add(ExpressionType.NOT_EQUALS); +optOperations.add(ExpressionType.GREATERTHAN); +optOperations.add(ExpressionType.GREATERTHAN_EQUALTO); +optOperations.add(ExpressionType.LESSTHAN);
[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...
Github user qiuchenjian commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2963#discussion_r240581131 --- Diff: datamap/example/src/main/java/org/apache/carbondata/datamap/minmax/AbstractMinMaxDataMapWriter.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.carbondata.datamap.minmax; + +import java.io.DataOutputStream; +import java.io.IOException; +import java.math.BigDecimal; +import java.util.List; + +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.constants.CarbonCommonConstants; +import org.apache.carbondata.core.datamap.Segment; +import org.apache.carbondata.core.datamap.dev.DataMapWriter; +import org.apache.carbondata.core.datastore.impl.FileFactory; +import org.apache.carbondata.core.datastore.page.ColumnPage; +import org.apache.carbondata.core.datastore.page.encoding.bool.BooleanConvert; +import org.apache.carbondata.core.datastore.page.statistics.ColumnPageStatsCollector; +import org.apache.carbondata.core.datastore.page.statistics.KeyPageStatsCollector; +import org.apache.carbondata.core.datastore.page.statistics.PrimitivePageStatsCollector; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.datatype.DataTypes; +import org.apache.carbondata.core.metadata.encoder.Encoding; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.DataTypeUtil; + +import org.apache.log4j.Logger; + +/** + * We will record the min & max value for each index column in each blocklet. + * Since the size of index is quite small, we will combine the index for all index columns + * in one file. 
+ */ +public abstract class AbstractMinMaxDataMapWriter extends DataMapWriter { + private static final Logger LOGGER = LogServiceFactory.getLogService( + AbstractMinMaxDataMapWriter.class.getName()); + + private ColumnPageStatsCollector[] indexColumnMinMaxCollectors; + protected int currentBlockletId; + private String currentIndexFile; + private DataOutputStream currentIndexFileOutStream; + + public AbstractMinMaxDataMapWriter(String tablePath, String dataMapName, + List indexColumns, Segment segment, String shardName) throws IOException { +super(tablePath, dataMapName, indexColumns, segment, shardName); +initStatsCollector(); +initDataMapFile(); + } + + private void initStatsCollector() { +indexColumnMinMaxCollectors = new ColumnPageStatsCollector[indexColumns.size()]; +CarbonColumn indexCol; +for (int i = 0; i < indexColumns.size(); i++) { + indexCol = indexColumns.get(i); + if (indexCol.isMeasure() + || (indexCol.isDimension() + && DataTypeUtil.isPrimitiveColumn(indexCol.getDataType()) + && !indexCol.hasEncoding(Encoding.DICTIONARY) + && !indexCol.hasEncoding(Encoding.DIRECT_DICTIONARY))) { +indexColumnMinMaxCollectors[i] = PrimitivePageStatsCollector.newInstance( +indexColumns.get(i).getDataType()); + } else { +indexColumnMinMaxCollectors[i] = KeyPageStatsCollector.newInstance(DataTypes.BYTE_ARRAY); + } +} + } + + private void initDataMapFile() throws IOException { +if (!FileFactory.isFileExist(dataMapPath) && +!FileFactory.mkdirs(dataMapPath, FileFactory.getFileType(dataMapPath))) { + throw new IOException("Failed to create directory " + dataMapPath); +} + +try { + currentIndexFile = MinMaxIndexDataMap.getIndexFile(dataMapPath, + MinMaxIndexHolder.MINMAX_INDEX_PREFFIX + indexColumns.size()); + FileFactory.createNewFile(currentIndexFile, FileFactory.getFileType(currentIndexFile)); + currentIndexFileOutStream =
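The quoted AbstractMinMaxDataMapWriter wires a ColumnPageStatsCollector per index column to record per-blocklet min/max values, which the datamap later uses for pruning. A stripped-down sketch of that underlying idea, using a hypothetical long-valued column rather than the project's collector API:

```java
public class MinMaxCollectorDemo {
    // Track the min and max of values seen for one column in one blocklet,
    // mirroring what a primitive-type stats collector accumulates.
    static final class MinMaxCollector {
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;

        void update(long value) {
            if (value < min) min = value;
            if (value > max) max = value;
        }

        // A blocklet can be pruned when the queried range misses [min, max].
        boolean mightContain(long low, long high) {
            return low <= max && high >= min;
        }

        long getMin() { return min; }
        long getMax() { return max; }
    }

    public static void main(String[] args) {
        MinMaxCollector collector = new MinMaxCollector();
        for (long v : new long[] {12, 7, 30, 19}) {
            collector.update(v);
        }
        System.out.println(collector.getMin() + ".." + collector.getMax()); // 7..30
        System.out.println(collector.mightContain(40, 50)); // false -> blocklet pruned
    }
}
```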
[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2963#discussion_r240579947 --- Diff: datamap/example/src/main/java/org/apache/carbondata/datamap/minmax/AbstractMinMaxDataMapWriter.java --- ---
[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2963#discussion_r240579236 --- Diff: datamap/example/src/main/java/org/apache/carbondata/datamap/minmax/AbstractMinMaxDataMapWriter.java --- ---
[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...
Github user xuchuanyin commented on a diff in the pull request: https://github.com/apache/carbondata/pull/2963#discussion_r240578382

--- Diff: datamap/example/src/main/java/org/apache/carbondata/datamap/minmax/AbstractMinMaxDataMapWriter.java ---
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.datamap.minmax;
+
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.math.BigDecimal;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datamap.Segment;
+import org.apache.carbondata.core.datamap.dev.DataMapWriter;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.datastore.page.ColumnPage;
+import org.apache.carbondata.core.datastore.page.encoding.bool.BooleanConvert;
+import org.apache.carbondata.core.datastore.page.statistics.ColumnPageStatsCollector;
+import org.apache.carbondata.core.datastore.page.statistics.KeyPageStatsCollector;
+import org.apache.carbondata.core.datastore.page.statistics.PrimitivePageStatsCollector;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.encoder.Encoding;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.core.util.DataTypeUtil;
+
+import org.apache.log4j.Logger;
+
+/**
+ * We will record the min & max value for each index column in each blocklet.
+ * Since the size of index is quite small, we will combine the index for all index columns
+ * in one file.
+ */
+public abstract class AbstractMinMaxDataMapWriter extends DataMapWriter {
+  private static final Logger LOGGER = LogServiceFactory.getLogService(
+      AbstractMinMaxDataMapWriter.class.getName());
+
+  private ColumnPageStatsCollector[] indexColumnMinMaxCollectors;
+  protected int currentBlockletId;
+  private String currentIndexFile;
+  private DataOutputStream currentIndexFileOutStream;
+
+  public AbstractMinMaxDataMapWriter(String tablePath, String dataMapName,
+      List<CarbonColumn> indexColumns, Segment segment, String shardName) throws IOException {
+    super(tablePath, dataMapName, indexColumns, segment, shardName);
+    initStatsCollector();
+    initDataMapFile();
+  }
+
+  private void initStatsCollector() {
+    indexColumnMinMaxCollectors = new ColumnPageStatsCollector[indexColumns.size()];
+    CarbonColumn indexCol;
+    for (int i = 0; i < indexColumns.size(); i++) {
+      indexCol = indexColumns.get(i);
+      if (indexCol.isMeasure()
+          || (indexCol.isDimension()
+              && DataTypeUtil.isPrimitiveColumn(indexCol.getDataType())
+              && !indexCol.hasEncoding(Encoding.DICTIONARY)
+              && !indexCol.hasEncoding(Encoding.DIRECT_DICTIONARY))) {
+        indexColumnMinMaxCollectors[i] = PrimitivePageStatsCollector.newInstance(
+            indexColumns.get(i).getDataType());
+      } else {
+        indexColumnMinMaxCollectors[i] = KeyPageStatsCollector.newInstance(DataTypes.BYTE_ARRAY);
+      }
+    }
+  }
+
+  private void initDataMapFile() throws IOException {
+    if (!FileFactory.isFileExist(dataMapPath) &&
+        !FileFactory.mkdirs(dataMapPath, FileFactory.getFileType(dataMapPath))) {
+      throw new IOException("Failed to create directory " + dataMapPath);
+    }
+
+    try {
+      currentIndexFile = MinMaxIndexDataMap.getIndexFile(dataMapPath,
+          MinMaxIndexHolder.MINMAX_INDEX_PREFFIX + indexColumns.size());
+      FileFactory.createNewFile(currentIndexFile, FileFactory.getFileType(currentIndexFile));
+      currentIndexFileOutStream =
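The writer above collects per-blocklet min/max statistics for each index column so that queries can later skip blocklets whose value range cannot match a predicate. A minimal standalone sketch of that idea (hypothetical class and method names, not the CarbonData API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a per-blocklet min/max index: while writing, each
// blocklet records the min and max of an index column; while querying, a
// blocklet is pruned if the searched value falls outside its [min, max].
public class MinMaxPruneSketch {
  static class BlockletStats {
    final int blockletId;
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;

    BlockletStats(int blockletId) {
      this.blockletId = blockletId;
    }

    // Called once per row while the blocklet is being written.
    void update(long value) {
      if (value < min) min = value;
      if (value > max) max = value;
    }
  }

  // Return the ids of blocklets that may contain v; all others are pruned.
  static List<Integer> prune(List<BlockletStats> index, long v) {
    List<Integer> hits = new ArrayList<>();
    for (BlockletStats s : index) {
      if (v >= s.min && v <= s.max) {
        hits.add(s.blockletId);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    BlockletStats b0 = new BlockletStats(0);
    b0.update(1); b0.update(10);
    BlockletStats b1 = new BlockletStats(1);
    b1.update(100); b1.update(200);
    // Only blocklet 1 can contain 150.
    System.out.println(prune(List.of(b0, b1), 150)); // prints [1]
  }
}
```

Because such an index is tiny (two values per column per blocklet), the PR stores the entries for all index columns in a single file, as the class comment notes.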
[jira] [Created] (CARBONDATA-3161) Pipe "|" delimiter is not working for streaming table
Pawan Malwal created CARBONDATA-3161: Summary: Pipe "|" delimiter is not working for streaming table Key: CARBONDATA-3161 URL: https://issues.apache.org/jira/browse/CARBONDATA-3161 Project: CarbonData Issue Type: Bug Components: data-load Reporter: Pawan Malwal Assignee: Pawan Malwal
CSV data with "|" as a delimiter is not getting loaded into a streaming table correctly.
*DDL:*
create table table1_st(begintime TIMESTAMP, deviceid STRING, statcycle INT, topologypath STRING, devicetype STRING, rebootnum INT) stored by 'carbondata' TBLPROPERTIES('SORT_SCOPE'='GLOBAL_SORT','sort_columns'='deviceid,begintime','streaming'='true');
*Run in spark shell:*
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.SparkSession.Builder;
import org.apache.spark.sql.CarbonSession;
import org.apache.spark.sql.CarbonSession.CarbonBuilder;
import org.apache.spark.sql.streaming._
import org.apache.carbondata.streaming.parser._
val enableHiveSupport = SparkSession.builder().enableHiveSupport();
val carbon = new CarbonBuilder(enableHiveSupport).getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/")
val df = carbon.readStream.text("/user/*.csv")
val qrymm_0001 = df.writeStream.format("carbondata").option(CarbonStreamParser.CARBON_STREAM_PARSER, CarbonStreamParser.CARBON_STREAM_PARSER_CSV).*option("delimiter","|")*.option("header","false").option("dbName","stdb").option("checkpointLocation", "/tmp/tb1").option("bad_records_action","FORCE").option("tableName","table1_st").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("TIMESTAMPFORMAT","-dd-MM HH:mm:ss").start
*Sample records:*
begintime| deviceid| statcycle| topologypath| devicetype| rebootnum
2018-10-01 00:00:00|Device1|0|dsad|STB|9
2018-10-01 00:05:00|Device1|0|Rsad|STB|4
2018-10-01 00:10:00|Device1|0|fsf|STB|6
2018-10-01 00:15:00|Device1|0|fdgf|STB|8
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
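One classic Java cause of exactly this symptom is that `String.split` takes a regular expression, and an unescaped "|" is alternation of two empty patterns, so it matches between every character and shreds the row. Whether this is the actual root cause inside CarbonData's CSV stream parser is not shown in the report, so the following is an illustrative sketch of the pitfall only:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PipeDelimiterSketch {
  // Correct approach: treat the delimiter as a literal, not a regex.
  static String[] splitRow(String row, String delimiter) {
    return row.split(Pattern.quote(delimiter));
  }

  public static void main(String[] args) {
    String row = "2018-10-01 00:00:00|Device1|0|dsad|STB|9";

    // Buggy: String.split takes a regex; an unescaped "|" matches the empty
    // string between every character, producing one element per character.
    System.out.println(row.split("|").length);

    // Correct: the row splits into the 6 fields the sample records expect.
    System.out.println(Arrays.toString(splitRow(row, "|")));
  }
}
```

`Pattern.quote("|")` (or the escape `"\\|"`) makes the pipe literal; comma and tab delimiters hide this bug because they are not regex metacharacters, which is why "|" in particular fails.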
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1702/ ---
[GitHub] carbondata issue #2969: [CARBONDATA-3127]Fix the TestCarbonSerde exception
Github user xuchuanyin commented on the issue: https://github.com/apache/carbondata/pull/2969 LGTM ---
[jira] [Resolved] (CARBONDATA-3147) Preaggregate dataload fails in case of concurrent load in some cases
[ https://issues.apache.org/jira/browse/CARBONDATA-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala resolved CARBONDATA-3147. - Resolution: Fixed Fix Version/s: 1.5.2 > Preaggregate dataload fails in case of concurrent load in some cases > > > Key: CARBONDATA-3147 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3147 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Major > Fix For: 1.5.2 > > Time Spent: 6h 50m > Remaining Estimate: 0h > > java.io.IOException: Entry not found to update in the table status file > at > org.apache.carbondata.processing.util.CarbonLoaderUtil.recordNewLoadMetadata(CarbonLoaderUtil.java:320) > at > org.apache.carbondata.processing.util.CarbonLoaderUtil.recordNewLoadMetadata(CarbonLoaderUtil.java:207) > at > org.apache.carbondata.processing.util.CarbonLoaderUtil.updateTableStatusForFailure(CarbonLoaderUtil.java:467) > at > org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:358) > at > org.apache.spark.sql.execution.command.preaaggregate.PreAggregateUtil$.startDataLoadForDataMap(PreAggregateUtil.scala:603) > at > org.apache.spark.sql.execution.command.preaaggregate.LoadPostAggregateListener$$anonfun$onEvent$10.apply(PreAggregateListeners.scala:488) > at > org.apache.spark.sql.execution.command.preaaggregate.LoadPostAggregateListener$$anonfun$onEvent$10.apply(PreAggregateListeners.scala:463) > at > scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733) > at > scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) > at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) > at > scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732) > at > org.apache.spark.sql.execution.command.preaaggregate.LoadPostAggregateListener$.onEvent(PreAggregateListeners.scala:463) > at > 
org.apache.carbondata.events.OperationListenerBus.fireEvent(OperationListenerBus.java:83) > at > org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:524) > at > org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:594) > at > org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:322) > at > org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:147) > at > org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:144) > at > org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) > at > org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:140) > at > org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:144) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:59) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:57) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:75) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132) > at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113) > at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:125) > at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:125) > at 
org.apache.spark.sql.Dataset.<init>(Dataset.scala:185) > at > org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:90) > at > org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:89) > at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:135) > at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:87) > at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:252) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:183) > at >
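The "Entry not found to update in the table status file" failure above is a read-modify-write race: two concurrent loads rewrite shared load metadata, and one writer no longer finds the entry it read a moment earlier. A generic sketch of guarding such an update with a lock (an in-memory stand-in, not CarbonLoaderUtil's actual file-based implementation):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Generic sketch: a shared status registry updated by concurrent loads.
// Without the lock, one load can rewrite the registry between another
// load's read and its write, so the second writer fails to find the entry
// it intended to update.
public class TableStatusSketch {
  private final Map<String, String> entries = new ConcurrentHashMap<>();
  private final ReentrantLock lock = new ReentrantLock();

  public void recordNewLoad(String segmentId) {
    entries.put(segmentId, "INSERT_IN_PROGRESS");
  }

  // Read-modify-write under a lock so the entry cannot vanish in between.
  public void markLoadStatus(String segmentId, String status) {
    lock.lock();
    try {
      if (!entries.containsKey(segmentId)) {
        throw new IllegalStateException(
            "Entry not found to update in the table status file: " + segmentId);
      }
      entries.put(segmentId, status);
    } finally {
      lock.unlock();
    }
  }
}
```

Across processes the same idea needs a distributed or file lock rather than a JVM-local `ReentrantLock`, which is the shape of fix PR #2977 (closed below) applies to this area.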
[GitHub] carbondata issue #2966: [WIP] test and check no sort by default
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2966 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1701/ ---
[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/2978 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/1700/ ---
[GitHub] carbondata pull request #2977: [CARBONDATA-3147] Fixed concurrent load issue
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/2977 ---