[GitHub] carbondata issue #1387: [CARBONDATA-1503][WIP]Support CarbonFileInputFormat
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1387 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/422/ ---
[GitHub] carbondata issue #1387: [CARBONDATA-1503][WIP]Support CarbonFileInputFormat
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1387 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1052/ ---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1361 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1051/ ---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1361 Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/297/ ---
[GitHub] carbondata issue #1361: [CARBONDATA-1481] Add test cases for compaction of g...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1361 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/421/ ---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1361#discussion_r143902837

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortFunctionTest.scala ---
@@ -0,0 +1,535 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortFunctionTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+  val filePath: String = s"$resourcesPath/globalsort"
+  val file1: String = resourcesPath + "/globalsort/sample1.csv"
+  val file2: String = resourcesPath + "/globalsort/sample2.csv"
+  val file3: String = resourcesPath + "/globalsort/sample3.csv"
+
+  override def beforeEach {
+    resetConf
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql(
+      """
+        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+      """.stripMargin)
+
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+    sql(
+      """
+        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+  }
+
+  override def afterEach {
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+  }
+
+  test("Compaction type: major") {
+    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
+    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
+    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
+
+    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort")
+    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort")
+    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort")
+
+    sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'")
--- End diff --

I added some parameter tests for major compaction in CompactionSupportGlobalSortParameterTest.scala

---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1361#discussion_r143902415

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortParameterTest.scala ---
@@ -0,0 +1,298 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, FilenameFilter}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+class CompactionSupportGlobalSortParameterTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+  val filePath: String = s"$resourcesPath/globalsort"
+  val file1: String = resourcesPath + "/globalsort/sample1.csv"
+  val file2: String = resourcesPath + "/globalsort/sample2.csv"
+  val file3: String = resourcesPath + "/globalsort/sample3.csv"
+
+  override def beforeEach {
+    resetConf
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql(
+      """
+        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+      """.stripMargin)
+
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+    sql(
+      """
+        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+  }
+
+  override def afterEach {
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+  }
+
+  test("ENABLE_AUTO_LOAD_MERGE: false") {
+    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "false")
+    for (i <- 0 until 2) {
+      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort")
+      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort")
+      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort")
+
+      sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+      sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+      sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')")
+    }
+    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort")
+
+    checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name")
+
+    sql("delete from table compaction_globalsort where SEGMENT.ID in (1,2,3)")
+    sql("delete from table carbon_localsort where SEGMENT.ID in (1,2,3)")
+    sql("ALTER TABLE compaction_globalsort COMPACT 'minor'")
+    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), false, "Compacted")
+
+    val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort")
+    val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) }
+    assert(!SegmentSequenceIds.contains("0.1"))
+    assert(SegmentSequenceIds.length == 6)
+
+    checkAnswer(sql("SELECT COUNT(*) FROM compaction_globalsort"), Seq(Row(12)))
+
+    checkAnswer(sql("SELECT * FROM compaction_globalsort"),
+      sql("SELECT * FROM carbon_localsort"))
+
+    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Success")
+    checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Marked for Delete")
+    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE,
[GitHub] carbondata issue #1404: [CARBONDATA-1541] There are some errors when bad_rec...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1404 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1050/ ---
[GitHub] carbondata issue #1404: [CARBONDATA-1541] There are some errors when bad_rec...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1404 Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/296/ ---
[GitHub] carbondata issue #1404: [CARBONDATA-1541] There are some errors when bad_rec...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1404 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/420/ ---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1361#discussion_r143898742

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, PrintWriter}
+
+import scala.util.Random
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+  val file1 = resourcesPath + "/compaction/fil1.csv"
+  val file2 = resourcesPath + "/compaction/fil2.csv"
+  val file3 = resourcesPath + "/compaction/fil3.csv"
+  val file4 = resourcesPath + "/compaction/fil4.csv"
+  val file5 = resourcesPath + "/compaction/fil5.csv"
+
+  override protected def beforeAll(): Unit = {
+    resetConf("10")
+    // n should be about 500 of reset if size is default 1024
+    val n = 15
+    CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
+    CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
+    CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
+    CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
+    CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
+  }
+
+  override protected def afterAll(): Unit = {
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
+    resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
+  }
+
+  override def beforeEach {
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql(
+      """
+        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+      """.stripMargin)
+
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+    sql(
+      """
+        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+  }
+
+  override def afterEach {
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+  }
+
+  test("Compaction major: segments size is bigger than default compaction size") {
+    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+
+    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1361#discussion_r143898738

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, PrintWriter}
+
+import scala.util.Random
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+  val file1 = resourcesPath + "/compaction/fil1.csv"
+  val file2 = resourcesPath + "/compaction/fil2.csv"
+  val file3 = resourcesPath + "/compaction/fil3.csv"
+  val file4 = resourcesPath + "/compaction/fil4.csv"
+  val file5 = resourcesPath + "/compaction/fil5.csv"
+
+  override protected def beforeAll(): Unit = {
+    resetConf("10")
+    // n should be about 500 of reset if size is default 1024
+    val n = 15
+    CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
+    CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
+    CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
+    CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
+    CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
+  }
+
+  override protected def afterAll(): Unit = {
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
+    resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
+  }
+
+  override def beforeEach {
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql(
+      """
+        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+      """.stripMargin)
+
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+    sql(
+      """
+        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+  }
+
+  override def afterEach {
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+  }
+
+  test("Compaction major: segments size is bigger than default compaction size") {
+    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+
+    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH
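The big-file test above builds its input by calling a `createFile(path, size, start)` helper in `beforeAll()` to write `n`, `n * 4`, `n * 3`, ... rows into the five CSVs. The quoted diff is truncated before that helper appears, so the sketch below is an illustrative Java reconstruction of what such a fixture generator might look like (the row format and field names are assumptions, not the actual test code):

```java
import java.io.File;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Random;

public class CsvFixture {
    // Writes `size` CSV rows of (id, name, city, age) starting at id `start`,
    // in the spirit of the createFile helper the quoted test calls in beforeAll().
    public static void createFile(String path, int size, int start) throws Exception {
        try (PrintWriter writer = new PrintWriter(new File(path))) {
            Random random = new Random();
            for (int i = start; i < start + size; i++) {
                // id, a derived name, one of a few cities, and a random age
                writer.println(i + ",name" + i + ",city" + (i % 50) + "," + (20 + random.nextInt(40)));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("fil1", ".csv");
        createFile(tmp.toString(), 15, 0);   // like createFile(file1, n, 0) with n = 15
        List<String> lines = Files.readAllLines(tmp);
        System.out.println(lines.size());    // 15 rows written
        Files.delete(tmp);
    }
}
```

Generating fixtures this way (rather than checking in large CSVs) keeps the repository small and lets each test size its segments relative to the compaction threshold it configures with `resetConf("10")`.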
[GitHub] carbondata pull request #1404: [CARBONDATA-1541] There are some errors when ...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1404#discussion_r143897498

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/LoadDataWithBadRecords.scala ---
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.integration.spark.testsuite.dataload
+
+import java.io.File
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+class LoadDataWithBadRecords extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
--- End diff --

Ok, I have added the Test suffix to the class name.

---
[GitHub] carbondata pull request #1404: [CARBONDATA-1541] There are some errors when ...
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1404#discussion_r143897439

--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/LoadDataWithBadRecords.scala ---
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.integration.spark.testsuite.dataload
+
+import java.io.File
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+class LoadDataWithBadRecordsTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+  override def beforeEach(): Unit = {
+    sql("drop table if exists sales")
+    sql("drop table if exists int_table")
+    sql("drop table if exists boolean_table")
+    sql(
+      """CREATE TABLE IF NOT EXISTS sales(ID BigInt, date Timestamp, country String,
+          actual_price Double, Quantity int, sold_price Decimal(19,2)) STORED BY 'carbondata'""")
+    sql("CREATE TABLE if not exists int_table(intField INT) STORED BY 'carbondata'")
+    sql("CREATE TABLE if not exists boolean_table(booleanField INT) STORED BY 'carbondata'")
+  }
+
+  override def beforeAll(): Unit = {
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd")
+      .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd")
+      .addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC,
+        new File("./target/test/badRecords")
+          .getCanonicalPath)
+  }
+
+  override def afterAll(): Unit = {
+    sql("drop table if exists sales")
+    sql("drop table if exists int_table")
+    sql("drop table if exists boolean_table")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
+      .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT)
+      .addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC, CarbonCommonConstants.CARBON_BADRECORDS_LOC_DEFAULT_VAL)
+  }
+
+  val rootPath = new File(this.getClass.getResource("/").getPath
+    ++ "../../../../").getCanonicalPath
+
+  val path = s"$rootPath/integration/spark-common-test/src/test/resources/badrecords/datasample.csv"
+
+  test("The bad_records_action: FORCE") {
+    sql("LOAD DATA local inpath '" + path + "' INTO TABLE sales OPTIONS" +
--- End diff --

Ok, I have changed it.

---
[GitHub] carbondata issue #1407: [CARBONDATA-1549] CarbonProperties should be default...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1407 @jackylk Please review it. ---
[GitHub] carbondata issue #1407: [CARBONDATA-1549] CarbonProperties should be default...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1407 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1049/ ---
[GitHub] carbondata issue #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1362 @jackylk I have added some descriptions for test cases in this PR. ---
[GitHub] carbondata issue #1407: [CARBONDATA-1549] CarbonProperties should be default...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1407 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/419/ ---
[GitHub] carbondata issue #1407: [CARBONDATA-1549] CarbonProperties should be default...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1407 Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/295/ ---
[jira] [Updated] (CARBONDATA-1506) SDV tests error in CI
[ https://issues.apache.org/jira/browse/CARBONDATA-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xubo245 updated CARBONDATA-1506:
--------------------------------
    Description:
Sometimes, there is an error in SDV tests:

== Results ==
!== Correct Answer - 1 ==   == Spark Answer - 1 ==
![0.0]                      [1.0E-5]

Sometimes it is correct. It's better to fix this error.

Failed CI runs:
http://144.76.159.231:8080/job/ApacheSDVTests/879/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC075/
http://144.76.159.231:8080/job/ApacheSDVTests/841/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC074/
http://144.76.159.231:8080/job/ApacheSDVTests/797/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC074/
http://144.76.159.231:8080/job/ApacheSDVTests/1037/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC076/
http://144.76.159.231:8080/job/ApacheSDVTests/1042/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC076/

  was:
Sometimes, there is an error in SDV tests:

== Results ==
!== Correct Answer - 1 ==   == Spark Answer - 1 ==
![0.0]                      [1.0E-5]

Sometimes it is correct. It's better to fix this error.

Failed CI runs:
http://144.76.159.231:8080/job/ApacheSDVTests/879/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC075/
http://144.76.159.231:8080/job/ApacheSDVTests/841/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC074/
http://144.76.159.231:8080/job/ApacheSDVTests/797/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC074/
http://144.76.159.231:8080/job/ApacheSDVTests/1037/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC076/

> SDV tests error in CI
> ---------------------
>
>                 Key: CARBONDATA-1506
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1506
>             Project: CarbonData
>          Issue Type: Bug
>          Components: test
>            Reporter: xubo245
>            Assignee: xubo245
>            Priority: Minor
>   Original Estimate: 480h
>          Time Spent: 1.5h
>  Remaining Estimate: 478.5h
>
> Sometimes, there is an error in SDV tests:
> == Results ==
> !== Correct Answer - 1 ==   == Spark Answer - 1 ==
> ![0.0]                      [1.0E-5]
> Sometimes it is correct. It's better to fix this error.
> Failed CI runs:
> http://144.76.159.231:8080/job/ApacheSDVTests/879/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC075/
> http://144.76.159.231:8080/job/ApacheSDVTests/841/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC074/
> http://144.76.159.231:8080/job/ApacheSDVTests/797/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC074/
> http://144.76.159.231:8080/job/ApacheSDVTests/1037/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC076/
> http://144.76.159.231:8080/job/ApacheSDVTests/1042/testReport/junit/org.apache.carbondata.cluster.sdv.generated/QueriesBasicTestCase/PushUP_FILTER_uniqdata_TC076/

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
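The mismatch reported above (`0.0` vs `1.0E-5`) is the classic symptom of asserting exact equality on a floating-point result. A common remedy for this kind of flaky assertion is to compare with a tolerance rather than `==`; the Java sketch below is illustrative only (the helper name and tolerance values are assumptions, not the project's actual fix):

```java
public class ApproxEquals {
    // Treat two doubles as equal when their difference is within eps:
    // an absolute tolerance catches values near zero, and a relative
    // tolerance scales with the magnitude of the operands.
    static boolean approxEquals(double a, double b, double eps) {
        double diff = Math.abs(a - b);
        if (diff <= eps) {
            return true;  // absolute tolerance, covers values near zero
        }
        return diff <= eps * Math.max(Math.abs(a), Math.abs(b));  // relative tolerance
    }

    public static void main(String[] args) {
        // The flaky case from the report: 0.0 vs 1.0E-5
        System.out.println(approxEquals(0.0, 1.0E-5, 1.0E-4));  // true: within 1e-4
        System.out.println(approxEquals(0.0, 1.0E-5, 1.0E-6));  // false: outside 1e-6
    }
}
```

Picking `eps` is the real judgment call: it should be loose enough to absorb platform-dependent rounding in Spark aggregates, but tight enough to still catch genuine wrong answers.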
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143891060

--- Diff: integration/spark2/src/test/scala/org/apache/carbondata/spark/testsuite/booleantype/BooleanDataTypesLoadTest.scala ---
@@ -0,0 +1,655 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.testsuite.booleantype
+
+import java.io.File
+
+import org.apache.spark.sql.Row
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+/**
+ * Created by root on 9/17/17.
+ */
+class BooleanDataTypesLoadTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+  val rootPath = new File(this.getClass.getResource("/").getPath ++ "../../../..").getCanonicalPath
+
--- End diff --

OK, I added some test cases that use unsafe in BooleanDataTypesLoadTest.scala and BooleanDataTypesBigFileTest.scala

---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890886

--- Diff: processing/src/main/java/org/apache/carbondata/processing/store/TablePage.java ---
@@ -187,6 +188,12 @@ private void convertToColumnarAndAddToPages(int rowId, CarbonRow row, byte[] mdk
         value != null) {
       value = ((Decimal) value).toJavaBigDecimal();
     }
+    if (measurePages[i].getColumnSpec().getSchemaDataType()
+        == DataType.BOOLEAN && value != null) {
--- End diff --

Ok, I have changed it.

---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890910

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java ---
@@ -361,6 +360,10 @@ public void putData(int rowId, Object value) {
       return;
     }
     switch (dataType) {
+      case BOOLEAN:
+        putByte(rowId, (byte) value);
--- End diff --

I have deleted it.

---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890846

--- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/bool/BooleanConvert.java ---
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.page.encoding.bool;
+
+/**
+ * convert tools for boolean data type
+ */
+public class BooleanConvert {
+
+  public static final byte trueValue = 1;
+  public static final byte falseValue = 0;
+  /**
+   * convert boolean to byte
+   *
+   * @param data data of boolean data type
+   * @return byte type data by convert
+   */
+  public static byte boolean2Byte(boolean data) {
+    return data ? (byte) 1 : (byte) 0;
--- End diff --

Ok.

---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890823 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java --- @@ -215,7 +215,7 @@ private void initMeasureBlockIndexes() { } else { // specific for restructure case where default values need to be filled pageNumbers = blockChunkHolder.getDataBlock().numberOfPages(); -numberOfRows = new int[] { blockChunkHolder.getDataBlock().nodeSize() }; +numberOfRows = new int[] { blockChunkHolder.getDataBlock().nodeSize()}; --- End diff -- Ok. ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890833 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/bool/BooleanConvert.java --- @@ -0,0 +1,63 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.core.datastore.page.encoding.bool; + +/** + * convert tools for boolean data type + */ +public class BooleanConvert { + + public static final byte trueValue = 1; + public static final byte falseValue = 0; + /** + * convert boolean to byte + * + * @param data data of boolean data type + * @return byte type data by convert + */ + public static byte boolean2Byte(boolean data) { +return data ? (byte) 1 : (byte) 0; + } + + /** + * convert byte to boolean + * + * @param data byte type data + * @return boolean type data + */ + public static boolean byte2Boolean(int data) { +return data == 1 ? true : false; --- End diff -- Ok. ---
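The simplification the reviewers are nudging toward in `boolean2Byte` and `byte2Boolean` can be sketched as follows. This is a hypothetical, standalone version of the converter, not the merged CarbonData code — the upper-case constant names are an assumption; the point is that the named constants should be reused and the ternaries (`data ? (byte) 1 : (byte) 0`, `data == 1 ? true : false`) are redundant:

```java
// Hypothetical simplified BooleanConvert reflecting the review feedback:
// reuse the named constants and drop the redundant ternaries.
class BooleanConvert {

  static final byte TRUE_VALUE = 1;
  static final byte FALSE_VALUE = 0;

  // convert a boolean to its single-byte storage representation
  static byte boolean2Byte(boolean data) {
    return data ? TRUE_VALUE : FALSE_VALUE;
  }

  // convert a stored byte back to boolean; the comparison itself
  // already yields a boolean, so `data == 1 ? true : false` is unnecessary
  static boolean byte2Boolean(int data) {
    return data == TRUE_VALUE;
  }
}
```

With this shape, `byte2Boolean(boolean2Byte(b)) == b` holds for both values, which is the round-trip property the encoder/decoder pair relies on.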
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890779 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/EncodingFactory.java --- @@ -38,11 +38,7 @@ import org.apache.carbondata.core.util.CarbonUtil; import org.apache.carbondata.format.Encoding; -import static org.apache.carbondata.format.Encoding.ADAPTIVE_DELTA_INTEGRAL; -import static org.apache.carbondata.format.Encoding.ADAPTIVE_FLOATING; -import static org.apache.carbondata.format.Encoding.ADAPTIVE_INTEGRAL; -import static org.apache.carbondata.format.Encoding.DIRECT_COMPRESS; -import static org.apache.carbondata.format.Encoding.RLE_INTEGRAL; +import static org.apache.carbondata.format.Encoding.*; --- End diff -- I also fixed a similar problem in ColumnPage. ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890810 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/filter/executer/RowLevelFilterExecuterImpl.java --- @@ -203,7 +203,7 @@ private void initMeasureBlockIndexes() { } else { // specific for restructure case where default values need to be filled pageNumbers = blockChunkHolder.getDataBlock().numberOfPages(); -numberOfRows = new int[] { blockChunkHolder.getDataBlock().nodeSize() }; +numberOfRows = new int[] { blockChunkHolder.getDataBlock().nodeSize()}; --- End diff -- Ok. ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890703 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/EncodingFactory.java --- @@ -38,11 +38,7 @@ import org.apache.carbondata.core.util.CarbonUtil; import org.apache.carbondata.format.Encoding; -import static org.apache.carbondata.format.Encoding.ADAPTIVE_DELTA_INTEGRAL; -import static org.apache.carbondata.format.Encoding.ADAPTIVE_FLOATING; -import static org.apache.carbondata.format.Encoding.ADAPTIVE_INTEGRAL; -import static org.apache.carbondata.format.Encoding.DIRECT_COMPRESS; -import static org.apache.carbondata.format.Encoding.RLE_INTEGRAL; +import static org.apache.carbondata.format.Encoding.*; --- End diff -- Ok, I have changed it. ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890685 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java --- @@ -530,6 +546,11 @@ private void putNull(int rowId) { public abstract byte[] getShortIntPage(); /** + * Get boolean value page + */ + public abstract byte[] getBooleanPage(); --- End diff -- Ok, I have changed it. ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890419 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java --- @@ -485,6 +496,11 @@ private void putNull(int rowId) { public abstract int getShortInt(int rowId); /** + * Get boolean value at rowId + */ + public abstract boolean getBoolean(int rowId); --- End diff -- ok, I have changed it ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890454 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java --- @@ -321,6 +315,11 @@ private static ColumnPage newLVBytesPage(TableSpec.ColumnSpec columnSpec, public abstract void setShortIntPage(byte[] shortIntData); /** + * Set boolean values to page + */ + public abstract void setBooleanPage(byte[] booleanData); --- End diff -- ok, I have removed it ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890406 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java --- @@ -321,6 +315,11 @@ private static ColumnPage newLVBytesPage(TableSpec.ColumnSpec columnSpec, public abstract void setShortIntPage(byte[] shortIntData); /** + * Set boolean values to page + */ + public abstract void setBooleanPage(byte[] booleanData); --- End diff -- ok, I have changed it ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890387 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java --- @@ -436,6 +439,11 @@ public void putData(int rowId, Object value) { public abstract void putShortInt(int rowId, int value); /** + * Set boolean value at rowId + */ + public abstract void putBoolean(int rowId, boolean value); --- End diff -- ok, I have changed it ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890342 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/bool/BooleanEncoderMeta.java --- @@ -0,0 +1,40 @@ +package org.apache.carbondata.core.datastore.page.encoding.bool; + +import java.io.DataInput; +import java.io.DataOutput; +import java.io.IOException; + +import org.apache.carbondata.core.datastore.TableSpec; +import org.apache.carbondata.core.datastore.page.encoding.ColumnPageEncoderMeta; +import org.apache.carbondata.core.datastore.page.statistics.SimpleStatsResult; +import org.apache.carbondata.core.metadata.datatype.DataType; +import org.apache.carbondata.core.metadata.schema.table.Writable; + +public class BooleanEncoderMeta extends ColumnPageEncoderMeta implements Writable { + private String compressorName; + + public BooleanEncoderMeta() { + } + + public BooleanEncoderMeta(TableSpec.ColumnSpec columnSpec, DataType storeDataType, +SimpleStatsResult stats, String compressorName) { +super(columnSpec,storeDataType,stats,compressorName); +this.compressorName = compressorName; --- End diff -- ok ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890311 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java --- @@ -271,6 +267,18 @@ private static ColumnPage newIntPage(TableSpec.ColumnSpec columnSpec, int[] intD return columnPage; } + private static ColumnPage newBooleanPage(TableSpec.ColumnSpec columnSpec, byte[] booleanData) { +ColumnPage columnPage = createPage(columnSpec, BOOLEAN, booleanData.length); --- End diff -- Ok, I used newBytePage to replace newBooleanPage. ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890228 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java --- @@ -187,6 +179,7 @@ public static ColumnPage newPage(TableSpec.ColumnSpec columnSpec, DataType dataT case BYTE: case SHORT: case SHORT_INT: +case BOOLEAN: --- End diff -- ok ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890183 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/bool/BooleanConvert.java --- @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.core.datastore.page.encoding.bool; + +/** + * convert tools for boolean data type + */ +public class BooleanConvert { --- End diff -- I think it should be kept; there are some places that invoke it. ---
[GitHub] carbondata pull request #1362: [CARBONDATA-1444] Support Boolean data type
Github user xubo245 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1362#discussion_r143890027 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/EncodingFactory.java --- @@ -92,6 +90,10 @@ public ColumnPageDecoder createDecoder(List encodings, List
[GitHub] carbondata issue #1398: [CARBONDATA-1537] Fixed version compatabilty issues ...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1398 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1048/ ---
[GitHub] carbondata issue #1398: [CARBONDATA-1537] Fixed version compatabilty issues ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1398 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/418/ ---
[GitHub] carbondata issue #1398: [CARBONDATA-1537] Fixed version compatabilty issues ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1398 Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/294/ ---
[GitHub] carbondata pull request #1398: [CARBONDATA-1537] Fixed version compatabilty ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1398#discussion_r143816326 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.scala --- @@ -43,7 +43,7 @@ class CarbonSession(@transient val sc: SparkContext, } @transient - override private[sql] lazy val sessionState: SessionState = new CarbonSessionState(this) + override lazy val sessionState: SessionState = new CarbonSessionState(this) --- End diff -- Yes, it is required for test case I added. ---
[GitHub] carbondata pull request #1398: [CARBONDATA-1537] Fixed version compatabilty ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1398#discussion_r143816210 --- Diff: integration/spark-common-cluster-test/src/test/scala/org/apache/carbondata/cluster/sdv/generated/CarbonV1toV3CompatabilityTestCase.scala --- @@ -0,0 +1,87 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.cluster.sdv.generated + +import org.apache.spark.sql.common.util.QueryTest +import org.apache.spark.sql.hive.CarbonSessionState +import org.apache.spark.sql.test.TestQueryExecutor +import org.apache.spark.sql.{CarbonEnv, CarbonSession, Row, SparkSession} +import org.scalatest.BeforeAndAfterAll + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties + +/** + * V1 to V3 compatability test. This test has to be at last --- End diff -- Yes, I will add tests for V2 to V3 in another PR, because I need to verify the compatibility of V2 to V3 first. ---
[GitHub] carbondata pull request #1398: [CARBONDATA-1537] Fixed version compatabilty ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1398#discussion_r143815996 --- Diff: core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverter.java --- @@ -123,6 +123,25 @@ private BlockletInfo getBlockletInfo( } @Override public List getSchema(TableBlockInfo tableBlockInfo) throws IOException { -return null; +FileHolder fileReader = null; --- End diff -- Yes, it is part of this PR; we need to get the ColumnSchema. ---
[GitHub] carbondata pull request #1398: [CARBONDATA-1537] Fixed version compatabilty ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1398#discussion_r143815676 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -2070,6 +1994,20 @@ public static void dropDatabaseDirectory(String dbName, String storePath) } } + public static DataType getDataType(char type) { --- End diff -- ok ---
[GitHub] carbondata issue #1402: [CARBONDATA-1539] Change data type from enum to clas...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1402 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1047/ ---
[GitHub] carbondata issue #1402: [CARBONDATA-1539] Change data type from enum to clas...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1402 Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/293/ ---
[GitHub] carbondata issue #1359: [CARBONDATA-1480]Min Max Index Example for DataMap
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1359 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/416/ ---
[GitHub] carbondata pull request #1386: [CARBONDATA-1513] bad-record for complex data...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1386#discussion_r143782234 --- Diff: integration/spark-common-test/src/test/resources/complexdatawithheader.csv --- @@ -0,0 +1,101 @@ + deviceInformationId,channelsId,ROMSize,purchasedate,mobile,MAC,locationinfo,proddate,gamePointId,contractNumber --- End diff -- Is it required to add 100 rows? Could fewer rows work for the test case? ---
[GitHub] carbondata issue #1359: [CARBONDATA-1480]Min Max Index Example for DataMap
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1359 Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/292/ ---
[GitHub] carbondata pull request #1404: [CARBONDATA-1541] There are some errors when ...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1404#discussion_r143780104 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/LoadDataWithBadRecords.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.carbondata.integration.spark.testsuite.dataload + +import java.io.File + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.spark.sql.Row +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach} + +class LoadDataWithBadRecordsTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll { + override def beforeEach(): Unit = { +sql("drop table if exists sales") +sql("drop table if exists int_table") +sql("drop table if exists boolean_table") +sql( + """CREATE TABLE IF NOT EXISTS sales(ID BigInt, date Timestamp, country String, + actual_price Double, Quantity int, sold_price Decimal(19,2)) STORED BY 'carbondata'""") +sql("CREATE TABLE if not exists int_table(intField INT) STORED BY 'carbondata'") +sql("CREATE TABLE if not exists boolean_table(booleanField INT) STORED BY 'carbondata'") + } + + override def beforeAll(): Unit = { +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd") + .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") + .addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC, +new File("./target/test/badRecords") + .getCanonicalPath) + } + + override def afterAll(): Unit = { +sql("drop table if exists sales") +sql("drop table if exists int_table") +sql("drop table if exists boolean_table") +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT) + .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT) + .addProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC, CarbonCommonConstants.CARBON_BADRECORDS_LOC_DEFAULT_VAL) + } + + val rootPath = new File(this.getClass.getResource("/").getPath ++ "../../../../").getCanonicalPath + + val path = 
s"$rootPath/integration/spark-common-test/src/test/resources/badrecords/datasample.csv" + + test("The bad_records_action: FORCE") { +sql("LOAD DATA local inpath '" + path + "' INTO TABLE sales OPTIONS" + --- End diff -- use ``` s""" """ ``` instead of `+` ---
[GitHub] carbondata pull request #1408: [CARBONDATA-1568] Optimize annotation of code
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1408#discussion_r143777860 --- Diff: integration/spark/src/main/scala/org/apache/carbondata/spark/CarbonDataFrameWriter.scala --- @@ -59,8 +59,8 @@ class CarbonDataFrameWriter(val dataFrame: DataFrame) { /** * Firstly, saving DataFrame to CSV files * Secondly, load CSV files - * @param options - * @param cc + * @param options CarbonOption --- End diff -- add more explain for parameters. ---
[GitHub] carbondata issue #1359: [CARBONDATA-1480]Min Max Index Example for DataMap
Github user sounakr commented on the issue: https://github.com/apache/carbondata/pull/1359 Retest this please. ---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1361#discussion_r143777369 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortParameterTest.scala --- @@ -0,0 +1,298 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the"License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an"AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.carbondata.spark.testsuite.datacompaction + +import java.io.{File, FilenameFilter} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.spark.sql.Row +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach} + +class CompactionSupportGlobalSortParameterTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll { + val filePath: String = s"$resourcesPath/globalsort" + val file1: String = resourcesPath + "/globalsort/sample1.csv" + val file2: String = resourcesPath + "/globalsort/sample2.csv" + val file3: String = resourcesPath + "/globalsort/sample3.csv" + + override def beforeEach { +resetConf +sql("DROP TABLE IF EXISTS compaction_globalsort") +sql( + """ +| CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT) +| STORED BY 'org.apache.carbondata.format' +| TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT') + """.stripMargin) + +sql("DROP TABLE IF EXISTS carbon_localsort") +sql( + """ +| CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT) +| STORED BY 'org.apache.carbondata.format' + """.stripMargin) + } + + override def afterEach { +sql("DROP TABLE IF EXISTS compaction_globalsort") +sql("DROP TABLE IF EXISTS carbon_localsort") + } + + test("ENABLE_AUTO_LOAD_MERGE: false") { + CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, "false") +for (i <- 0 until 2) { + sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort") + sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort") + sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort") + + sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')") + sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort 
OPTIONS('GLOBAL_SORT_PARTITIONS'='2')") + sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('GLOBAL_SORT_PARTITIONS'='2')") +} +checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "global_sort") + +checkExistence(sql("DESCRIBE FORMATTED compaction_globalsort"), true, "city,name") + +sql("delete from table compaction_globalsort where SEGMENT.ID in (1,2,3)") +sql("delete from table carbon_localsort where SEGMENT.ID in (1,2,3)") +sql("ALTER TABLE compaction_globalsort COMPACT 'minor'") +checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), false, "Compacted") + +val segments = sql("SHOW SEGMENTS FOR TABLE compaction_globalsort") +val SegmentSequenceIds = segments.collect().map { each => (each.toSeq) (0) } +assert(!SegmentSequenceIds.contains("0.1")) +assert(SegmentSequenceIds.length == 6) + +checkAnswer(sql("SELECT COUNT(*) FROM compaction_globalsort"), Seq(Row(12))) + +checkAnswer(sql("SELECT * FROM compaction_globalsort"), + sql("SELECT * FROM carbon_localsort")) + +checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Success") +checkExistence(sql("SHOW SEGMENTS FOR TABLE compaction_globalsort"), true, "Marked for Delete") + CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_AUTO_LOAD_MERGE, +
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1361#discussion_r143776859 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortFunctionTest.scala --- @@ -0,0 +1,535 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the"License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an"AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.carbondata.spark.testsuite.datacompaction + +import java.io.{File, FilenameFilter} + +import org.apache.spark.sql.Row +import org.apache.spark.sql.test.util.QueryTest +import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties + +class CompactionSupportGlobalSortFunctionTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll { + val filePath: String = s"$resourcesPath/globalsort" + val file1: String = resourcesPath + "/globalsort/sample1.csv" + val file2: String = resourcesPath + "/globalsort/sample2.csv" + val file3: String = resourcesPath + "/globalsort/sample3.csv" + + override def beforeEach { +resetConf +sql("DROP TABLE IF EXISTS compaction_globalsort") +sql( + """ +| CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT) +| STORED BY 'org.apache.carbondata.format' +| TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT') + """.stripMargin) + +sql("DROP TABLE IF EXISTS carbon_localsort") +sql( + """ +| CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT) +| STORED BY 'org.apache.carbondata.format' + """.stripMargin) + } + + override def afterEach { +sql("DROP TABLE IF EXISTS compaction_globalsort") +sql("DROP TABLE IF EXISTS carbon_localsort") + } + + test("Compaction type: major") { +sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort") +sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort") +sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort") + +sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort") +sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort") +sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort") + +sql("ALTER TABLE compaction_globalsort COMPACT 'MAJOR'") --- End diff -- can you also configure the parameter for major compaction 
---
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1361#discussion_r143775897 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.datacompaction
+
+import java.io.{File, PrintWriter}
+
+import scala.util.Random
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+
+class CompactionSupportGlobalSortBigFileTest extends QueryTest with BeforeAndAfterEach with BeforeAndAfterAll {
+  val file1 = resourcesPath + "/compaction/fil1.csv"
+  val file2 = resourcesPath + "/compaction/fil2.csv"
+  val file3 = resourcesPath + "/compaction/fil3.csv"
+  val file4 = resourcesPath + "/compaction/fil4.csv"
+  val file5 = resourcesPath + "/compaction/fil5.csv"
+
+  override protected def beforeAll(): Unit = {
+    resetConf("10")
+    // n should be about 500 of reset if size is default 1024
+    val n = 15
+    CompactionSupportGlobalSortBigFileTest.createFile(file1, n, 0)
+    CompactionSupportGlobalSortBigFileTest.createFile(file2, n * 4, n)
+    CompactionSupportGlobalSortBigFileTest.createFile(file3, n * 3, n * 5)
+    CompactionSupportGlobalSortBigFileTest.createFile(file4, n * 2, n * 8)
+    CompactionSupportGlobalSortBigFileTest.createFile(file5, n * 2, n * 13)
+  }
+
+  override protected def afterAll(): Unit = {
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file1)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file2)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file3)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file4)
+    CompactionSupportGlobalSortBigFileTest.deleteFile(file5)
+    resetConf(CarbonCommonConstants.DEFAULT_MAJOR_COMPACTION_SIZE)
+  }
+
+  override def beforeEach {
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql(
+      """
+        | CREATE TABLE compaction_globalsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+        | TBLPROPERTIES('SORT_COLUMNS'='city,name', 'SORT_SCOPE'='GLOBAL_SORT')
+      """.stripMargin)
+
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+    sql(
+      """
+        | CREATE TABLE carbon_localsort(id INT, name STRING, city STRING, age INT)
+        | STORED BY 'org.apache.carbondata.format'
+      """.stripMargin)
+  }
+
+  override def afterEach {
+    sql("DROP TABLE IF EXISTS compaction_globalsort")
+    sql("DROP TABLE IF EXISTS carbon_localsort")
+  }
+
+  test("Compaction major: segments size is bigger than default compaction size") {
+    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file5' INTO TABLE carbon_localsort OPTIONS('header'='false')")
+
+    sql(s"LOAD DATA LOCAL INPATH '$file1' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file3' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH '$file4' INTO TABLE compaction_globalsort OPTIONS('header'='false')")
+    sql(s"LOAD DATA LOCAL INPATH
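The quoted hunk is cut off before the companion object, so the `createFile`/`deleteFile` helpers the fixtures call are not shown. Below is a minimal sketch of what such a CSV-generating helper might look like, written in Java so it is self-contained; the row layout (id,name,city,age) is assumed from the CREATE TABLE statements above, and every name here is hypothetical rather than the PR's actual code:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical stand-in for the truncated companion-object helpers.
public class BigFileHelper {

  // Writes `size` CSV rows whose ids start at `offset`, with no header line,
  // matching the OPTIONS('header'='false') used by the LOAD DATA statements.
  // Column layout (id,name,city,age) is assumed from the table schema.
  static void createFile(String fileName, int size, int offset) throws IOException {
    try (PrintWriter writer = new PrintWriter(new FileWriter(fileName))) {
      for (int i = offset; i < offset + size; i++) {
        writer.println(i + ",name_" + i + ",city_" + i + "," + (i % 100));
      }
    }
  }

  static boolean deleteFile(String fileName) {
    return new File(fileName).delete();
  }

  public static void main(String[] args) throws IOException {
    // Mirrors beforeAll: n = 15 rows for file1, ids starting at 0.
    createFile("fil1.csv", 15, 0);
    List<String> lines = Files.readAllLines(Paths.get("fil1.csv"));
    System.out.println(lines.size());   // 15
    System.out.println(lines.get(0));   // 0,name_0,city_0,0
    deleteFile("fil1.csv");
  }
}
```

The non-overlapping offsets in `beforeAll` (0, n, n*5, n*8, n*13) keep ids distinct across the five files, which makes row counts after compaction easy to assert.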
[GitHub] carbondata pull request #1361: [CARBONDATA-1481] Add test cases for compacti...
Github user jackylk commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1361#discussion_r143775848 --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/CompactionSupportGlobalSortBigFileTest.scala ---
[GitHub] carbondata issue #1359: [CARBONDATA-1480]Min Max Index Example for DataMap
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1359 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1045/ ---
[GitHub] carbondata issue #1359: [CARBONDATA-1480]Min Max Index Example for DataMap
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1359 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/415/ ---
[GitHub] carbondata issue #1359: [CARBONDATA-1480]Min Max Index Example for DataMap
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1359 Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/291/ ---
[GitHub] carbondata issue #1409: [WIP][CARBODNATA-1377] support hive partition
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1409 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1044/ ---
[GitHub] carbondata issue #1408: [CARBONDATA-1568] Optimize annotation of code
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1408 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1043/ ---
[GitHub] carbondata issue #1409: [WIP][CARBODNATA-1377] support hive partition
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1409 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/414/ ---
[GitHub] carbondata issue #1409: [WIP][CARBODNATA-1377] support hive partition
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1409 Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/290/ ---
[GitHub] carbondata issue #1409: [CARBODNATA-1377] support hive partition
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1409 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/413/ ---
[GitHub] carbondata pull request #1409: [CARBODNATA-1377] support hive partition
GitHub user cenyuhai opened a pull request: https://github.com/apache/carbondata/pull/1409 [CARBODNATA-1377] support hive partition

Support hive partition:
```sql
create table if not exists temp.hash_partition_table(col_A String)
partitioned by (col_B Long)
stored by 'carbondata'
tblproperties('partition_type'='HIVE')
```
```sql
alter table rtestpartition set fileformat carbondata;
```
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cenyuhai/incubator-carbondata CARBONDATA-1377

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1409.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1409

commit 95c0491eb0a16fa4ec77a2960c80a5e333a1b2b1
Author: CenYuhai
Date: 2017-10-10T14:18:04Z

    support hive partition

---
[jira] [Assigned] (CARBONDATA-1377) Implement hive partition
[ https://issues.apache.org/jira/browse/CARBONDATA-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai reassigned CARBONDATA-1377:
Assignee: cen yuhai

> Implement hive partition
>
>                 Key: CARBONDATA-1377
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1377
>             Project: CarbonData
>          Issue Type: Sub-task
>          Components: hive-integration
>            Reporter: cen yuhai
>            Assignee: cen yuhai
>
> The current partition implementation is database-like. If I want to use carbon to
> replace parquet at scale, we must make the usage of carbon the same as parquet/orc.
> Hive users should be able to switch to CarbonData for all the new partitions
> being created. Hive supports specifying the format at partition level.
> Example:
> {code:sql}
> create table rtestpartition (col1 string, col2 int) partitioned by (col3 int)
> stored as parquet;
> insert into rtestpartition partition(col3=10) select "pqt", 1;
> insert into rtestpartition partition(col3=20) select "pqt", 1;
> insert into rtestpartition partition(col3=10) select "pqt", 1;
> insert into rtestpartition partition(col3=20) select "pqt", 1;
> {code}
> {noformat}
> hive creates folders like
> /db1/table1/col3=10/0001_file.pqt
> /db1/table1/col3=10/0002_file.pqt
> /db1/table1/col3=20/0001_file.pqt
> /db1/table1/col3=20/0002_file.pqt
> {noformat}
> Hive users can now change new partitions to CarbonData; however, old
> partitions will still be parquet and require migration scripts to move to
> CarbonData format.
> {code:sql}
> alter table rtestpartition set fileformat carbondata;
> insert into rtestpartition partition(col3=30) select "cdata", 1;
> insert into rtestpartition partition(col3=40) select "cdata", 1;
> {code}
> {noformat}
> hive creates folders like
> /db1/table1/col3=10/0001_file.pqt
> /db1/table1/col3=10/0002_file.pqt
> /db1/table1/col3=20/0001_file.pqt
> /db1/table1/col3=20/0002_file.pqt
> /db1/table1/col3=30/
> /db1/table1/col3=40/
> {noformat}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
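Putting the issue's two snippets together, the intended migration flow reads end to end as follows (table and column names are the issue's own examples; this is a consolidation of the quoted statements, not new syntax):

```sql
-- Old partitions remain parquet; no up-front data migration is needed.
create table rtestpartition (col1 string, col2 int)
  partitioned by (col3 int) stored as parquet;
insert into rtestpartition partition(col3=10) select "pqt", 1;
insert into rtestpartition partition(col3=20) select "pqt", 1;

-- Switch the table's default format: only partitions created afterwards
-- (col3=30, col3=40) are written as CarbonData.
alter table rtestpartition set fileformat carbondata;
insert into rtestpartition partition(col3=30) select "cdata", 1;
insert into rtestpartition partition(col3=40) select "cdata", 1;
```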
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user sounakr commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143741467 --- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java ---
@@ -219,4 +225,27 @@ public DataMapMeta getMeta() {
     // TODO: pass SORT_COLUMNS into this class
     return null;
   }
+
+  @Override public SegmentProperties getSegmentProperties(String segmentId) throws IOException {
+    SegmentProperties segmentProperties = null; --- End diff -- Done. ---
[GitHub] carbondata issue #1407: [CARBONDATA-1549] CarbonProperties should be default...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1407 SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1042/ ---
[GitHub] carbondata issue #1408: [CARBONDATA-1568] Optimize annotation of code
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1408 Build Failed with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/289/ ---
[GitHub] carbondata issue #1359: [CARBONDATA-1480]Min Max Index Example for DataMap
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1359 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1041/ ---
[GitHub] carbondata issue #1408: [CARBONDATA-1568] Optimize annotation of code
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1408 Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/412/ ---
[GitHub] carbondata pull request #1408: [CARBONDATA-1568] Optimize annotation of code
GitHub user xubo245 opened a pull request: https://github.com/apache/carbondata/pull/1408 [CARBONDATA-1568] Optimize annotation of code

IDEA's "inspect code" reported some improper places in the code's annotations (comments). We optimize the annotations:
- fix an error among the Javadoc issues
- improve the wording of several annotations

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xubo245/carbondata docsOptimize

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1408.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1408

commit 5fcfcf5888e0fc1d35a8f211b10090d0d9d6c39f
Author: xubo245 <601450...@qq.com>
Date: 2017-10-10T12:59:46Z

    [CARBONDATA-1568] Optimize annotation of code

---
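As a hedged illustration (not taken from this PR's diff) of the kind of annotation problem an IDE inspection flags — a @param tag that does not match the method signature and an empty @return description — sketched in Java:

```java
// Illustrative only; this class is hypothetical and not part of CarbonData.
public class JavadocFixExample {

  /**
   * BEFORE (typical flagged form):
   *   @param segmentid  -- wrong case, does not match the parameter name
   *   @return           -- empty description
   *
   * AFTER: tags match the signature and describe the contract.
   *
   * @param segmentId id of the segment whose index files are counted
   * @return number of index files found for the segment
   */
  static int countIndexFiles(String segmentId) {
    // Stub body so the example compiles; the real logic lives in CarbonData.
    return segmentId.isEmpty() ? 0 : 1;
  }

  public static void main(String[] args) {
    System.out.println(countIndexFiles("0"));   // 1
  }
}
```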
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user sounakr commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143721417 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/MinMaxDataMapFactory.java ---
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.examples;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.cache.Cache;
+import org.apache.carbondata.core.cache.CacheProvider;
+import org.apache.carbondata.core.cache.CacheType;
+import org.apache.carbondata.core.datamap.DataMapDistributable;
+import org.apache.carbondata.core.datamap.DataMapMeta;
+import org.apache.carbondata.core.datamap.TableDataMap;
+import org.apache.carbondata.core.datamap.dev.DataMap;
+import org.apache.carbondata.core.datamap.dev.DataMapFactory;
+import org.apache.carbondata.core.datamap.dev.DataMapWriter;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFileFilter;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.events.ChangeEvent;
+import org.apache.carbondata.core.indexstore.TableBlockIndexUniqueIdentifier;
+import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMap;
+import org.apache.carbondata.core.indexstore.schema.FilterType;
+import org.apache.carbondata.core.memory.MemoryException;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+
+
+/**
+ * Min Max DataMap Factory
+ */
+public class MinMaxDataMapFactory implements DataMapFactory {
+
+  private AbsoluteTableIdentifier identifier;
+
+  // segmentId -> list of index file
+  private Map<String, List<TableBlockIndexUniqueIdentifier>> segmentMap = new HashMap<>();
+
+  @Override
+  public void init(AbsoluteTableIdentifier identifier, String dataMapName) {
+    this.identifier = identifier;
+  }
+
+  /**
+   * createWriter will return the MinMaxDataWriter.
+   * @param segmentId
+   * @return
+   */
+  @Override
+  public DataMapWriter createWriter(String segmentId) {
+    return new MinMaxDataWriter();
+  }
+
+  /**
+   * getDataMaps Factory method Initializes the Min Max Data Map and returns.
+   * @param segmentId
+   * @return
+   * @throws IOException
+   */
+  @Override
+  public List<DataMap> getDataMaps(String segmentId) throws IOException {
+    List<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
+        segmentMap.get(segmentId);
+    List<DataMap> dataMapList = new ArrayList<>();
+    if (tableBlockIndexUniqueIdentifiers == null) {
+      tableBlockIndexUniqueIdentifiers = new ArrayList<>();
+      CarbonFile[] listFiles = getCarbonIndexFiles(segmentId);
+      for (int i = 0; i < listFiles.length; i++) {
+        tableBlockIndexUniqueIdentifiers.add(
+            new TableBlockIndexUniqueIdentifier(identifier, segmentId, listFiles[i].getName()));
+      }
+    }
+    // Form a dataMap of Type MinMaxDataMap.
+    MinMaxDataMap dataMap = new MinMaxDataMap();
+    try {
+      dataMap.init(tableBlockIndexUniqueIdentifiers.get(0).getFilePath());
+    } catch (MemoryException ex) {
+
+    }
+    dataMapList.add(dataMap);
+    return dataMapList;
+  }
+
+  /**
+   * Routine to retrieve the carbonIndex.
+   * @param segmentId
+   * @return
+   */
+  private CarbonFile[] getCarbonIndexFiles(String segmentId) { --- End diff -- removed ---
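One detail in the quoted `getDataMaps` worth noting: the catch block for `MemoryException` is empty, so an init failure would be silently swallowed. A minimal, self-contained sketch of the usual wrap-and-rethrow fix — all names here are hypothetical, and `MemoryException` is a local stand-in for the CarbonData class:

```java
import java.io.IOException;

public class RethrowSketch {
  // Stand-in for org.apache.carbondata.core.memory.MemoryException,
  // so the sketch compiles on its own.
  static class MemoryException extends Exception {
    MemoryException(String msg) { super(msg); }
  }

  // Stand-in for dataMap.init(filePath) from the quoted code.
  static void init(String filePath) throws MemoryException {
    if (filePath == null) throw new MemoryException("no unsafe memory available");
  }

  // The quoted getDataMaps catches MemoryException with an empty body;
  // wrapping and rethrowing keeps the failure visible while honoring the
  // method's existing `throws IOException` contract.
  static void initOrFail(String filePath) throws IOException {
    try {
      init(filePath);
    } catch (MemoryException ex) {
      throw new IOException("Failed to initialize MinMaxDataMap", ex);
    }
  }

  public static void main(String[] args) throws IOException {
    initOrFail("Fact/Part0/Segment_0");
    System.out.println("ok");
  }
}
```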
[GitHub] carbondata issue #1404: [CARBONDATA-1541] There are some errors when bad_rec...
Github user xubo245 commented on the issue: https://github.com/apache/carbondata/pull/1404 Please review it @jackylk ---
[GitHub] carbondata issue #1404: [CARBONDATA-1541] There are some errors when bad_rec...
Github user ravipesala commented on the issue: https://github.com/apache/carbondata/pull/1404 SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1040/ ---
[jira] [Assigned] (CARBONDATA-1500) Support Alter table to add and remove Array column
[ https://issues.apache.org/jira/browse/CARBONDATA-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhatchayani reassigned CARBONDATA-1500:
Assignee: dhatchayani

> Support Alter table to add and remove Array column
>
>                 Key: CARBONDATA-1500
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1500
>             Project: CarbonData
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Venkata Ramana G
>            Assignee: dhatchayani
>            Priority: Minor
>
> Implement DDL; requires default value handling.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] carbondata pull request #1398: [CARBONDATA-1537] Fixed version compatabilty ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1398#discussion_r143715262 --- Diff: core/src/main/java/org/apache/carbondata/core/metadata/blocklet/BlockletInfo.java ---
@@ -238,6 +278,20 @@ public void setNumberOfPages(int numberOfPages) {
     for (int i = 0; i < measureChunkOffsetsSize; i++) {
       measureChunksLength.add(input.readInt());
     }
-
+    // Deserialize datachunks as well for older versions like V1 and V2 --- End diff -- ok ---
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user sounakr commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143714831 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/MinMaxDataMap.java ---
@@ -0,0 +1,160 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.examples;
+
+import java.io.BufferedReader;
+import java.io.DataInputStream;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.ArrayList;
+import java.util.BitSet;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.cache.Cacheable;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datamap.dev.DataMap;
+import org.apache.carbondata.core.datastore.IndexKey;
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFileFilter;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.fileoperations.AtomicFileOperations;
+import org.apache.carbondata.core.fileoperations.AtomicFileOperationsImpl;
+import org.apache.carbondata.core.indexstore.Blocklet;
+import org.apache.carbondata.core.memory.MemoryException;
+import org.apache.carbondata.core.scan.filter.FilterUtil;
+import org.apache.carbondata.core.scan.filter.executer.FilterExecuter;
+import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+import org.apache.carbondata.core.util.CarbonUtil;
+
+import com.google.gson.Gson;
+
+/**
+ * Datamap implementation for min max blocklet.
+ */
+public class MinMaxDataMap implements DataMap, Cacheable { --- End diff -- Done ---
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user sounakr commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143714764 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/MinMaxDataMap.java ---
+public class MinMaxDataMap implements DataMap, Cacheable {
+
+  private static final LogService LOGGER =
+      LogServiceFactory.getLogService(MinMaxDataMap.class.getName());
+
+  public static final String NAME = "clustered.minmax.btree.blocklet";
+
+  private String filePath;
+
+  private MinMaxIndexBlockDetails[] readMinMaxDataMap;
+
+  @Override public void init(String filePath) throws MemoryException, IOException {
+    this.filePath = filePath;
+    CarbonFile[] listFiles = getCarbonIndexFiles(filePath, "0");
+    for (int i = 0; i < listFiles.length; i++) {
+      readMinMaxDataMap = readJson(listFiles[i].getPath());
+    }
+  }
+
+  private CarbonFile[] getCarbonIndexFiles(String filePath, String segmentId) { --- End diff -- Done. ---
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user sounakr commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143714171 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/MinMaxBlockletComparator.java ---
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.examples;
+
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+import org.apache.carbondata.core.util.ByteUtil;
+
+
+/**
+ * Data map comparator
+ */
+public class MinMaxBlockletComparator implements Comparator { --- End diff -- Removed ---
[GitHub] carbondata issue #1407: [CARBONDATA-1549] CarbonProperties should be default...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1407 Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/288/ ---
[GitHub] carbondata issue #1407: [CARBONDATA-1549] CarbonProperties should be default...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1407 Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/411/ ---
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143711091 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/MinMaxDataMap.java ---
+public class MinMaxDataMap implements DataMap, Cacheable { --- End diff -- don't implement Cacheable ---
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143711222

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/MinMaxDataMapFactory.java ---
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.examples;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.core.cache.Cache;
+import org.apache.carbondata.core.cache.CacheProvider;
+import org.apache.carbondata.core.cache.CacheType;
+import org.apache.carbondata.core.datamap.DataMapDistributable;
+import org.apache.carbondata.core.datamap.DataMapMeta;
+import org.apache.carbondata.core.datamap.TableDataMap;
+import org.apache.carbondata.core.datamap.dev.DataMap;
+import org.apache.carbondata.core.datamap.dev.DataMapFactory;
+import org.apache.carbondata.core.datamap.dev.DataMapWriter;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFileFilter;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.events.ChangeEvent;
+import org.apache.carbondata.core.indexstore.TableBlockIndexUniqueIdentifier;
+import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMap;
+import org.apache.carbondata.core.indexstore.schema.FilterType;
+import org.apache.carbondata.core.memory.MemoryException;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+
+
+/**
+ * Min Max DataMap Factory
+ */
+public class MinMaxDataMapFactory implements DataMapFactory {
+
+  private AbsoluteTableIdentifier identifier;
+
+  // segmentId -> list of index files
+  private Map<String, List<TableBlockIndexUniqueIdentifier>> segmentMap = new HashMap<>();
+
+  @Override
+  public void init(AbsoluteTableIdentifier identifier, String dataMapName) {
+    this.identifier = identifier;
+  }
+
+  /**
+   * createWriter will return the MinMaxDataWriter.
+   * @param segmentId
+   * @return
+   */
+  @Override
+  public DataMapWriter createWriter(String segmentId) {
+    return new MinMaxDataWriter();
+  }
+
+  /**
+   * getDataMaps factory method: initializes the Min Max DataMap and returns it.
+   * @param segmentId
+   * @return
+   * @throws IOException
+   */
+  @Override
+  public List<DataMap> getDataMaps(String segmentId) throws IOException {
+    List<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
+        segmentMap.get(segmentId);
+    List<DataMap> dataMapList = new ArrayList<>();
+    if (tableBlockIndexUniqueIdentifiers == null) {
+      tableBlockIndexUniqueIdentifiers = new ArrayList<>();
+      CarbonFile[] listFiles = getCarbonIndexFiles(segmentId);
+      for (int i = 0; i < listFiles.length; i++) {
+        tableBlockIndexUniqueIdentifiers.add(
+            new TableBlockIndexUniqueIdentifier(identifier, segmentId, listFiles[i].getName()));
+      }
+    }
+    // Form a dataMap of type MinMaxDataMap.
+    MinMaxDataMap dataMap = new MinMaxDataMap();
+    try {
+      dataMap.init(tableBlockIndexUniqueIdentifiers.get(0).getFilePath());
+    } catch (MemoryException ex) {
+
+    }
+    dataMapList.add(dataMap);
+    return dataMapList;
+  }
+
+  /**
+   * Routine to retrieve the carbonIndex files.
+   * @param segmentId
+   * @return
+   */
+  private CarbonFile[] getCarbonIndexFiles(String segmentId) {
--- End diff --

Why is this method required?

---
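For readers following the diff, getDataMaps implements a lazy per-segment cache: it looks up segmentMap by segmentId and scans the segment for index files only on a miss. Below is a minimal, self-contained sketch of that pattern under assumed names (SegmentIndexCache and listIndexFiles are hypothetical stand-ins, not CarbonData APIs):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the lazy per-segment cache pattern in getDataMaps above.
// SegmentIndexCache and listIndexFiles are hypothetical names, not CarbonData APIs.
public class SegmentIndexCache {

  // segmentId -> names of the index files discovered for that segment
  private final Map<String, List<String>> segmentMap = new HashMap<>();

  /** Returns the cached index-file names for a segment, scanning only on first access. */
  public List<String> getIndexFiles(String segmentId) {
    List<String> files = segmentMap.get(segmentId);
    if (files == null) {
      // Cache miss: list the segment's index files and remember them.
      files = new ArrayList<>();
      for (String name : listIndexFiles(segmentId)) {
        files.add(name);
      }
      segmentMap.put(segmentId, files);
    }
    return files;
  }

  // Stand-in for the CarbonFile directory scan in the original diff.
  private String[] listIndexFiles(String segmentId) {
    return new String[] { segmentId + "_0.carbonindex" };
  }
}
```

Note that nothing here evicts or refreshes an entry when a segment changes; a production implementation would also need an invalidation path.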
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143710556

--- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/MinMaxBlockletComparator.java ---
@@ -0,0 +1,134 @@
+/* ... standard Apache License 2.0 header ... */
+package org.apache.carbondata.examples;
+
+import java.nio.ByteBuffer;
+import java.util.Comparator;
+
+import org.apache.carbondata.core.util.ByteUtil;
+
+
+/**
+ * Data map comparator
+ */
+public class MinMaxBlockletComparator implements Comparator<byte[]> {
--- End diff --

I think this class is not required.

---
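For context on what such a comparator does: a min/max index compares serialized key bytes lexicographically to decide whether a blocklet's [min, max] range can contain a filter value. A generic sketch of an unsigned lexicographic byte[] comparator (plain Java, not CarbonData's ByteUtil):

```java
import java.util.Comparator;

// Generic sketch of an unsigned lexicographic byte[] comparator, the kind of
// ordering a min/max blocklet index relies on; plain Java, not CarbonData's ByteUtil.
public final class BytesComparator implements Comparator<byte[]> {

  @Override
  public int compare(byte[] a, byte[] b) {
    int len = Math.min(a.length, b.length);
    for (int i = 0; i < len; i++) {
      // Compare bytes as unsigned values so 0xFF sorts after 0x01.
      int cmp = (a[i] & 0xFF) - (b[i] & 0xFF);
      if (cmp != 0) {
        return cmp;
      }
    }
    return a.length - b.length;  // a shorter key that is a prefix sorts first
  }
}
```

On Java 9+, java.util.Arrays.compareUnsigned(byte[], byte[]) provides the same ordering, which supports the review's point that a dedicated comparator class may be unnecessary.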
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143708431

--- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/SegmentPropertiesFetcher.java ---
@@ -0,0 +1,30 @@
+/* ... standard Apache License 2.0 header ... */
+
+package org.apache.carbondata.core.indexstore;
+
+import java.io.IOException;
+
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+
+public interface SegmentPropertiesFetcher {
+
+  SegmentProperties getSegmentProperties(String filePath) throws IOException;
+
+  SegmentProperties getSegmentProperties(AbsoluteTableIdentifier absoluteTableIdentifier);
--- End diff --

Don't add it if it is not used.

---
[GitHub] carbondata pull request #1359: [CARBONDATA-1480]Min Max Index Example for Da...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1359#discussion_r143708356

--- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/SegmentPropertiesFetcher.java ---
@@ -0,0 +1,30 @@
+/* ... standard Apache License 2.0 header ... */
+
+package org.apache.carbondata.core.indexstore;
+
+import java.io.IOException;
+
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
+
+public interface SegmentPropertiesFetcher {
+
+  SegmentProperties getSegmentProperties(String filePath) throws IOException;
--- End diff --

I think it is segmentID.

---
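The suggestion is that the String argument identifies a segment, not a file path. A self-contained sketch of the interface with the parameter renamed accordingly (SegmentProperties is stubbed below; none of this is the merged CarbonData code):

```java
import java.io.IOException;

// Sketch of the interface with the parameter named per the review suggestion.
// SegmentProperties is stubbed below; this is not the merged CarbonData code.
interface SegmentPropertiesFetcher {
  /** Fetches the SegmentProperties of the given segment. */
  SegmentProperties getSegmentProperties(String segmentId) throws IOException;
}

// Hypothetical stub standing in for
// org.apache.carbondata.core.datastore.block.SegmentProperties.
class SegmentProperties {
  final String segmentId;

  SegmentProperties(String segmentId) {
    this.segmentId = segmentId;
  }
}

// Trivial implementation used only to exercise the interface.
class InMemoryFetcher implements SegmentPropertiesFetcher {
  @Override
  public SegmentProperties getSegmentProperties(String segmentId) {
    return new SegmentProperties(segmentId);
  }
}
```

Renaming the parameter makes the contract self-documenting and avoids callers passing a path where an id is expected.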
[GitHub] carbondata issue #1404: [CARBONDATA-1541] There are some errors when bad_rec...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/1404 Build Success with Spark 1.6, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/286/ ---
[jira] [Created] (CARBONDATA-1567) Make the tests of common-test works with 2.2
Ravindra Pesala created CARBONDATA-1567: --- Summary: Make the tests of common-test works with 2.2 Key: CARBONDATA-1567 URL: https://issues.apache.org/jira/browse/CARBONDATA-1567 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1565) Resolve miscellinous issues and make query runs on spark2.2
Ravindra Pesala created CARBONDATA-1565: --- Summary: Resolve miscellinous issues and make query runs on spark2.2 Key: CARBONDATA-1565 URL: https://issues.apache.org/jira/browse/CARBONDATA-1565 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1564) Resolve compilation issues of CarbonFileMetastore.
Ravindra Pesala created CARBONDATA-1564: --- Summary: Resolve compilation issues of CarbonFileMetastore. Key: CARBONDATA-1564 URL: https://issues.apache.org/jira/browse/CARBONDATA-1564 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1562) Expose metadataHive to in CarbonSessionStateBuilder to run native hive sql.
Ravindra Pesala created CARBONDATA-1562: --- Summary: Expose metadataHive to in CarbonSessionStateBuilder to run native hive sql. Key: CARBONDATA-1562 URL: https://issues.apache.org/jira/browse/CARBONDATA-1562 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1563) Resolve compilation issues of CarbonSparkSqlParser.
Ravindra Pesala created CARBONDATA-1563: --- Summary: Resolve compilation issues of CarbonSparkSqlParser. Key: CARBONDATA-1563 URL: https://issues.apache.org/jira/browse/CARBONDATA-1563 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1560) Resolve compilation issues of CarbonSQLConf
Ravindra Pesala created CARBONDATA-1560: --- Summary: Resolve compilation issues of CarbonSQLConf Key: CARBONDATA-1560 URL: https://issues.apache.org/jira/browse/CARBONDATA-1560 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1561) Resolve compilation issues of CarbonSpark2SqlParser
Ravindra Pesala created CARBONDATA-1561: --- Summary: Resolve compilation issues of CarbonSpark2SqlParser Key: CARBONDATA-1561 URL: https://issues.apache.org/jira/browse/CARBONDATA-1561 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1559) Resolve compilation issues of CarbonAnalysisRules
Ravindra Pesala created CARBONDATA-1559: --- Summary: Resolve compilation issues of CarbonAnalysisRules Key: CARBONDATA-1559 URL: https://issues.apache.org/jira/browse/CARBONDATA-1559 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1558) Resolve DescribeTableCommand compilation issues in DDL strategy class.we may can use implicits to make work in both spark 2.1 and 2.2
Ravindra Pesala created CARBONDATA-1558: --- Summary: Resolve DescribeTableCommand compilation issues in DDL strategy class.we may can use implicits to make work in both spark 2.1 and 2.2 Key: CARBONDATA-1558 URL: https://issues.apache.org/jira/browse/CARBONDATA-1558 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1557) Use implicit to resolve the compilation errors of Cast class which is used many places in carbon.we may can use implicits to make work in both spark 2.1 and 2.2
Ravindra Pesala created CARBONDATA-1557: --- Summary: Use implicit to resolve the compilation errors of Cast class which is used many places in carbon.we may can use implicits to make work in both spark 2.1 and 2.2 Key: CARBONDATA-1557 URL: https://issues.apache.org/jira/browse/CARBONDATA-1557 Project: CarbonData Issue Type: Sub-task Reporter: Ravindra Pesala -- This message was sent by Atlassian JIRA (v6.4.14#64029)