[GitHub] carbondata issue #2394: [CARBONDATA-2243] Added test case for database and ...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2394 retest this please ---
[GitHub] carbondata issue #2095: [CARBONDATA-2273] Added sdv test cases for boolean f...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2095 retest this please ---
[GitHub] carbondata issue #2047: [CARBONDATA-2240] Refactored TestPreaggregateExpress...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2047 retest this please ---
[GitHub] carbondata pull request #2077: [CARBONDATA-2263] Fixed bug for incorrect dat...
Github user SangeetaGulia closed the pull request at: https://github.com/apache/carbondata/pull/2077 ---
[GitHub] carbondata pull request #1661: [CARBONDATA-1678] Fixed incorrect partitionCo...
Github user SangeetaGulia closed the pull request at: https://github.com/apache/carbondata/pull/1661 ---
[GitHub] carbondata issue #2077: [CARBONDATA-2263] Fixed bug for incorrect data in da...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2077 retest this please. ---
[GitHub] carbondata issue #2095: [CARBONDATA-2273] Added sdv test cases for boolean f...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2095 retest this please. ---
[GitHub] carbondata issue #2046: [CARBONDATA-2239] Added sdv test cases for querying ...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2046 retest this please ---
[GitHub] carbondata issue #1942: [CARBONDATA-2136] Fixed bug related to data load for...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1942 retest this please ---
[GitHub] carbondata issue #2077: [CARBONDATA-2263] Fixed bug for incorrect data in da...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2077 retest this please ---
[GitHub] carbondata pull request #2095: [CARBONDATA-2273] Added sdv test cases for bo...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/2095 [CARBONDATA-2273] Added sdv test cases for boolean feature Added sdv test cases for boolean feature. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? Added SDV test cases for boolean feature. - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata SDVBooleanTestCases Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2095.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2095 commit 79364412ca3d6354f73deb15c18a534449ea9ed4 Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-03-22T13:06:33Z Added sdv test cases for boolean feature ---
[GitHub] carbondata pull request #1950: [CARBONDATA-2145] Refactored PreAggregate fun...
Github user SangeetaGulia closed the pull request at: https://github.com/apache/carbondata/pull/1950 ---
[GitHub] carbondata issue #2077: [CARBONDATA-2263] Fixed bug for incorrect data in da...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2077 retest this please. ---
[GitHub] carbondata issue #2077: [CARBONDATA-2263] Fixed bug for incorrect data in da...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2077 retest this please. ---
[GitHub] carbondata pull request #2077: [CARBONDATA-2263] Fixed bug for incorrect dat...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/2077 [CARBONDATA-2263] Fixed bug for incorrect data in date type with different format Description: This PR handles different date formats and loads date-type data correctly. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? Added Test Cases. - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-2263 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2077.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2077 commit 240975f574d825f9845b59668c70977be3f0c03e Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-03-19T10:16:16Z Fixed bug for incorrect data in date type with different format ---
[GitHub] carbondata issue #2048: [CARBONDATA-2241][Docs][BugFix] Updated Doc for quer...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/2048 retest this please. ---
[GitHub] carbondata pull request #2048: [CARBONDATA-2241][Docs][BugFix] Updated Doc f...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/2048 [CARBONDATA-2241][Docs][BugFix] Updated Doc for query which will execute on datamap Description: The document contains the query below: `SELECT sum(price), country from sales GROUP BY country` and states that it will execute on the datamap, but it executes on the main table instead. Fix: Corrected the query so that it executes using the datamap. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? NA - [x] Any backward compatibility impacted? NA - [x] Document update required? NA - [x] Testing done NA Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata PreaggregateDocChange Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2048.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2048 commit cf2120bccd91fa91ef9aecb9dd62f0d6caba5436 Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-03-09T05:43:45Z Updated Doc for query which will execute on datamap ---
[GitHub] carbondata pull request #2047: [CARBONDATA-2240] Refactored TestPreaggregate...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/2047 [CARBONDATA-2240] Refactored TestPreaggregateExpressions to remove duplicate test case to improve CI Time Description: This PR includes the following improvements on Preaggregate expressions and selection scenario: 1) Refactor UT's to remove duplicate test scenarios to improve CI time. 2) Refactor test case for duplicate code in different class 3) Correcting test case for missing assert Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? NA - How it is tested? Please attach test report. Ran all test cases. - Is it a performance related change? Please attach the performance test report. NA - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata refactorPreaggregateUT-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2047.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2047 commit e705b8f2cd12a333c80888efc634c0ff6302756c Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-03-06T10:37:03Z Refactored TestPreAggregateExpressions to remove duplicate test case commit baf5c00951f7e4d5cf282588631ef3426e513cf2 Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-03-08T11:14:26Z Refactored test case for missing assert and duplicate code ---
[GitHub] carbondata pull request #2035: [CARBONDATA-2226] Removed redundant and unnec...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/2035 [CARBONDATA-2226] Removed redundant and unnecessary test cases to improve CI time for PreAggregation Create and Drop datamap feature Description: Removed redundant and unnecessary test cases to improve CI time for PreAggregation Create and Drop datamap feature Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? NA - [x] Any backward compatibility impacted? NA - [x] Document update required? NA - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. NA You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata refactorPreaggregateUT Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/2035.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2035 commit 9fcad19b9ea88d30ad7f09307a9ca7819bd694d7 Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-03-06T05:33:38Z Removed redundant and unnecessary test cases to improve CI time for PreAggregation Create and Drop datamap feature ---
[GitHub] carbondata issue #1950: [CARBONDATA-2145] Refactored PreAggregate functional...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1950 @kumarvishal09 please review. ---
[GitHub] carbondata pull request #1950: [CARBONDATA-2145] Refactored PreAggregate fun...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1950 [CARBONDATA-2145] Refactored PreAggregate functionality for dictionary include Description: Add the count to the measure column only when the column is of dictionary type in the main table. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? Ran the already written unit test cases to test the functionality. Added a unit test case to check count on string type. - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. (N/A) You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata refactoringPreAgg Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1950.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1950 commit 576eb2e8997420125e90afbf85b37ca9cf9429ef Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-02-07T08:15:29Z Refactored code for encodings when dictionary_include is present commit 71ab2208aa0a45b9504f6dd9245b23d5f996 Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-02-07T08:16:11Z Added test case on preAggregate for string type with count as aggregate function ---
[GitHub] carbondata pull request #1917: [CARBONDATA-2105] Fixed bug for null values w...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1917 [CARBONDATA-2105] Fixed bug for null values when group by column is present as dictionary_include 1) Refactored code to resolve issue of null values when group by column is present as dictionary_include. 2) Added related test case. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? Yes - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. (NA) You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-2105 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1917.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1917 commit 9755b53e8f91d0e609b45bc155044f1cab278b5f Author: SangeetaGulia <sangeeta.gulia@...> Date: 2018-02-02T17:11:24Z Fixed bug for null values when group by column is present as dictionary_include ---
[GitHub] carbondata pull request #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia closed the pull request at: https://github.com/apache/carbondata/pull/1584 ---
[GitHub] carbondata pull request #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1584#discussion_r161513824 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/S3Example.scala --- @@ -0,0 +1,152 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.examples + +import java.io.File + +import org.apache.hadoop.fs.s3a.Constants.{ACCESS_KEY, SECRET_KEY} +import org.apache.spark.sql.SparkSession +import org.slf4j.{Logger, LoggerFactory} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties + +object S3Example { + + /** + * This example demonstrate usage of s3 as a store. 
+ * + * @param args require three parameters "Access-key" "Secret-key" + * "s3 bucket path" + */ + + def main(args: Array[String]) { +val rootPath = new File(this.getClass.getResource("/").getPath ++ "../../../..").getCanonicalPath +val warehouse = s"$rootPath/examples/spark2/target/warehouse" +val path = s"$rootPath/examples/spark2/src/main/resources/data1.csv" +val logger: Logger = LoggerFactory.getLogger(this.getClass) +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd HH:mm:ss") + .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") + .addProperty(CarbonCommonConstants.ENABLE_UNSAFE_COLUMN_PAGE_LOADING, "true") + .addProperty(CarbonCommonConstants.DEFAULT_CARBON_MAJOR_COMPACTION_SIZE, "0.02") + +import org.apache.spark.sql.CarbonSession._ +if (args.length != 3) { + logger.error("Usage: java CarbonS3Example " + + "") + System.exit(0) +} + +val (accessKey, secretKey) = getKeyOnPrefix(args(2)) +val spark = SparkSession + .builder() + .master("local") + .appName("CarbonSessionExample") + .config("spark.sql.warehouse.dir", warehouse) + .config("spark.driver.host", "localhost") + .config(accessKey, args(0)) + .config(secretKey, args(1)) + .getOrCreateCarbonSession(args(2), warehouse) + +spark.sparkContext.setLogLevel("INFO") + +spark.sql( + s""" + | CREATE TABLE if not exists carbon_table( + | shortField SHORT, + | intField INT, + | bigintField LONG, + | doubleField DOUBLE, + | stringField STRING, + | timestampField TIMESTAMP, + | decimalField DECIMAL(18,2), + | dateField DATE, + | charField CHAR(5), + | floatField FLOAT + | ) + | STORED BY 'carbondata' --- End diff -- We have taken entire location as a command line argument as the same example can be used for s3n as well. ---
[GitHub] carbondata pull request #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1584#discussion_r161513550 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/S3Example.scala --- @@ -0,0 +1,152 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.examples + +import java.io.File + +import org.apache.hadoop.fs.s3a.Constants.{ACCESS_KEY, SECRET_KEY} +import org.apache.spark.sql.SparkSession +import org.slf4j.{Logger, LoggerFactory} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties + +object S3Example { + + /** + * This example demonstrate usage of s3 as a store. 
+ * + * @param args require three parameters "Access-key" "Secret-key" + * "s3 bucket path" + */ + + def main(args: Array[String]) { +val rootPath = new File(this.getClass.getResource("/").getPath ++ "../../../..").getCanonicalPath +val warehouse = s"$rootPath/examples/spark2/target/warehouse" +val path = s"$rootPath/examples/spark2/src/main/resources/data1.csv" +val logger: Logger = LoggerFactory.getLogger(this.getClass) +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd HH:mm:ss") + .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") + .addProperty(CarbonCommonConstants.ENABLE_UNSAFE_COLUMN_PAGE_LOADING, "true") + .addProperty(CarbonCommonConstants.DEFAULT_CARBON_MAJOR_COMPACTION_SIZE, "0.02") + +import org.apache.spark.sql.CarbonSession._ +if (args.length != 3) { + logger.error("Usage: java CarbonS3Example " + + "") + System.exit(0) +} + +val (accessKey, secretKey) = getKeyOnPrefix(args(2)) +val spark = SparkSession + .builder() + .master("local") + .appName("CarbonSessionExample") + .config("spark.sql.warehouse.dir", warehouse) + .config("spark.driver.host", "localhost") + .config(accessKey, args(0)) + .config(secretKey, args(1)) + .getOrCreateCarbonSession(args(2), warehouse) --- End diff -- Examples are updated with location provided in create table command. ---
[GitHub] carbondata pull request #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1584#discussion_r161512456 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/S3CsvExample.scala --- @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.examples + +import java.io.File + +import org.apache.hadoop.fs.s3a.Constants.{ACCESS_KEY, SECRET_KEY} +import org.apache.spark.sql.SparkSession +import org.slf4j.{Logger, LoggerFactory} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties + +object S3CsvExample { + + /** + * This example demonstrate to create local store having csv on s3. --- End diff -- Done. ---
[GitHub] carbondata pull request #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1584#discussion_r161512369 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/MultiStoreExample.scala --- @@ -0,0 +1,153 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.carbondata.examples + +import java.io.File + +import org.apache.hadoop.fs.s3a.Constants.{ACCESS_KEY, SECRET_KEY} +import org.apache.spark.sql.SparkSession +import org.slf4j.{Logger, LoggerFactory} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.util.CarbonProperties + +object MultiStoreExample { + + /** This example demonstrate the usage of multiple filesystem(s3 and local) on one carbon session + * + * @param args represents "fs.s3a.access.key" "fs.s3a.secret.key" "bucket-name" + */ + + def main(args: Array[String]) { + +val rootPath = new File(this.getClass.getResource("/").getPath ++ "../../../..").getCanonicalPath +val storeLocation = s"$rootPath/examples/spark2/target/store" +val warehouse = s"$rootPath/examples/spark2/target/warehouse" +val metastoredb = s"$rootPath/examples/spark2/target" +val path = s"$rootPath/examples/spark2/src/main/resources/data.csv" +val logger: Logger = LoggerFactory.getLogger(this.getClass) + +if (args.length != 3) { + logger.error("Usage: java CarbonS3Example <fs.s3a.secret" + + ".key> ") + System.exit(0) +} + +CarbonProperties.getInstance() + .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd HH:mm:ss") + .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") + .addProperty(CarbonCommonConstants.ENABLE_UNSAFE_COLUMN_PAGE_LOADING, "true") + +import org.apache.spark.sql.CarbonSession._ +val spark = SparkSession + .builder() + .master("local") + .appName("CarbonSessionExample") + .config("spark.sql.warehouse.dir", warehouse) + .config("spark.driver.host", "localhost") + .config("spark.hadoop." + ACCESS_KEY, args(0)) + .config("spark.hadoop." + SECRET_KEY, args(1)) + .getOrCreateCarbonSession(storeLocation, warehouse) --- End diff -- refactored to provide store location in create table command. ---
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1584 @jackylk we have raised a new PR with all the review comments resolved [#1805](https://github.com/apache/carbondata/pull/1805). That PR is raised to be merged into the carbonstore branch. Please review the same. ---
[GitHub] carbondata pull request #1787: [CARBONDATA-2017] Fix input path checking whe...
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1787#discussion_r160882178 --- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala --- @@ -73,7 +73,8 @@ object FileUtils { val stringBuild = new StringBuilder() val filePaths = inputPath.split(",") for (i <- 0 until filePaths.size) { -val fileType = FileFactory.getFileType(filePaths(i)) +val filePath = CarbonUtil.checkAndAppendHDFSUrl(filePaths(i)) --- End diff -- @jackylk I have verified this. It works fine with S3 as well. We can now also use the carbon property **carbon.ddl.base.hdfs.url** with S3 to provide the base URL. ---
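The effect described in the comment above (a configured base URL being prepended to each comma-separated input path) can be sketched roughly as follows. This is a minimal illustration only and is not CarbonData's actual `CarbonUtil.checkAndAppendHDFSUrl`: the object `BaseUrlSketch`, its methods, and the list of recognised schemes are all assumptions made for the example.

```scala
// Hypothetical sketch; NOT the real CarbonUtil.checkAndAppendHDFSUrl.
object BaseUrlSketch {
  // Schemes treated as "already fully qualified" (an assumed list for illustration).
  private val knownSchemes =
    Seq("hdfs://", "viewfs://", "file://", "s3://", "s3n://", "s3a://")

  // Prepend the base URL to a single path unless it already carries a scheme.
  def qualify(path: String, baseUrl: Option[String]): String = {
    val trimmed = path.trim
    if (knownSchemes.exists(scheme => trimmed.startsWith(scheme))) {
      trimmed
    } else {
      baseUrl match {
        case Some(base) if base.nonEmpty =>
          // Join with exactly one "/" between base and relative path.
          base.stripSuffix("/") + "/" + trimmed.stripPrefix("/")
        case _ => trimmed
      }
    }
  }

  // Apply the same rule to every entry of a comma-separated input path list.
  def qualifyAll(inputPath: String, baseUrl: Option[String]): String =
    inputPath.split(",").map(p => qualify(p, baseUrl)).mkString(",")
}
```

With an S3 base URL, a call such as `BaseUrlSketch.qualifyAll("data/a.csv,s3a://other/b.csv", Some("s3a://bucket/base/"))` would qualify only the first, relative entry and leave the second untouched.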
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1584 @jackylk we have made the changes as per your review comments. Can you please check? ---
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1584 @jackylk we have made the changes as per your review comments. Also, the CI build is failing due to a Jenkins issue: Build finished. 'FileNotFoundException' means that the credentials Jenkins is using are probably wrong, or the user account does not have write access to the repo. Can you please check? ---
[GitHub] carbondata issue #1661: [CARBONDATA-1678] Fixed incorrect partitionCount on ...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1661 retest this please ---
[GitHub] carbondata pull request #1722: [CARBONDATA-1755] Fixed bug occuring on concu...
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1722#discussion_r160086580 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForUpdateCommand.scala --- @@ -52,6 +53,13 @@ private[sql] case class CarbonProjectForUpdateCommand( return Seq.empty } val carbonTable = CarbonEnv.getCarbonTable(databaseNameOp, tableName)(sparkSession) +val isLoadInProgress = SegmentStatusManager.checkIfAnyLoadInProgressForTable(carbonTable) +if (isLoadInProgress) { + LOGGER.error("Cannot run data loading and update on same table concurrently. Please wait" + + " for load to finish") + throw new Exception("Cannot run data loading and update on same table concurrently. " + --- End diff -- @jackylk I will update this code once #1711 is merged. ---
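The guard in the diff above rejects an update while a load is still running on the same table. A self-contained sketch of that idea is shown below; it does not use CarbonData's `SegmentStatusManager`, and the `SegmentStatus` values, the object name, and the `Either`-based return type are assumptions chosen only to keep the example runnable on its own.

```scala
// Hypothetical sketch of a "no update during load" guard; names are illustrative.
object ConcurrencyGuardSketch {
  // Assumed, simplified set of segment states.
  sealed trait SegmentStatus
  case object Success extends SegmentStatus
  case object InsertInProgress extends SegmentStatus
  case object InsertOverwriteInProgress extends SegmentStatus

  // A load is considered in progress if any segment is still being written.
  def isLoadInProgress(statuses: Seq[SegmentStatus]): Boolean =
    statuses.exists(s => s == InsertInProgress || s == InsertOverwriteInProgress)

  // Returns Left(message) when the update must be rejected, Right(()) otherwise.
  def checkUpdateAllowed(statuses: Seq[SegmentStatus]): Either[String, Unit] =
    if (isLoadInProgress(statuses)) {
      Left("Cannot run data loading and update on same table concurrently. " +
        "Please wait for load to finish")
    } else {
      Right(())
    }
}
```

Returning an `Either` instead of throwing, as the diff does, is a design choice made here so the caller can decide how to report the conflict; the real command raises an exception.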
[GitHub] carbondata issue #1711: [CARBONDATA-1754][BugFix] Fixed issue occuring on co...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1711 @jackylk I have refactored the code in Java and moved it to spark-common. Please review. ---
[GitHub] carbondata issue #1711: [CARBONDATA-1754][BugFix] Fixed issue occuring on co...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1711 retest this please ---
[GitHub] carbondata issue #1711: [CARBONDATA-1754][BugFix] Fixed issue occuring on co...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1711 @jackylk I have added its description. ---
[GitHub] carbondata issue #1722: [CARBONDATA-1755] Fixed bug occuring on concurrent i...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1722 retest this please ---
[GitHub] carbondata issue #1697: [CARBONDATA-1719][Pre-Aggregate][Bug] Fixed bug to h...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1697 retest this please ---
[GitHub] carbondata issue #1722: [CARBONDATA-1755] Fixed bug occuring on current inse...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1722 retest sdv please. ---
[GitHub] carbondata pull request #1722: [CARBONDATA-1755] Fixed bug occuring on curre...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1722 [CARBONDATA-1755] Fixed bug occuring on current insert-overwrite and update Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-1755 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1722.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1722 commit c765ff3d8d33f70b433d6f97ea1cc02c6262b228 Author: SangeetaGulia <sangeeta.gulia@...> Date: 2017-12-22T11:08:10Z Fixed bug occuring on current insert-overwrite and update ---
[GitHub] carbondata pull request #1711: [CARBONDATA-1754][BugFix] Fixed issue occurin...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1711 [CARBONDATA-1754][BugFix] Fixed issue occuring on concurrent insert-overwrite and compaction Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? Added test case to check its functionality. I have also tested this issue manually. - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. (N/A) You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-1754 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1711.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1711 commit 963ff660c35764473d370d28b7494bc5135e2d44 Author: SangeetaGulia <sangeeta.gulia@...> Date: 2017-12-21T13:12:18Z Fixed issue occuring on concurrent insert-overwrite and compaction ---
[GitHub] carbondata pull request #1697: [CARBONDATA-1719] Fixed bug to handle data in...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1697 [CARBONDATA-1719] Fixed bug to handle data inconsistency on concurrent data load and pre-aggregate table creation Problem: During concurrent data load and pre-aggregate table creation, the datamap was not populated with the loaded data even after the data load completed. This PR fixes this issue. Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. Tested the functionality on a two-node cluster - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. (N/A) You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-1719 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1697.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1697 commit 7e002e25e88516aed96fab82dac2afb9cba8fb7a Author: SangeetaGulia <sangeeta.gulia@...> Date: 2017-12-20T12:32:51Z Fixed bug to handle data inconsistency on concurrent data load and pre-aggregate table creation ---
[GitHub] carbondata issue #1661: [CARBONDATA-1678] Fixed incorrect partitionCount on ...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1661 @lionelcao Please review. I have added the test case. ---
[GitHub] carbondata pull request #1661: [CARBONDATA-1678] Fixed incorrect partitionCo...
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1661#discussion_r157135341 --- Diff: integration/spark-common/src/main/scala/org/apache/spark/util/PartitionUtils.scala --- @@ -113,7 +113,8 @@ object PartitionUtils { } if (partitionId == 0) { - partitionInfo.addPartition(splitInfo.size) + val addedPartitionList = PartitionUtils.getListInfo(splitInfo.mkString(",")) --- End diff -- Add partition does support a String group. The problem in this bug is that when you try to add these two partitions, "'SAUDI ARABIA,(VIETNAM,RUSSIA,UNITED KINGDOM,UNITED STATES)'", the splitInfo variable treats them as one partition and increments the partition count by 1. To get the correct partitions, we need to extract the list out of the splitList. Since this is a sequence-based test case, where the problem is encountered when we run "show partition" after "alter table to add partition", is it all right to add its test case to TestAlterPartitionTable.scala? ---
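The counting problem described in the comment above can be illustrated with a minimal sketch. It is not CarbonData's `PartitionUtils.getListInfo`; the class `ListPartitionSpec` and its `parse` method are hypothetical names introduced only for illustration. It assumes a list-partition spec in which a parenthesised group is one partition holding several values, and a bare value is one partition by itself, so the number of partitions to add is the number of parsed groups rather than 1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of CarbonData.
class ListPartitionSpec {

    // Splits a spec such as "A,(B,C)" into partitions: [[A], [B, C]].
    // Commas inside parentheses separate values of one partition;
    // commas outside parentheses separate partitions.
    static List<List<String>> parse(String spec) {
        List<List<String>> partitions = new ArrayList<>();
        List<String> group = null;              // non-null while inside "( ... )"
        StringBuilder token = new StringBuilder();
        for (char c : spec.toCharArray()) {
            if (c == '(') {
                group = new ArrayList<>();
            } else if (c == ')') {
                if (group == null) {
                    throw new IllegalArgumentException("Unbalanced ')' in: " + spec);
                }
                addToken(group, token);
                partitions.add(group);
                group = null;
            } else if (c == ',') {
                if (group != null) {
                    addToken(group, token);     // next value in the same partition
                } else {
                    addSingle(partitions, token); // end of a single-value partition
                }
            } else {
                token.append(c);
            }
        }
        addSingle(partitions, token);           // trailing single-value partition
        return partitions;
    }

    // Adds the trimmed token to the target list (if non-empty) and resets it.
    private static void addToken(List<String> target, StringBuilder token) {
        String value = token.toString().trim();
        if (!value.isEmpty()) {
            target.add(value);
        }
        token.setLength(0);
    }

    // Adds the token as its own one-value partition (if non-empty).
    private static void addSingle(List<List<String>> partitions, StringBuilder token) {
        List<String> single = new ArrayList<>();
        addToken(single, token);
        if (!single.isEmpty()) {
            partitions.add(single);
        }
    }

    public static void main(String[] args) {
        List<List<String>> added =
            parse("SAUDI ARABIA,(VIETNAM,RUSSIA,UNITED KINGDOM,UNITED STATES)");
        System.out.println(added.size()); // prints 2
    }
}
```

With the spec quoted in the comment, the sketch yields two partitions (one single value and one group of four), which is the count the partition bookkeeping would need under these assumptions.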
[GitHub] carbondata issue #1658: [CARBONDATA-1680] Fixed Bug to show partition Ids fo...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1658 @sraghunandan I have found two related tasks: 1) Create a system-level switch for supporting standard partition or carbon custom partition. 2) Support drop partition in carbon. So are we continuing support for custom partitions? If yes, are we going to support drop partition for hash partition tables? ---
[GitHub] carbondata issue #1658: [CARBONDATA-1680] Fixed Bug to show partition Ids fo...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1658 Hi @lionelcao, is there any plan to support the alter partition operation for hash partition tables in the future? ---
[GitHub] carbondata issue #1661: [CARBONDATA-1678] Fixed incorrect partitionCount on ...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1661 retest this please. ---
[GitHub] carbondata issue #1658: [CARBONDATA-1680] Fixed Bug to show partition Ids fo...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1658 I have displayed the partition_ids due to the bug raised in this jira https://issues.apache.org/jira/browse/CARBONDATA-1680 which says "Show Partition for Hash Partition doesn't display the partition id" ---
[GitHub] carbondata pull request #1661: Fixed incorrect partitionCount on Alter Parti...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1661 Fixed incorrect partitionCount on Alter Partition Table Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. (N/A) You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-1678 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1661.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1661 commit 4d23490759cef49f7dddb94efb4bec1ef685d0d8 Author: SangeetaGulia <sangeeta.gu...@knoldus.in> Date: 2017-12-14T11:40:09Z Fixed incorrect partitionCount on Alter Partition Table ---
[GitHub] carbondata pull request #1658: [CARBONDATA-1680] Fixed Bug to show partition...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1658 [CARBONDATA-1680] Fixed Bug to show partition Ids for Hash Partition Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? No - [x] Any backward compatibility impacted? No - [x] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. (N/A) You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-1680 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1658.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1658 commit 807474ea179efcc2528867221e039b66a9948ad7 Author: SangeetaGulia <sangeeta.gu...@knoldus.in> Date: 2017-12-14T06:04:02Z Fixed Bug to show partition Ids for Hash Partition ---
[GitHub] carbondata issue #1622: [CARBONDATA-1865] Refactored code to skip single-pas...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1622 retest this please ---
[GitHub] carbondata pull request #1622: Refactored code to skip single-pass for first...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1622 Refactored code to skip single-pass for first data load Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [x] Any interfaces changed? (N/A) - [x] Any backward compatibility impacted? (N/A) - [x] Document update required? (N/A) - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [x] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. (N/A) You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-1865 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1622.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1622 commit 3e3100b64b89e54e95a15306fe8d7b40f1ceee01 Author: SangeetaGulia <sangeeta.gu...@knoldus.in> Date: 2017-12-06T09:48:00Z Refactored code to skip single-pass for first data load ---
[GitHub] carbondata issue #1597: WIP test PR
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1597 retest this please ---
[GitHub] carbondata issue #1597: WIP test PR
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1597 retest this please ---
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1584 retest this please ---
[GitHub] carbondata issue #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1584 retest this please ---
[GitHub] carbondata pull request #1584: [CARBONDATA-1827] Added S3 Implementation
Github user SangeetaGulia commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1584#discussion_r153698343 --- Diff: core/src/main/java/org/apache/carbondata/core/locks/CarbonLockFactory.java --- @@ -52,23 +52,23 @@ */ public static ICarbonLock getCarbonLockObj(AbsoluteTableIdentifier absoluteTableIdentifier, String lockFile) { -switch (lockTypeConfigured) { - case CarbonCommonConstants.CARBON_LOCK_TYPE_LOCAL: --- End diff -- This change is to support multiple filesystems in a single carbonsession. If you work on S3 and local storage at the same time, it should implicitly determine which lock type to provide. So we identify the lock type based on the type of file, and the conditional match depends on two variables: the first is lockTypeConfigured, and the second is the type of file (S3, HDFS, or local), on the basis of which we set lockTypeConfigured. With this change, we no longer need to specify LOCK_TYPE in carbonProperties explicitly for S3, HDFS, and local. ---
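The idea in the comment above, deriving the lock type from the kind of path rather than from an explicit property, can be sketched as follows. This is only an illustration under stated assumptions: `LockTypeResolver`, `LockKind`, and the list of URI prefixes are hypothetical and are not taken from `CarbonLockFactory`; the actual change also consults `lockTypeConfigured`, which this sketch leaves out.

```java
import java.util.Locale;

// Hypothetical sketch; names and prefixes are assumptions, not CarbonData code.
class LockTypeResolver {

    enum LockKind { LOCAL, HDFS, S3 }

    // Chooses a lock kind from the scheme prefix of the table path.
    // Paths without a recognised scheme are treated as local.
    static LockKind resolve(String tablePath) {
        String path = tablePath.toLowerCase(Locale.ROOT);
        if (path.startsWith("s3://") || path.startsWith("s3a://") || path.startsWith("s3n://")) {
            return LockKind.S3;
        }
        if (path.startsWith("hdfs://") || path.startsWith("viewfs://")) {
            return LockKind.HDFS;
        }
        return LockKind.LOCAL;
    }

    public static void main(String[] args) {
        // Two tables on different filesystems in the same session
        // each get a lock kind that matches their own storage.
        System.out.println(resolve("s3a://bucket/store/db/t1")); // prints S3
        System.out.println(resolve("/tmp/store/db/t2"));         // prints LOCAL
    }
}
```

Resolving per path is what lets one session mix stores: no single global setting has to be right for every table.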
[GitHub] carbondata pull request #1584: [CARBONDATA-1827] Added S3 implementation and...
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1584 [CARBONDATA-1827] Added S3 implementation and TestCases Be sure to do all of the following checklist to help us incorporate your contribution quickly and easily: - [ ] Any interfaces changed? No - [ ] Any backward compatibility impacted? No - [ ] Document update required? No - [x] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? Added new Unit Test Cases. - How it is tested? Please attach test report. - Is it a performance related change? Please attach the performance test report. - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. You can merge this pull request into a Git repository by running: $ git pull https://github.com/jatin9896/incubator-carbondata feature/s3Implementation Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1584.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1584 commit bbc2dc340086a66bd9f30f0424a0fe5722ee4c57 Author: SangeetaGulia <sangeeta.gu...@knoldus.in> Date: 2017-09-21T09:26:26Z Added S3 implementation and TestCases ---
[GitHub] carbondata pull request #892: [CARBONDATA - 1036] - Added Implementation for...
Github user SangeetaGulia closed the pull request at: https://github.com/apache/carbondata/pull/892 ---
[GitHub] carbondata issue #1316: [CARBONDATA-1412] - Fixed bug related to incorrect b...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1316 @jackylk CARBON_TIMESTAMP_MILLIS is more precise in terms of time than CARBON_TIMESTAMP_DEFAULT_FORMAT. Also, there are three formats defined for TIMESTAMP in CarbonCommonConstants: 1) String CARBON_TIMESTAMP_DEFAULT_FORMAT = "-MM-dd HH:mm:ss"; (it currently has the most occurrences in the project) 2) String CARBON_TIMESTAMP = "dd-MM- HH:mm:ss"; 3) String CARBON_TIMESTAMP_MILLIS = "dd-MM- HH:mm:ss:SSS"; Do you want to use one format everywhere? If yes, please suggest one of the three available formats. ---
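The precision difference mentioned in the comment above can be seen by formatting one instant with each kind of pattern. This is a sketch, not project code: the patterns as quoted in this archive appear to have lost their year field, so the `yyyy` token below is an assumption, and the class `TimestampFormats` and its constant names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch; the "yyyy" year token is an assumption about the
// patterns quoted in the comment, which show no year field in this archive.
class TimestampFormats {

    static final String DEFAULT_FORMAT = "yyyy-MM-dd HH:mm:ss";
    static final String TIMESTAMP = "dd-MM-yyyy HH:mm:ss";
    static final String TIMESTAMP_MILLIS = "dd-MM-yyyy HH:mm:ss:SSS";

    // Formats a timestamp with the given pattern.
    static String format(LocalDateTime time, String pattern) {
        return time.format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        // 2017-09-04 07:04:54 plus 123 milliseconds
        LocalDateTime time = LocalDateTime.of(2017, 9, 4, 7, 4, 54, 123_000_000);
        System.out.println(format(time, DEFAULT_FORMAT));   // prints 2017-09-04 07:04:54
        System.out.println(format(time, TIMESTAMP));        // prints 04-09-2017 07:04:54
        System.out.println(format(time, TIMESTAMP_MILLIS)); // prints 04-09-2017 07:04:54:123
    }
}
```

Only the millisecond pattern keeps the sub-second part, so two events within the same second format identically under the other two patterns.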
[GitHub] carbondata issue #1316: [CARBONDATA-1412] - Fixed bug related to incorrect b...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1316 retest this please ---
[GitHub] carbondata issue #1316: [CARBONDATA-1412] - Fixed bug related to incorrect b...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/1316 retest this please. ---
[GitHub] carbondata pull request #1316: [CARBONDATA-1412] -
GitHub user SangeetaGulia opened a pull request: https://github.com/apache/carbondata/pull/1316 [CARBONDATA-1412] - Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[CARBONDATA-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - What manual testing you have done? - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/SangeetaGulia/incubator-carbondata CARBONDATA-1412 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/1316.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1316 commit e100cd7f3d784b66be2745f7f1cfcb857208c4aa Author: SangeetaGulia <sangeeta.gu...@knoldus.in> Date: 2017-09-04T07:04:54Z Fixed bug CARBONDATA-1412 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] carbondata issue #892: [CARBONDATA - 1036] - Added Implementation for Flink ...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/892 @chenliang613 Please have a look; I have made the updates as suggested. ---
[GitHub] carbondata issue #892: [CARBONDATA - 1036] - Added Implementation for Flink ...
Github user SangeetaGulia commented on the issue: https://github.com/apache/carbondata/pull/892 retest this please ---