[GitHub] incubator-carbondata pull request #815: [CARBONDATA-955] CacheProvider test ...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/815 [CARBONDATA-955] CacheProvider test fix You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata cache-test-fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/815.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #815 commit a49f751ac17c60e1750997b39d242c43b9ed98e3 Author: ravipesala Date: 2017-04-19T05:13:59Z CacheProvider test fix --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #813: [CARBONDATA-953] Added validations i...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/813 [CARBONDATA-953] Added validations in Unsafe dataload. Added validations to Unsafe dataload, now there are no validations of how much chunksize can be configured and how much working thread memory uses. Now it is configured max siz to 1GB. And also there is no control of adding data to sort threads so it may lead to out of memory.So this PR fixes it. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata unsafe-validations Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/813.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #813 commit 080d7b4b21122af415802fcd2970bda8b6004935 Author: ravipesala Date: 2017-04-18T09:13:41Z Added validations in Unsafe dataload. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #811: [CARBONDATA-949] null value handled during ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/811 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #809: [WIP] Configured prefetch in query s...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/809 [WIP] Configured prefetch in query scanner You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata memory-issue-query Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/809.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #809 commit 9c2f7edd8b09fc37c6c2f506982914dde1e60069 Author: ravipesala Date: 2017-04-17T11:28:12Z Configured prefetch in query scanner --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #793: [CARBONDATA-909] Added option to specify si...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/793 Please raise a PR on master. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #793: [CARBONDATA-909] Added option to specify si...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/793 LGTM, thanks for working on this. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #749: [CARBONDATA-854] Datastax CFS file s...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/749#discussion_r111532440 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/filesystem/CFSCarbonFile.java --- @@ -0,0 +1,122 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.core.datastore.filesystem; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.datastore.impl.FileFactory; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; + + +public class CFSCarbonFile extends AbstractDFSCarbonFile { --- End diff -- Please add comments to the class --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #755: [CARBONDATA-881] Load status is successful ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/755 Please handle at SegmentStatusManager.updateDeletionStatus as well. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #730: [CARBONDATA-898]NullPointerException is get...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/730 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #790: [CARBONDATA-919]result_size in query...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/790#discussion_r111359840 --- Diff: integration/spark2/src/main/java/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java --- @@ -147,7 +148,11 @@ public VectorizedCarbonRecordReader(QueryModel queryModel) { } @Override public Object getCurrentValue() throws IOException, InterruptedException { -if (returnColumnarBatch) return columnarBatch; +if (returnColumnarBatch) { + rowCount += columnarBatch.numValidRows(); + return columnarBatch; +} +rowCount += 1; --- End diff -- ok --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #790: [CARBONDATA-919]result_size in query statis...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/790 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #783: [CARBONDATA-903] data load is not failing e...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/783 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #789: [CARBONDATA-916] Major Compaction : Updatin...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/789 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #790: [CARBONDATA-919]result_size in query...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/790#discussion_r111353177 --- Diff: integration/spark2/src/main/java/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java --- @@ -147,7 +148,11 @@ public VectorizedCarbonRecordReader(QueryModel queryModel) { } @Override public Object getCurrentValue() throws IOException, InterruptedException { -if (returnColumnarBatch) return columnarBatch; +if (returnColumnarBatch) { + rowCount += columnarBatch.numValidRows(); + return columnarBatch; +} +rowCount += 1; --- End diff -- why incrementing with 1 again? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #783: [CARBONDATA-903] data load is not fa...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/783#discussion_r31538 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/sort/AbstractMergeSorter.java --- @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.processing.newflow.sort; + +import org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException; +import org.apache.carbondata.processing.newflow.sort.impl.ThreadStatusObserver; + +/** + * The class defines the common methods used in across various type of sort + */ +public abstract class AbstractMergeSorter { --- End diff -- implement it with sorter --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #774: [CARBONDATA-890] For Spark 2.1 LRU cache si...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/774 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #754: [CARBONDATA-880] Path should not get printe...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/754 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #783: [CARBONDATA-903] data load is not failing e...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/783 @mohammadshahidkhan Please create a Abstract class and move the common methods to there. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #755: [CARBONDATA-881] Load status is successful ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/755 @mohammadshahidkhan Please check all caller methods of `writeLoadDetailsIntoFile` and do exception handling there also, otherwise same issue happens if they catch the error and just log it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #755: [CARBONDATA-881] Load status is successful ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/755 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #778: [WIP] Added getAll dictionary to cod...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/778 [WIP] Added getAll dictionary to codegen of dictionary decoder. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata getAlldictionary-codegen Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/778.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #778 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #773: [CARBONDATA-893] Fixed testcase in h...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/773 [CARBONDATA-893] Fixed testcase in hadoop 2.7.2 version You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata hadoop-2.7.2-issue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/773.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #773 commit bb95dc64344fe49169a4c2d6ec2011ea785b1580 Author: ravipesala Date: 2017-04-10T13:25:52Z FIxed testcase in hadoop 2.7.2 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #772: [CARBONDATA-891] Fix compilation issue of L...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/772 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #752: [CARBONDATA-879] Clear driver side dict cac...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/752 @Sephiroth-Lin Yes you are right, but I feel it is not required to create dictionary while doing codegen, so better avoid dictionary creation while doing codegen. just create only at executor side. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #753: [CARBONDATA-877] Fixed ArrayIndexOutOfBound...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/753 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #752: [CARBONDATA-879] Clear driver side dict cac...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/752 @Sephiroth-Lin Actually clearing the cache at the time of codegen generation does not help as it just returns the code string to spark , Spark compiles it and executes as part of WholeCodeGen. We need to find a way to clear inside the code gen. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #727: [CARBONDATA-846] Added support to revert ch...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/727 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #724: [CARBONDATA-847] Fixed NullPointerException...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/724 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #703: [CARBONDATA-780] Alter table support for co...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/703 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #398: [CARBONDATA-400] Error message for dataload...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/398 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #717: [CARBONDATA-839] Fixed issue with meta lock...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/717 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #398: [CARBONDATA-400] Error message for d...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/398#discussion_r110117757 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala --- @@ -784,9 +784,28 @@ object GlobalDictionaryUtil { } } catch { case ex: Exception => -LOGGER.error(ex, "generate global dictionary failed") -throw ex +ex match { + case spx: SparkException => +LOGGER.error(spx, "generate global dictionary failed") +throw new Exception("generate global dictionary failed, " + +trimErrorMessage(spx.getMessage)) + case _ => +LOGGER.error(ex, "generate global dictionary failed") +throw ex +} +} + } + + // Get proper error message of TextParsingException + def trimErrorMessage(input: String): String = { +var errorMessage: String = null +if (input != null) { + if (input.split("Hint").length > 0 && + input.split("Hint")(0).split("TextParsingException: ").length > 1) { +errorMessage = input.split("Hint")(0).split("TextParsingException: ")(1) + } --- End diff -- Just simplify as follows. ``` val hintSplit = input.split("Hint") if (hintSplit.length > 0) { val parseSplit = hintSplit(0).split("TextParsingException: ") if (parseSplit.length > 1) { errorMessage = parseSplit(1) } } ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #694: [CARBONDATA-814] bad record log file writin...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/694 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #727: [CARBONDATA-846] Added support to re...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/727#discussion_r110112392 --- Diff: integration/spark2/src/main/scala/org/apache/spark/util/AlterTableUtil.scala --- @@ -96,4 +104,109 @@ object AlterTableUtil { } schemaParts.mkString(",") } + --- End diff -- Add comments on all methods --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #727: [CARBONDATA-846] Added support to re...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/727#discussion_r110108435 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommands.scala --- @@ -89,9 +89,8 @@ private[sql] case class AlterTableAddColumns( LOGGER.info(s"Alter table for add columns is successful for table $dbName.$tableName") LOGGER.audit(s"Alter table for add columns is successful for table $dbName.$tableName") } catch { - case e: Exception => -LOGGER.error(e, s"Alter table add columns failed : ${e.getMessage}") -// clean up the dictionary files in case of any failure + case runTimeException: RuntimeException => LOGGER --- End diff -- why catch only `RuntimeException` ? why not `Exception` or any specific exception. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #736: [CARBONDATA-871] If store type is HDFS then...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/736 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #742: [CARBONDATA-875] in create database DDL , D...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/742 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #739: [CARBONDATA-873] Drop table command throwin...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/739 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #703: [CARBONDATA-780] Alter table support for co...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/703 @manishgupta88 It is better to move all merger package to processing module. Spark module nothing to do with merging --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #703: [CARBONDATA-780] Alter table support...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/703#discussion_r110098777 --- Diff: integration/spark-common/src/main/java/org/apache/carbondata/spark/merger/RowResultMerger.java --- @@ -57,15 +42,9 @@ /** * This is the Merger class responsible for the merging of the segments. */ -public class RowResultMerger { +public class RowResultMerger extends AbstractResultProcessor { --- End diff -- May be you can rename the class to `RowResultMergerProcessor` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #703: [CARBONDATA-780] Alter table support...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/703#discussion_r110098333 --- Diff: integration/spark-common/src/main/java/org/apache/carbondata/spark/merger/CompactionResultSortProcessor.java --- @@ -0,0 +1,407 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.spark.merger; + +import java.io.File; +import java.io.IOException; +import java.util.Arrays; +import java.util.List; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.constants.CarbonCommonConstants; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.encoder.Encoding; +import org.apache.carbondata.core.metadata.schema.table.CarbonTable; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure; +import org.apache.carbondata.core.scan.result.iterator.RawResultIterator; +import org.apache.carbondata.core.scan.wrappers.ByteArrayWrapper; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.DataTypeUtil; +import org.apache.carbondata.processing.model.CarbonLoadModel; +import org.apache.carbondata.processing.newflow.row.CarbonRow; +import org.apache.carbondata.processing.sortandgroupby.exception.CarbonSortKeyAndGroupByException; +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortDataRows; +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortIntermediateFileMerger; +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortParameters; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; +import org.apache.carbondata.processing.store.CarbonFactHandler; +import org.apache.carbondata.processing.store.CarbonFactHandlerFactory; +import org.apache.carbondata.processing.store.SingleThreadFinalSortFilesMerger; +import org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException; +import org.apache.carbondata.processing.util.CarbonDataProcessorUtil; + +/** + * This class will process the query result and convert the data + * into a format compatible for data load + */ +public class CompactionResultSortProcessor extends AbstractResultProcessor { + + /** + * LOGGER + */ + private static final LogService LOGGER = + LogServiceFactory.getLogService(CompactionResultSortProcessor.class.getName()); + /** + * carbon load model that contains all the required information for load + */ + private CarbonLoadModel carbonLoadModel; + /** + * carbon table + */ + private CarbonTable carbonTable; + /** + * sortDataRows instance for sorting each row read ad writing to sort temp file + */ + private SortDataRows sortDataRows; + /** + * final merger for merge sort + */ + private SingleThreadFinalSortFilesMerger finalMerger; + /** + * data handler VO object + */ + private CarbonFactHandler dataHandler; + /** + * segment properties for getting dimension cardinality and other required information of a block + */ + private SegmentProperties segmentProperties; + /** + * compaction type to decide whether taskID need to be extracted from carbondata files + */ + private CompactionType compactionType; + /** + * boolean mapping for no dictionary columns in schema + */ + private boolean[] noDictionaryColMapping; + /** + * agg type defined for measures + */ + private char[] aggType; + /** + * segment id + */ + private String segmentId; + /** + * temp store location to be sued during data load + */ + private Stri
[GitHub] incubator-carbondata pull request #703: [CARBONDATA-780] Alter table support...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/703#discussion_r110097631 --- Diff: integration/spark-common/src/main/java/org/apache/carbondata/spark/merger/CompactionResultSortProcessor.java --- @@ -0,0 +1,407 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.carbondata.spark.merger; + +import java.io.File; +import java.io.IOException; +import java.util.Arrays; +import java.util.List; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.constants.CarbonCommonConstants; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.encoder.Encoding; +import org.apache.carbondata.core.metadata.schema.table.CarbonTable; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure; +import org.apache.carbondata.core.scan.result.iterator.RawResultIterator; +import org.apache.carbondata.core.scan.wrappers.ByteArrayWrapper; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.DataTypeUtil; +import org.apache.carbondata.processing.model.CarbonLoadModel; +import org.apache.carbondata.processing.newflow.row.CarbonRow; +import org.apache.carbondata.processing.sortandgroupby.exception.CarbonSortKeyAndGroupByException; +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortDataRows; +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortIntermediateFileMerger; +import org.apache.carbondata.processing.sortandgroupby.sortdata.SortParameters; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; +import org.apache.carbondata.processing.store.CarbonFactHandler; +import org.apache.carbondata.processing.store.CarbonFactHandlerFactory; +import org.apache.carbondata.processing.store.SingleThreadFinalSortFilesMerger; +import org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException; +import org.apache.carbondata.processing.util.CarbonDataProcessorUtil; + +/** + * This class will process the query result and convert the data + * into a format compatible for data load + */ +public class CompactionResultSortProcessor extends AbstractResultProcessor { + + /** + * LOGGER + */ + private static final LogService LOGGER = + LogServiceFactory.getLogService(CompactionResultSortProcessor.class.getName()); + /** + * carbon load model that contains all the required information for load + */ + private CarbonLoadModel carbonLoadModel; + /** + * carbon table + */ + private CarbonTable carbonTable; + /** + * sortDataRows instance for sorting each row read ad writing to sort temp file + */ + private SortDataRows sortDataRows; + /** + * final merger for merge sort + */ + private SingleThreadFinalSortFilesMerger finalMerger; + /** + * data handler VO object + */ + private CarbonFactHandler dataHandler; + /** + * segment properties for getting dimension cardinality and other required information of a block + */ + private SegmentProperties segmentProperties; + /** + * compaction type to decide whether taskID need to be extracted from carbondata files + */ + private CompactionType compactionType; + /** + * boolean mapping for no dictionary columns in schema + */ + private boolean[] noDictionaryColMapping; + /** + * agg type defined for measures + */ + private char[] aggType; + /** + * segment id + */ + private String segmentId; + /** + * temp store location to be sued during data load + */ + private Stri
[GitHub] incubator-carbondata pull request #703: [CARBONDATA-780] Alter table support...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/703#discussion_r110096123 --- Diff: integration/spark-common/src/main/java/org/apache/carbondata/spark/merger/CarbonCompactionUtil.java --- @@ -351,4 +351,33 @@ private static int getDimensionDefaultCardinality(CarbonDimension dimension) { } return cardinality; } + + /** + * This method will check for any restructured block in the blocks selected for compaction + * + * @param segmentMapping + * @param dataFileMetadataSegMapping + * @param tableLastUpdatedTime + * @return + */ + public static boolean checkIfAnyRestructuredBlockExists(Map segmentMapping, + Map> dataFileMetadataSegMapping, long tableLastUpdatedTime) { +boolean restructuredBlockExists = false; +for (Map.Entry taskMap : segmentMapping.entrySet()) { + String segmentId = taskMap.getKey(); + List listMetadata = dataFileMetadataSegMapping.get(segmentId); + for (DataFileFooter dataFileFooter : listMetadata) { +// if schema modified timestamp is greater than footer stored schema timestamp, --- End diff -- even for table rename also are we updating the schema timestamp ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #703: [CARBONDATA-780] Alter table support...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/703#discussion_r110093866 --- Diff: integration/spark-common/src/main/java/org/apache/carbondata/spark/merger/AbstractResultProcessor.java --- @@ -0,0 +1,162 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.spark.merger; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.apache.carbondata.core.constants.CarbonCommonConstants; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.CarbonMetadata; +import org.apache.carbondata.core.metadata.CarbonTableIdentifier; +import org.apache.carbondata.core.metadata.schema.table.CarbonTable; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure; +import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema; +import org.apache.carbondata.core.mutate.CarbonUpdateUtil; +import org.apache.carbondata.core.scan.result.iterator.RawResultIterator; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.DataTypeUtil; +import org.apache.carbondata.core.util.path.CarbonStorePath; +import org.apache.carbondata.core.util.path.CarbonTablePath; +import org.apache.carbondata.processing.datatypes.GenericDataType; +import org.apache.carbondata.processing.model.CarbonLoadModel; +import org.apache.carbondata.processing.store.CarbonDataFileAttributes; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; + +/** + * This class contains the common methods required for result processing during compaction based on + * restructure and mormal scenarios + */ +public abstract class AbstractResultProcessor { + + /** + * This method will perform the desired tasks of merging the selected slices + * + * @param resultIteratorList + * @return + */ + public abstract boolean execute(List resultIteratorList); + + /** + * This method will create a model object for carbon fact data handler + * + * @param loadModel + * @return + */ --- End diff -- Move this method to CarbonFactDataHandlerModel class --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #703: [CARBONDATA-780] Alter table support...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/703#discussion_r110093494 --- Diff: integration/spark-common/src/main/java/org/apache/carbondata/spark/merger/AbstractResultProcessor.java --- @@ -0,0 +1,162 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.spark.merger; + +import java.util.Arrays; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import org.apache.carbondata.core.constants.CarbonCommonConstants; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.CarbonMetadata; +import org.apache.carbondata.core.metadata.CarbonTableIdentifier; +import org.apache.carbondata.core.metadata.schema.table.CarbonTable; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension; +import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure; +import org.apache.carbondata.core.metadata.schema.table.column.ColumnSchema; +import org.apache.carbondata.core.mutate.CarbonUpdateUtil; +import org.apache.carbondata.core.scan.result.iterator.RawResultIterator; +import org.apache.carbondata.core.util.CarbonUtil; +import org.apache.carbondata.core.util.DataTypeUtil; +import org.apache.carbondata.core.util.path.CarbonStorePath; +import org.apache.carbondata.core.util.path.CarbonTablePath; +import org.apache.carbondata.processing.datatypes.GenericDataType; +import org.apache.carbondata.processing.model.CarbonLoadModel; +import org.apache.carbondata.processing.store.CarbonDataFileAttributes; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; + +/** + * This class contains the common methods required for result processing during compaction based on + * restructure and mormal scenarios --- End diff -- typo `mormal` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #709: [CARBONDATA-861] Improvements in que...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/709#discussion_r110081787 --- Diff: examples/spark2/src/main/scala/org/apache/carbondata/examples/CompareTest.scala --- @@ -306,6 +307,8 @@ object CompareTest { // do GC and sleep for some time before running next table System.gc() Thread.sleep(1000) +System.gc() --- End diff -- There is no guarntee that GC will be called after calling of System.gc(), thats why after waiting for 1 second called again to increase the probability of running GC --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #709: [CARBONDATA-861] Improvements in que...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/709#discussion_r110081790 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDataFrameWriter.scala --- @@ -171,11 +171,20 @@ class CarbonDataFrameWriter(sqlContext: SQLContext, val dataFrame: DataFrame) { } ).append( if (options.dictionaryExclude.isDefined) { -s"'DICTIONARY_EXCLUDE' = '${options.dictionaryExclude.get}'" +s"'DICTIONARY_EXCLUDE' = '${options.dictionaryExclude.get}' ," + } else { +"" + } +).append( + if (options.tableBlockSize.isDefined) { +s"'table_blocksize' = '${options.tableBlockSize.get}'" } else { "" } ) +if (property.nonEmpty && property.charAt(property.length-1) == ',') { + property = property.replace(property.length-1, property.length, "") --- End diff -- ok --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #717: [CARBONDATA-839] Fixed issue with me...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/717#discussion_r110081640 --- Diff: core/src/main/java/org/apache/carbondata/core/locks/LocalFileLock.java --- @@ -70,7 +71,8 @@ LogServiceFactory.getLogService(LocalFileLock.class.getName()); static { -tmpPath = System.getProperty("java.io.tmpdir"); +tmpPath = CarbonProperties.getInstance().getProperty(CarbonCommonConstants.STORE_LOCATION, +System.getProperty("java.io.tmpdir")); --- End diff -- Still that code is not yet merged. it is present in https://github.com/apache/incubator-carbondata/pull/736 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #724: [CARBONDATA-847] Fixed NullPointerEx...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/724#discussion_r110081351 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/collector/impl/DictionaryBasedVectorResultCollector.java --- @@ -51,8 +51,8 @@ public DictionaryBasedVectorResultCollector(BlockExecutionInfo blockExecutionInfos) { super(blockExecutionInfos); -queryDimensions = tableBlockExecutionInfos.getQueryDimensions(); -queryMeasures = tableBlockExecutionInfos.getQueryMeasures(); +queryDimensions = tableBlockExecutionInfos.getActualQueryDimensions(); --- End diff -- I did not get, if any restructuring is required then it supposed to go to `RestructureBasedVectorResultCollector` class right. why it comes here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #733: [CARBONDATA-863] Support creation and delet...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/733 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #735: [CARBONDATA-870] Folders and files not gett...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/735 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #723: [CARBONDATA-845] Insert Select Into Same Ta...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/723 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #398: [CARBONDATA-400] Error message for d...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/398#discussion_r110078344 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala --- @@ -784,9 +784,28 @@ object GlobalDictionaryUtil { } } catch { case ex: Exception => -LOGGER.error(ex, "generate global dictionary failed") -throw ex +ex match { + case spx: SparkException => +LOGGER.error(spx, "generate global dictionary failed") +throw new Exception("generate global dictionary failed" + +trimErrorMessage(spx.getMessage)) + case _ => +LOGGER.error(ex, "generate global dictionary failed") +throw ex +} +} + } + + // Get proper error message of TextParsingException + def trimErrorMessage(input: String): String = { +var errorMessage: String = "" +if (input != null) { + if (input.split("Hint").length > 0) { +errorMessage = input.split("Hint")(0).split("TextParsingException: ")(1) --- End diff -- even for `split("TextParsingException: ")` also check the length more than 1. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #723: [CARBONDATA-845] Insert Select Into Same Ta...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/723 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #723: [CARBONDATA-845] Insert Select Into ...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/723#discussion_r109964888 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/DDLStrategy.scala --- @@ -54,6 +54,9 @@ class DDLStrategy(sparkSession: SparkSession) extends SparkStrategy { identifier.table.toLowerCase)) :: Nil case ShowLoadsCommand(databaseName, table, limit) => ExecutedCommandExec(ShowLoads(databaseName, table.toLowerCase, limit, plan.output)) :: Nil + case InsertIntoCarbonTable(relation: CarbonDatasourceHadoopRelation, + _, child: LogicalPlan, _, _) => --- End diff -- Please format it properly --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #398: [CARBONDATA-400] Error message for d...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/398#discussion_r109963768 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala --- @@ -784,11 +784,24 @@ object GlobalDictionaryUtil { } } catch { case ex: Exception => -LOGGER.error(ex, "generate global dictionary failed") -throw ex +ex match { + case spx: SparkException => +LOGGER.error(spx, "generate global dictionary failed") +throw new Exception( + trimErrorMessage(spx.getMessage)) + case _ => +LOGGER.error(ex, "generate global dictionary failed") +throw ex +} } } + // Get proper error message of TextParsingException + def trimErrorMessage(input: String): String = { +val message = input.split("Hint")(0).split("TextParsingException: ")(1) --- End diff -- There may be chances of input is null because `spsx.getMessage()` can be null, so handle NPE here. And also please check the array length after you split with `"Hint"`, other wise we may get ArrayIndexOutOfBoundException. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #702: [CARBONDATA-823] Refactory data write step
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/702 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #724: [CARBONDATA-847] Fixed NullPointerEx...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/724#discussion_r109851124 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/collector/impl/DictionaryBasedVectorResultCollector.java --- @@ -51,8 +51,8 @@ public DictionaryBasedVectorResultCollector(BlockExecutionInfo blockExecutionInfos) { super(blockExecutionInfos); -queryDimensions = tableBlockExecutionInfos.getQueryDimensions(); -queryMeasures = tableBlockExecutionInfos.getQueryMeasures(); +queryDimensions = tableBlockExecutionInfos.getActualQueryDimensions(); --- End diff -- Why getting `actualQueryDimensions` instead of `queryDimensions` in `DictionaryBasedVectorResultCollector`? But we get only `queryDimensions` in `DictionaryBasedResultCollector` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #717: [CARBONDATA-839] Fixed issue with me...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/717#discussion_r109850158 --- Diff: core/src/main/java/org/apache/carbondata/core/locks/ICarbonLock.java --- @@ -36,4 +36,6 @@ */ boolean lockWithRetries(); + boolean releaseLockManually(String lockFile); --- End diff -- Add comments --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #717: [CARBONDATA-839] Fixed issue with me...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/717#discussion_r109850077 --- Diff: core/src/main/java/org/apache/carbondata/core/locks/LocalFileLock.java --- @@ -70,7 +71,8 @@ LogServiceFactory.getLogService(LocalFileLock.class.getName()); static { -tmpPath = System.getProperty("java.io.tmpdir"); +tmpPath = CarbonProperties.getInstance().getProperty(CarbonCommonConstants.STORE_LOCATION, +System.getProperty("java.io.tmpdir")); --- End diff -- what if my store location is hdfs? how it can create lock there? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #725: [CARBONDATA-849] Correcting the error messa...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/725 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #729: [CARBONDATA-850] Fix the comment definition...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/729 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #723: [CARBONDATA-845] Insert Select Into Same Ta...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/723 Can't we have same restriction like Spark, is it really required to override this behavior? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #725: Issue : https://issues.apache.org/jira/brow...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/725 @ravikiran23 Please update the title properly as per guideline --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #726: [WIP] Fixed order by limit with select * qu...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/726 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #713: [CARBONDATA-838]Add decimal column without ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/713 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #729: [CARBONDATA-850] Fix the comment def...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/729#discussion_r109631607 --- Diff: format/src/main/thrift/carbondata.thrift --- @@ -1,223 +1,226 @@ -/** - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + *http://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. */ /** - * File format description for the carbon file format + * File format description for the CarbonData file format */ namespace java org.apache.carbondata.format include "schema.thrift" include "dictionary.thrift" /** -* Information about a segment, that represents one data load -*/ + * Information about a segment, that represents one data load + */ struct SegmentInfo{ -1: required i32 num_cols; // Number of columns in this load, because schema can evolve . TODO: Check whether this is really required +1: required i32 num_cols; // Number of columns in this load, because schema can evolve, different segments may have different columns 2: required list column_cardinalities; // Cardinality of columns } /** -* Btree index of one blocklet -*/ + * Btree index of one blocklet + */ struct BlockletBTreeIndex{ 1: required binary start_key; // Bit-packed start key of one blocklet -2: required binary end_key;// Bit-packed start key of one blocklet +2: required binary end_key;// Bit-packed end key of one blocklet } /** -* Min-max index of one blocklet -*/ + * Min-max index of one blocklet + */ struct BlockletMinMaxIndex{ 1: required list min_values; //Min value of all columns of one blocklet Bit-Packed 2: required list max_values; //Max value of all columns of one blocklet Bit-Packed } /** -* Index of one blocklet -**/ + * Index of one blocklet + */ struct BlockletIndex{ 1: optional BlockletMinMaxIndex min_max_index; 2: optional BlockletBTreeIndex b_tree_index; } /** -* Sort state of one column -*/ + * Sort state of one column + */ enum SortState{ SORT_NONE = 0; // Data is not sorted SORT_NATIVE = 1; //Source data was sorted SORT_EXPLICIT = 2; // Sorted (ascending) when loading } /** -* Compressions supported by Carbon Data. -*/ + * Compressions supported by CarbonData. + */ enum CompressionCodec{ SNAPPY = 0; } /** -* Represents the data of one dimension one dimension group in one blocklet -*/ -// add a innger level placeholder for further I/O granulatity + * Represents the data of one column page or one column page group in one blocklet. --- End diff -- And also we can mention that we are currently not using it, we keep it as for future extend ability. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #729: [CARBONDATA-850] Fix the comment def...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/729#discussion_r109631243 --- Diff: format/src/main/thrift/carbondata.thrift --- @@ -1,223 +1,226 @@ -/** - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + *http://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. */ /** - * File format description for the carbon file format + * File format description for the CarbonData file format */ namespace java org.apache.carbondata.format include "schema.thrift" include "dictionary.thrift" /** -* Information about a segment, that represents one data load -*/ + * Information about a segment, that represents one data load + */ struct SegmentInfo{ -1: required i32 num_cols; // Number of columns in this load, because schema can evolve . TODO: Check whether this is really required +1: required i32 num_cols; // Number of columns in this load, because schema can evolve, different segments may have different columns 2: required list column_cardinalities; // Cardinality of columns } /** -* Btree index of one blocklet -*/ + * Btree index of one blocklet + */ struct BlockletBTreeIndex{ 1: required binary start_key; // Bit-packed start key of one blocklet -2: required binary end_key;// Bit-packed start key of one blocklet +2: required binary end_key;// Bit-packed end key of one blocklet } /** -* Min-max index of one blocklet -*/ + * Min-max index of one blocklet + */ struct BlockletMinMaxIndex{ 1: required list min_values; //Min value of all columns of one blocklet Bit-Packed 2: required list max_values; //Max value of all columns of one blocklet Bit-Packed } /** -* Index of one blocklet -**/ + * Index of one blocklet + */ struct BlockletIndex{ 1: optional BlockletMinMaxIndex min_max_index; 2: optional BlockletBTreeIndex b_tree_index; } /** -* Sort state of one column -*/ + * Sort state of one column + */ enum SortState{ SORT_NONE = 0; // Data is not sorted SORT_NATIVE = 1; //Source data was sorted SORT_EXPLICIT = 2; // Sorted (ascending) when loading } /** -* Compressions supported by Carbon Data. -*/ + * Compressions supported by CarbonData. + */ enum CompressionCodec{ SNAPPY = 0; } /** -* Represents the data of one dimension one dimension group in one blocklet -*/ -// add a innger level placeholder for further I/O granulatity + * Represents the data of one column page or one column page group in one blocklet. --- End diff -- Better phrase as follow. `Represents the data of one column page or one column page group in side blocklet.` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #729: [CARBONDATA-850] Fix the comment def...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/729#discussion_r109630499 --- Diff: format/src/main/thrift/carbondata_index.thrift --- @@ -1,46 +1,44 @@ -/** - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + *http://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. */ /** - * File format description for the carbon file format + * File format description for CarbonData index file, one node one carbonindex file for building driver side's index tree. --- End diff -- I guess it is not required to mention that `one node one carbonindex file for building driver side's index tree` as it can change depends up on loading algorithm. In batch sort it creates one index file for one batch. And in the same way for bucket loading it creates one index file for each bucket and node. We can just say that one index file for btree. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #729: [CARBONDATA-850] Fix the comment def...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/729#discussion_r109628000 --- Diff: format/src/main/thrift/carbondata.thrift --- @@ -1,223 +1,226 @@ -/** - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at * - * http://www.apache.org/licenses/LICENSE-2.0 + *http://www.apache.org/licenses/LICENSE-2.0 * - * Unless required by applicable law or agreed to in writing, - * software distributed under the License is distributed on an - * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - * KIND, either express or implied. See the License for the - * specific language governing permissions and limitations - * under the License. + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. */ /** - * File format description for the carbon file format + * File format description for the CarbonData file format */ namespace java org.apache.carbondata.format include "schema.thrift" include "dictionary.thrift" /** -* Information about a segment, that represents one data load -*/ + * Information about a segment, that represents one data load + */ struct SegmentInfo{ -1: required i32 num_cols; // Number of columns in this load, because schema can evolve . TODO: Check whether this is really required +1: required i32 num_cols; // Number of columns in this load, because schema can evolve, different segments may have different columns 2: required list column_cardinalities; // Cardinality of columns } /** -* Btree index of one blocklet -*/ + * Btree index of one blocklet + */ struct BlockletBTreeIndex{ 1: required binary start_key; // Bit-packed start key of one blocklet -2: required binary end_key;// Bit-packed start key of one blocklet +2: required binary end_key;// Bit-packed end key of one blocklet } /** -* Min-max index of one blocklet -*/ + * Min-max index of one blocklet + */ struct BlockletMinMaxIndex{ 1: required list min_values; //Min value of all columns of one blocklet Bit-Packed 2: required list max_values; //Max value of all columns of one blocklet Bit-Packed } /** -* Index of one blocklet -**/ + * Index of one blocklet + */ struct BlockletIndex{ 1: optional BlockletMinMaxIndex min_max_index; 2: optional BlockletBTreeIndex b_tree_index; } /** -* Sort state of one column -*/ + * Sort state of one column + */ enum SortState{ SORT_NONE = 0; // Data is not sorted SORT_NATIVE = 1; //Source data was sorted SORT_EXPLICIT = 2; // Sorted (ascending) when loading } /** -* Compressions supported by Carbon Data. -*/ + * Compressions supported by CarbonData. + */ enum CompressionCodec{ SNAPPY = 0; } /** -* Represents the data of one dimension one dimension group in one blocklet -*/ -// add a innger level placeholder for further I/O granulatity + * Represents the data of one column page or one column page group in one blocklet. + * Currently CarbonData using snappy for compression directly, user can specify the different compression by ChunkCompressionMeta. + */ struct ChunkCompressionMeta{ -1: required CompressionCodec compression_codec; // the compressor used -/** total byte size of all uncompressed pages in this column chunk (including the headers) **/ +1: required CompressionCodec compression_codec; // The compressor us
[GitHub] incubator-carbondata pull request #726: [WIP] Fixed order by limit with sele...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/726 [WIP] Fixed order by limit with select * query Order by limit with select * from query is not working as the it uses `TakeOrderedAndProjectExec` internally and it try to get the output from logical plan instead of physical plan. So incase of logical releation we need to update the dictionary attributes with integer type. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata sort-by-imit-issue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/726.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #726 commit 29777617b5553943432d3064b168d2867627d6af Author: ravipesala Date: 2017-04-04T09:48:46Z Fixed order by limit with select * query --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #709: [WIP] Improvements in query
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/709 [WIP] Improvements in query Following are the list of improvements done in this part of PR. 1. Removed multiple creation of array and copy of it in Dimension and measure chunk readers. 2. Simplified logic of finding offsets of nodictionary keys in the class SafeVariableLengthDimensionDataChunkStore. 3. Avoided byte array creation and copy for nodictionary columns in case of vectorized reader. Instead directly sending the length and offset to vector. 4. Removed unnecessary decoder plan additions to oprtimized plan. It can optimize the codegen flow. 5. Updated CompareTest to take table blocksize and kept as 32 Mb in order to make use of small sorting when doing take ordered in spark. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata minor-perf-improv Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/709.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #709 commit eaa964425ffa784d905b12045e6e719c55eb1164 Author: ravipesala Date: 2017-03-05T15:02:35Z Removed unnecessary array copy and bitset checking commit 62914d866063a2606f6396b9912cf4466cbacef9 Author: ravipesala Date: 2017-03-28T15:24:26Z OPtimized code commit 45a4dcab42842f61d7cf28c5834bb4810c77bcbc Author: ravipesala Date: 2017-03-29T13:57:11Z Added table_blocksize option. commit 57d135937843ce89eb3805cddad6034cf9db3aaf Author: ravipesala Date: 2017-03-29T18:49:36Z Removed unnecessary plan from optimized plan. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #699: [CARBONDATA-821] Removed kettle rela...
Github user ravipesala closed the pull request at: https://github.com/apache/incubator-carbondata/pull/699 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #699: [CARBONDATA-821] Removed kettle rela...
GitHub user ravipesala reopened a pull request: https://github.com/apache/incubator-carbondata/pull/699 [CARBONDATA-821] Removed kettle related code and refactored Removed all the kettle related code and refactored code. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata remove-kettle Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/699.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #699 commit 8bc51e7c2c565ce12e635f884b0b24db655ac92d Author: ravipesala Date: 2017-03-26T10:40:47Z Removed kettle related code and refactored commit 8ff2f3be913a2cd43ded7304c9497423f9dd0cda Author: ravipesala Date: 2017-03-26T11:30:32Z Removed carbonplugins --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #701: [CARBONDATA-822] Added unsafe sort f...
GitHub user ravipesala reopened a pull request: https://github.com/apache/incubator-carbondata/pull/701 [CARBONDATA-822] Added unsafe sort for bucketing feature Added unsafe sorter for bucketing the table to improve loading performance. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata unsafe-bucketing Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/701.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #701 commit 9235ee24e5b5909d3d034e4c856b848988eb34ed Author: ravipesala Date: 2017-03-26T13:42:24Z Added unsafe sort for bucketing feature --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #701: [CARBONDATA-822] Added unsafe sort f...
Github user ravipesala closed the pull request at: https://github.com/apache/incubator-carbondata/pull/701 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #699: [CARBONDATA-821] Removed kettle related cod...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/699 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #701: [CARBONDATA-822] Added unsafe sort f...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/701 [CARBONDATA-822] Added unsafe sort for bucketing feature Added unsafe sorter for bucketing the table to improve loading performance. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata unsafe-bucketing Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/701.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #701 commit 9235ee24e5b5909d3d034e4c856b848988eb34ed Author: ravipesala Date: 2017-03-26T13:42:24Z Added unsafe sort for bucketing feature --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #689: [CARBONDATA-812] make vectorized reader as ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/689 LGTM. Testcases failing in this PR is fixed in PR https://github.com/apache/incubator-carbondata/pull/700 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #700: [CARBONDATA-812] Fixed colgrp testca...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/700 [CARBONDATA-812] Fixed colgrp testcases for vectorized set to true Fixed colgrp testcases when vectorized set to true You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata fix-test-vector-reader Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/700.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #700 commit ef6a6f8e5ccd59c17958aa33fa66c4a2a31ba5c8 Author: ravipesala Date: 2017-03-26T11:20:17Z Fixed colgrp testcases for vectorized set to true --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #699: [CARBONDATA-821] Removed kettle rela...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/699 [CARBONDATA-821] Removed kettle related code and refactored Removed all the kettle related code and refactored code. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata remove-kettle Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/699.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #699 commit 8bc51e7c2c565ce12e635f884b0b24db655ac92d Author: ravipesala Date: 2017-03-26T10:40:47Z Removed kettle related code and refactored --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #697: [CARBONDATA-708] Fixed Between and Logical ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/697 LGTM, except few style issues raised by david, please fix them --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/696 add to whitelist --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #693: [CARBONDATA-813] Fix all issues to build su...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/693 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #689: [CARBONDATA-812] make vectorized reader as ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/689 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #661: [CARBONDATA-761] remove shutdown dictionary...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/661 @lionelcao There is another PR https://github.com/apache/incubator-carbondata/pull/656 working on the same cause. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #690: [CARBONDATA-811] Refactor dictionary based ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/690 LGTM. Testcases are failing not because of this PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #692: [CARBONDATA-803] Fixing Spark 1.6 te...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/692 [CARBONDATA-803] Fixing Spark 1.6 testcases. Or filters are not working in spark 1.6. Currently few testcases are failing in spark-1.6 because of OR filters are partially pushed down in spark 1.6 so i returns wrong results. Now this PR makes either completely push down the OR filters or completely handled by spark layer. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata or-filter-issue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/692.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #692 commit 8679eb33a56821de109a967c426ef04856e3e2ab Author: ravipesala Date: 2017-03-23T16:47:48Z Or filters are not working in spark 1.6. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #691: [CARBONDATA-783] Fixed message fails...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/691 [CARBONDATA-783]Fixed message fails with outofbound exception in dictionary server This error happens because we are using TCP protocol and it may not sends the whole data in single packet, so if the data is big it is sending in multiple packets but we are taking only first packet to process the data, that is why it is failing. Now added the length decoder and added length of whole message at header, so length decoder receives all packets depending on the length provided at header of message. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata single-pass-optim Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/691.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #691 commit 2fb0be27c2445f9de13ff77c70b80554b60d9492 Author: ravipesala Date: 2017-03-23T08:34:07Z Add single pass perf test commit 757313dcf41b0068a377e68a37928f9210e2fd3d Author: ravipesala Date: 2017-03-23T13:57:23Z Fixed message fails with outofbound exception in dictionary server --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #688: [CARBONDATA-811] Refactor result collector ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/688 add to whitelist --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #686: [CARBONDATA-805] Fix groupid,package name,C...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/686 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #685: [CARBONDATA-809] Fixed union with al...
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/685 [CARBONDATA-809] Fixed union with alias error When using alias in union queries, it uses same instance of alias across the query, so one query alias is getting overridden with other table alias thats why it return wrong data. Now this PR passes exclusive alias map for each union query. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata alias-issue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/685.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #685 commit 4c8aa969f814730aa5954764a30ef638879f9445 Author: ravipesala Date: 2017-03-23T06:27:11Z Fixed union with alias error --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #684: [CARBONDATA-804] Update file structure info...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/684 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #683: [CARBONDATA-803] Incorrect results returned...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/683 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #635: [CARBONDATA-782]support SORT_COLUMNS
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/635 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #635: [CARBONDATA-782]support SORT_COLUMNS
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/635 I feel, we should not use the old flow. If user don't specify sort_columns then we should add all dimensions to sort_columns. In this way in our core/processing always depends upon sort_columns, so that code would be clear. We just always depends on the sort_columns in our core code, please refactor the code in that way. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #677: [WIP] Added lock log
GitHub user ravipesala opened a pull request: https://github.com/apache/incubator-carbondata/pull/677 [WIP] Added lock log Adding lock log to find the issue of random failing of testcases in CI You can merge this pull request into a Git repository by running: $ git pull https://github.com/ravipesala/incubator-carbondata table-lock-issue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/677.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #677 commit b2dee00aac36761ef4cc5b3e8f043be115d41252 Author: ravipesala Date: 2017-03-21T04:31:07Z Added lock log --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #635: [CARBONDATA-782]support SORT_COLUMNS
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/635#discussion_r106908141 --- Diff: processing/src/main/java/org/apache/carbondata/processing/sortandgroupby/sortdata/SortDataRows.java --- @@ -193,20 +193,42 @@ public void startSorting() throws CarbonSortKeyAndGroupByException { toSort = new Object[entryCount][]; System.arraycopy(recordHolderList, 0, toSort, 0, entryCount); - if (parameters.isUseKettle()) { -if (parameters.getNoDictionaryCount() > 0) { - Arrays.sort(toSort, new RowComparator(parameters.getNoDictionaryDimnesionColumn(), - parameters.getNoDictionaryCount())); + if (parameters.getNumberOfSortColumns() == 0) { --- End diff -- I feel, we should not use the old flow. If user don't specify sort_columns then we should add all dimensions to sort_columns. In this way in our core/processing always depends upon sort_columns, so that code would be clear. We just always depends on the sort_columns in our core code, please refactor the code in that way. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #672: [CARBONDATA-727][WIP] add hive integration ...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/672 add to whitelist --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata pull request #635: [CARBONDATA-782]support SORT_COLUMNS
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/635#discussion_r106803693 --- Diff: processing/src/main/java/org/apache/carbondata/processing/sortandgroupby/sortdata/SortDataRows.java --- @@ -460,22 +482,46 @@ public DataSorterAndWriter(Object[][] recordHolderArray) { @Override public Void call() throws Exception { try { long startTime = System.currentTimeMillis(); -if (parameters.isUseKettle()) { - if (parameters.getNoDictionaryCount() > 0) { -Arrays.sort(recordHolderArray, -new RowComparator(parameters.getNoDictionaryDimnesionColumn(), -parameters.getNoDictionaryCount())); +if (parameters.getNumberOfSortColumns() == 0) { --- End diff -- You can move this to one common method. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---