[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/1435 ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151665636 --- Diff: integration/spark2/src/test/scala/org/apache/spark/sql/GetDataSizeAndIndexSizeTest.scala --- @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql + +import org.apache.spark.sql.test.util.QueryTest +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.scalatest.BeforeAndAfterAll + +class GetDataSizeAndIndexSizeTest extends QueryTest with BeforeAndAfterAll { --- End diff -- Please add a test case for the update scenario. ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151458293 --- Diff: integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala --- @@ -292,6 +290,35 @@ object CarbonDataRDDFactory { var executorMessage: String = "" val isSortTable = carbonTable.getNumberOfSortColumns > 0 val sortScope = CarbonDataProcessorUtil.getSortScope(carbonLoadModel.getSortScope) + +def updateStatus(status: Array[(String, (LoadMetadataDetails, ExecutionErrors))], --- End diff -- Do not update the table status file in a separate method just for the size; add the size while adding the load metadata details to the table status. ---
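Editorial sketch of the suggestion above, under stated assumptions: `LoadEntry` and `TableStatusSketch` are hypothetical stand-ins, not CarbonData classes. The idea is that the sizes are set on the entry while it is being built, so the status file is written once and not a second time only for the sizes. The sizes are stored as strings because the quoted diff passes `toString()` values to `setDataSize` and `setIndexSize`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one segment's entry in the table status file.
final class LoadEntry {
  final String segmentId;
  final String status;
  final String dataSize;
  final String indexSize;

  LoadEntry(String segmentId, String status, String dataSize, String indexSize) {
    this.segmentId = segmentId;
    this.status = status;
    this.dataSize = dataSize;
    this.indexSize = indexSize;
  }
}

// Hypothetical stand-in for the table status file; it counts how often it is written.
final class TableStatusSketch {
  private final List<LoadEntry> entries = new ArrayList<>();
  private int writeCount = 0;

  // Build the complete entry, sizes included, before anything is written.
  static LoadEntry buildEntry(String segmentId, String status, long dataSize, long indexSize) {
    return new LoadEntry(segmentId, status, Long.toString(dataSize), Long.toString(indexSize));
  }

  // A single write records status and sizes together.
  void write(LoadEntry entry) {
    entries.add(entry);
    writeCount++;
  }

  int writeCount() {
    return writeCount;
  }

  List<LoadEntry> entries() {
    return entries;
  }
}
```

With this shape, a load that finishes successfully touches the status file once per segment, and a reader never sees an entry that has a status but no sizes.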
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151446545 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -2119,5 +2127,146 @@ public static String getNewTablePath(Path carbonTablePath, return parentPath.toString() + CarbonCommonConstants.FILE_SEPARATOR + carbonTableIdentifier .getTableName(); } + + /* + * This method will add data size and index size into tablestatus for each segment + */ + public static void addDataIndexSizeIntoMetaEntry(LoadMetadataDetails loadMetadataDetails, + String segmentId, CarbonTable carbonTable) throws IOException { +CarbonTablePath carbonTablePath = + CarbonStorePath.getCarbonTablePath((carbonTable.getAbsoluteTableIdentifier())); +HashMap dataIndexSize = +FileFactory.getDataSizeAndIndexSize(carbonTablePath, segmentId); +loadMetadataDetails + .setDataSize(dataIndexSize.get(CarbonCommonConstants.CARBON_TOTAL_DATA_SIZE).toString()); +loadMetadataDetails + .setIndexSize(dataIndexSize.get(CarbonCommonConstants.CARBON_TOTAL_INDEX_SIZE).toString()); + } + + /** + * This method will calculate the data size and index size for carbon table + */ + public static HashMap calculateSize(CarbonTable carbonTable) --- End diff -- Update the method signature to return Map instead of HashMap. ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151446750 --- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java --- @@ -2119,5 +2127,146 @@ public static String getNewTablePath(Path carbonTablePath, return parentPath.toString() + CarbonCommonConstants.FILE_SEPARATOR + carbonTableIdentifier .getTableName(); } + + /* + * This method will add data size and index size into tablestatus for each segment + */ + public static void addDataIndexSizeIntoMetaEntry(LoadMetadataDetails loadMetadataDetails, + String segmentId, CarbonTable carbonTable) throws IOException { +CarbonTablePath carbonTablePath = + CarbonStorePath.getCarbonTablePath((carbonTable.getAbsoluteTableIdentifier())); +HashMap dataIndexSize = --- End diff -- Change it to Map ---
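Editorial sketch of the two "use Map" comments above: declare variables and return types against the `Map` interface and keep `HashMap` as an implementation detail. The key strings are copied from the constants quoted elsewhere in this thread. The `Long` value type and the names `SizeMapSketch` and `toSizeMap` are assumptions, since the type parameters did not survive in the quoted diff.

```java
import java.util.HashMap;
import java.util.Map;

final class SizeMapSketch {
  // Key names as quoted in the CarbonCommonConstants diff in this thread.
  static final String CARBON_TOTAL_DATA_SIZE = "datasize";
  static final String CARBON_TOTAL_INDEX_SIZE = "indexsize";

  // The return type is the Map interface; the Long value type is an assumption.
  static Map<String, Long> toSizeMap(long dataSize, long indexSize) {
    // HashMap is named only here, so it can be swapped without touching callers.
    Map<String, Long> sizes = new HashMap<>();
    sizes.put(CARBON_TOTAL_DATA_SIZE, dataSize);
    sizes.put(CARBON_TOTAL_INDEX_SIZE, indexSize);
    return sizes;
  }
}
```

Callers that only see `Map<String, Long>` keep compiling if the implementation later becomes, for example, a `TreeMap` or an unmodifiable map.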
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user akashrn5 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r151421484 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java --- @@ -606,4 +609,53 @@ public static FileSystem getFileSystem(Path path) throws IOException { return path.getFileSystem(configuration); } + // Get the total size of carbon data and the total size of carbon index + public static HashMap getDataSizeAndIndexSize(CarbonTablePath carbonTablePath, --- End diff -- handled comments ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r150857977 --- Diff: core/src/main/java/org/apache/carbondata/core/memory/HeapMemoryAllocator.java --- @@ -17,11 +17,11 @@ package org.apache.carbondata.core.memory; -import javax.annotation.concurrent.GuardedBy; import java.lang.ref.WeakReference; import java.util.HashMap; import java.util.LinkedList; import java.util.Map; +import javax.annotation.concurrent.GuardedBy; --- End diff -- why is this added? ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r150857833 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java --- @@ -606,4 +609,53 @@ public static FileSystem getFileSystem(Path path) throws IOException { return path.getFileSystem(configuration); } + // Get the total size of carbon data and the total size of carbon index + public static HashMap getDataSizeAndIndexSize(CarbonTablePath carbonTablePath, --- End diff -- This method cannot be in FileFactory, as it is not a generic filesystem operation. ---
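Editorial sketch of what a table-specific size helper could look like once it lives outside the generic file factory. Everything here is an assumption made for illustration: the class name `SegmentSizeSketch`, the use of `java.nio.file` in place of CarbonData's own file abstraction, and the classification of files by the `.carbondata` and `.carbonindex` suffixes. It is not the code from the pull request.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

final class SegmentSizeSketch {
  // Assumed file suffixes; files with any other suffix are ignored.
  private static final String DATA_SUFFIX = ".carbondata";
  private static final String INDEX_SUFFIX = ".carbonindex";

  // Sums the sizes of data files and index files directly inside one segment directory.
  static Map<String, Long> dataAndIndexSize(Path segmentDir) throws IOException {
    long dataSize = 0L;
    long indexSize = 0L;
    try (DirectoryStream<Path> files = Files.newDirectoryStream(segmentDir)) {
      for (Path file : files) {
        String name = file.getFileName().toString();
        if (name.endsWith(DATA_SUFFIX)) {
          dataSize += Files.size(file);
        } else if (name.endsWith(INDEX_SUFFIX)) {
          indexSize += Files.size(file);
        }
      }
    }
    Map<String, Long> sizes = new HashMap<>();
    sizes.put("datasize", dataSize);
    sizes.put("indexsize", indexSize);
    return sizes;
  }
}
```

The knowledge of which files count as data and which as index is format-specific, which is the reason the reviewer gives for keeping such a method out of a class that only wraps generic filesystem calls.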
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r150856173 --- Diff: core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java --- @@ -509,7 +512,7 @@ public static boolean createNewLockFile(String filePath, FileType fileType) thro } public enum FileType { -LOCAL, HDFS, ALLUXIO, VIEWFS +LOCAL, HDFS, ALLUXIO, VIEWFS, S3 --- End diff -- S3 is not supported yet. ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user akashrn5 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r148957017 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java --- @@ -1376,6 +1376,32 @@ public static final String BITSET_PIPE_LINE_DEFAULT = "true"; + /** + * The total size of carbon data + */ + public static final String CARBON_TOTAL_DATA_SIZE = "datasize"; + + /** + * The total size of carbon index + */ + public static final String CARBON_TOTAL_INDEX_SIZE = "indexsize"; + + /** + * ENABLE_CALCULATE_DATA_INDEX_SIZE + */ + @CarbonProperty public static final String ENABLE_CALCULATE_SIZE = "carbon.enable.calculate.size"; + + /** + * DEFAULT_ENABLE_CALCULATE_DATA_INDEX_SIZE + */ + @CarbonProperty public static final String DEFAULT_ENABLE_CALCULATE_SIZE = "true"; --- End diff -- ok ---
[GitHub] carbondata pull request #1435: [CARBONDATA-1626]add data size and index size...
Github user mohammadshahidkhan commented on a diff in the pull request: https://github.com/apache/carbondata/pull/1435#discussion_r147471059 --- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java --- @@ -1376,6 +1376,32 @@ public static final String BITSET_PIPE_LINE_DEFAULT = "true"; + /** + * The total size of carbon data + */ + public static final String CARBON_TOTAL_DATA_SIZE = "datasize"; + + /** + * The total size of carbon index + */ + public static final String CARBON_TOTAL_INDEX_SIZE = "indexsize"; + + /** + * ENABLE_CALCULATE_DATA_INDEX_SIZE + */ + @CarbonProperty public static final String ENABLE_CALCULATE_SIZE = "carbon.enable.calculate.size"; + + /** + * DEFAULT_ENABLE_CALCULATE_DATA_INDEX_SIZE + */ + @CarbonProperty public static final String DEFAULT_ENABLE_CALCULATE_SIZE = "true"; --- End diff -- The CarbonProperty annotation is not required for constant variables/keys. ---
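Editorial sketch of how the key and default quoted in this diff might be consumed. The key string and the default value are taken from the diff; `java.util.Properties` is used as a stand-in for CarbonData's own property handling, and the class and method names are hypothetical.

```java
import java.util.Properties;

final class CalculateSizeFlagSketch {
  // Key and default value as quoted in the diff above.
  static final String ENABLE_CALCULATE_SIZE = "carbon.enable.calculate.size";
  static final String DEFAULT_ENABLE_CALCULATE_SIZE = "true";

  // Falls back to the default when the key is absent from the given properties.
  static boolean isSizeCalculationEnabled(Properties props) {
    String value = props.getProperty(ENABLE_CALCULATE_SIZE, DEFAULT_ENABLE_CALCULATE_SIZE);
    return Boolean.parseBoolean(value);
  }
}
```

In this sketch only the key is something a user would set; the default is a plain constant that is never looked up by name.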