[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532396937



##
File path: core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.IOException;
+import java.util.*;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.SegmentFileStore;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
+import org.apache.carbondata.core.statusmanager.SegmentStatus;
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the clean files command in carbondata. This class has methods for the clean files
+ * operation.
+ */
+public class CleanFilesUtil {
+
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(CleanFilesUtil.class.getName());
+
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments.
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+        SegmentFileStore fileStore = new SegmentFileStore(carbonTable.getTablePath(),
+            staleSegment);
+        Map<String, SegmentFileStore.FolderDetails> locationMap = fileStore.getSegmentFile()
+            .getLocationMap();
+        if (locationMap != null) {
+          CarbonFile segmentLocation = FileFactory.getCarbonFile(carbonTable.getTablePath() +
+              CarbonCommonConstants.FILE_SEPARATOR + fileStore.getSegmentFile().getLocationMap()
+              .entrySet().iterator().next().getKey());
+          // copy the complete segment to the trash folder
+          TrashUtil.copySegmentToTrash(segmentLocation, CarbonTablePath.getTrashFolderPath(
+              carbonTable.getTablePath()) + CarbonCommonConstants.FILE_SEPARATOR +
+              timeStampForTrashFolder + CarbonCommonConstants.FILE_SEPARATOR + CarbonTablePath
+              .SEGMENT_PREFIX + segmentNumber);
+          // delete the stale segment folders
+          try {
+            CarbonUtil.deleteFoldersAndFiles(segmentLocation);
+          } catch (IOException | InterruptedException e) {
+            LOGGER.error("Unable to delete the segment: " + segmentNumber + " after moving" +
+                " it to the trash folder. Please delete it manually: " + e.getMessage(), e);
+          }
+          // delete the segment file as well
+          FileFactory.deleteFile(CarbonTablePath.getSegmentFilePath(carbonTable.getTablePath(),

Review comment:
   Added in the same try-catch.
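
   The pattern discussed here (delete the stale segment folder and its .segment metadata file inside one guarded block, logging instead of aborting the clean-files loop) can be sketched as follows. This is a minimal illustration using plain java.nio.file rather than CarbonData's FileFactory/CarbonUtil; the class and method names are hypothetical, not CarbonData APIs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public final class StaleSegmentCleaner {
  // Delete the segment folder and its metadata file in the SAME try-catch,
  // so a failure in either step is reported once and does not stop the caller.
  public static boolean deleteStaleSegment(Path segmentDir, Path segmentFile) {
    try {
      if (Files.exists(segmentDir)) {
        // walk the tree in reverse order so children are deleted before parents
        try (Stream<Path> walk = Files.walk(segmentDir)) {
          walk.sorted(Comparator.reverseOrder()).forEach(p -> {
            try {
              Files.delete(p);
            } catch (IOException e) {
              throw new RuntimeException(e);
            }
          });
        }
      }
      // the .segment metadata file is removed inside the same guarded block
      Files.deleteIfExists(segmentFile);
      return true;
    } catch (IOException | RuntimeException e) {
      System.err.println("Unable to delete stale segment, please remove manually: " + e);
      return false;
    }
  }
}
```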





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532396566



##
File path: core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.IOException;
+import java.util.*;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.SegmentFileStore;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
+import org.apache.carbondata.core.statusmanager.SegmentStatus;
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the clean files command in carbondata. This class has methods for the clean files
+ * operation.
+ */
+public class CleanFilesUtil {
+
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(CleanFilesUtil.class.getName());
+
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments.
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+        SegmentFileStore fileStore = new SegmentFileStore(carbonTable.getTablePath(),
+            staleSegment);
+        Map<String, SegmentFileStore.FolderDetails> locationMap = fileStore.getSegmentFile()
+            .getLocationMap();
+        if (locationMap != null) {
+          CarbonFile segmentLocation = FileFactory.getCarbonFile(carbonTable.getTablePath() +
+              CarbonCommonConstants.FILE_SEPARATOR + fileStore.getSegmentFile().getLocationMap()
+              .entrySet().iterator().next().getKey());
+          // copy the complete segment to the trash folder
+          TrashUtil.copySegmentToTrash(segmentLocation, CarbonTablePath.getTrashFolderPath(
+              carbonTable.getTablePath()) + CarbonCommonConstants.FILE_SEPARATOR +
+              timeStampForTrashFolder + CarbonCommonConstants.FILE_SEPARATOR + CarbonTablePath
+              .SEGMENT_PREFIX + segmentNumber);
+          // delete the stale segment folders
+          try {
+            CarbonUtil.deleteFoldersAndFiles(segmentLocation);
+          } catch (IOException | InterruptedException e) {
+            LOGGER.error("Unable to delete the segment: " + segmentNumber + " after moving" +
+                " it to the trash folder. Please delete it manually: " + e.getMessage(), e);
+          }
+          // delete the segment file as well
+          FileFactory.deleteFile(CarbonTablePath.getSegmentFilePath(carbonTable.getTablePath(),
+              staleSegment));
+        }
+      }
+      staleSegments.clear();
+    }
+  }
+
+  /**
+   * This method will clean all the stale segments for a partition table, delete the source
+   * folders after copying the data to the trash and also remove the .segment files of the
+   * stale segments.
+   */
+  public static void cleanStaleSegmentsForPartitionTable(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532396388



##
File path: core/src/main/java/org/apache/carbondata/core/util/TrashUtil.java
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.io.IOUtils;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the trash folder in carbondata. This class has methods to copy data to the trash
+ * and remove data from the trash.
+ */
+public final class TrashUtil {
+
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(TrashUtil.class.getName());
+
+  /**
+   * Base method to copy the data to the trash folder.
+   *
+   * @param sourcePath      the path from which to copy the file
+   * @param destinationPath the path where the file will be copied
+   */
+  private static void copyToTrashFolder(String sourcePath, String destinationPath)
+      throws IOException {
+    DataOutputStream dataOutputStream = null;
+    DataInputStream dataInputStream = null;
+    try {
+      dataOutputStream = FileFactory.getDataOutputStream(destinationPath);
+      dataInputStream = FileFactory.getDataInputStream(sourcePath);
+      IOUtils.copyBytes(dataInputStream, dataOutputStream, CarbonCommonConstants.BYTEBUFFER_SIZE);
+    } catch (IOException exception) {
+      LOGGER.error("Unable to copy " + sourcePath + " to the trash folder", exception);
+      throw exception;
+    } finally {
+      CarbonUtil.closeStreams(dataInputStream, dataOutputStream);
+    }
+  }
+
+  /**
+   * The below method copies a complete file to the trash folder.
+   *
+   * @param filePathToCopy           the file which is to be moved to the trash folder
+   * @param trashFolderWithTimestamp timestamp, partition folder (if any) and segment number
+   */
+  public static void copyFileToTrashFolder(String filePathToCopy,
+      String trashFolderWithTimestamp) throws IOException {
+    CarbonFile carbonFileToCopy = FileFactory.getCarbonFile(filePathToCopy);
+    try {
+      if (carbonFileToCopy.exists()) {
+        if (!FileFactory.isFileExist(trashFolderWithTimestamp)) {
+          FileFactory.mkdirs(trashFolderWithTimestamp);
+        }
+        if (!FileFactory.isFileExist(trashFolderWithTimestamp + CarbonCommonConstants
+            .FILE_SEPARATOR + carbonFileToCopy.getName())) {
+          copyToTrashFolder(filePathToCopy, trashFolderWithTimestamp + CarbonCommonConstants
+              .FILE_SEPARATOR + carbonFileToCopy.getName());
+        }
+      }
+    } catch (IOException e) {
+      // in case there is any issue while copying the file to the trash folder, we need to delete
+      // the complete segment folder from the trash folder. The trashFolderWithTimestamp contains
+      // the segment folder too. Delete the folder as it is.
+      FileFactory.deleteFile(trashFolderWithTimestamp);
+      LOGGER.error("Error while creating trash folder or copying data to the trash folder", e);
+      throw e;
+    }
+  }
+
+  /**
+   * The below method copies the complete segment folder to the trash folder. Here, the data
+   * files in the segment are listed and copied one by one to the trash folder.
+   *
+   * @param segmentPath              the folder which is to be moved to the trash folder
+   * @param trashFolderWithTimestamp trash folder path with complete timestamp and segment number
+   */
+  public static void copySegmentToTrash(CarbonFile segmentPath,
+      String trashFolderWithTimestamp) throws IOException {
+    try {
+      List<CarbonFile> dataFiles =

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532396184



##
File path: core/src/main/java/org/apache/carbondata/core/util/TrashUtil.java
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.io.IOUtils;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the trash folder in carbondata. This class has methods to copy data to the trash
+ * and remove data from the trash.
+ */
+public final class TrashUtil {
+
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(TrashUtil.class.getName());
+
+  /**
+   * Base method to copy the data to the trash folder.
+   *
+   * @param sourcePath      the path from which to copy the file
+   * @param destinationPath the path where the file will be copied
+   */
+  private static void copyToTrashFolder(String sourcePath, String destinationPath)
+      throws IOException {
+    DataOutputStream dataOutputStream = null;
+    DataInputStream dataInputStream = null;
+    try {
+      dataOutputStream = FileFactory.getDataOutputStream(destinationPath);
+      dataInputStream = FileFactory.getDataInputStream(sourcePath);
+      IOUtils.copyBytes(dataInputStream, dataOutputStream, CarbonCommonConstants.BYTEBUFFER_SIZE);
+    } catch (IOException exception) {
+      LOGGER.error("Unable to copy " + sourcePath + " to the trash folder", exception);
+      throw exception;
+    } finally {
+      CarbonUtil.closeStreams(dataInputStream, dataOutputStream);
+    }
+  }
+
+  /**
+   * The below method copies a complete file to the trash folder.
+   *
+   * @param filePathToCopy           the file which is to be moved to the trash folder
+   * @param trashFolderWithTimestamp timestamp, partition folder (if any) and segment number
+   */
+  public static void copyFileToTrashFolder(String filePathToCopy,
+      String trashFolderWithTimestamp) throws IOException {
+    CarbonFile carbonFileToCopy = FileFactory.getCarbonFile(filePathToCopy);
+    try {
+      if (carbonFileToCopy.exists()) {
+        if (!FileFactory.isFileExist(trashFolderWithTimestamp)) {
+          FileFactory.mkdirs(trashFolderWithTimestamp);
+        }
+        if (!FileFactory.isFileExist(trashFolderWithTimestamp + CarbonCommonConstants
+            .FILE_SEPARATOR + carbonFileToCopy.getName())) {
+          copyToTrashFolder(filePathToCopy, trashFolderWithTimestamp + CarbonCommonConstants
+              .FILE_SEPARATOR + carbonFileToCopy.getName());
+        }
+      }
+    } catch (IOException e) {
+      // in case there is any issue while copying the file to the trash folder, we need to delete
+      // the complete segment folder from the trash folder. The trashFolderWithTimestamp contains
+      // the segment folder too. Delete the folder as it is.
+      FileFactory.deleteFile(trashFolderWithTimestamp);
+      LOGGER.error("Error while creating trash folder or copying data to the trash folder", e);
+      throw e;
+    }
+  }
+
+  /**
+   * The below method copies the complete segment folder to the trash folder. Here, the data
+   * files in the segment are listed and copied one by one to the trash folder.
+   *
+   * @param segmentPath              the folder which is to be moved to the trash folder
+   * @param trashFolderWithTimestamp trash folder path with complete timestamp and segment number
+   */
+  public static void copySegmentToTrash(CarbonFile segmentPath,
+      String trashFolderWithTimestamp) throws IOException {
+    try {
+      List<CarbonFile> dataFiles =

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532395502



##
File path: core/src/main/java/org/apache/carbondata/core/util/TrashUtil.java
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.io.IOUtils;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the trash folder in carbondata. This class has methods to copy data to the trash
+ * and remove data from the trash.
+ */
+public final class TrashUtil {
+
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(TrashUtil.class.getName());
+
+  /**
+   * Base method to copy the data to the trash folder.
+   *
+   * @param sourcePath      the path from which to copy the file
+   * @param destinationPath the path where the file will be copied
+   */
+  private static void copyToTrashFolder(String sourcePath, String destinationPath)
+      throws IOException {
+    DataOutputStream dataOutputStream = null;
+    DataInputStream dataInputStream = null;
+    try {
+      dataOutputStream = FileFactory.getDataOutputStream(destinationPath);
+      dataInputStream = FileFactory.getDataInputStream(sourcePath);
+      IOUtils.copyBytes(dataInputStream, dataOutputStream, CarbonCommonConstants.BYTEBUFFER_SIZE);
+    } catch (IOException exception) {
+      LOGGER.error("Unable to copy " + sourcePath + " to the trash folder", exception);
+      throw exception;
+    } finally {
+      CarbonUtil.closeStreams(dataInputStream, dataOutputStream);
+    }
+  }
+
+  /**
+   * The below method copies a complete file to the trash folder.
+   *
+   * @param filePathToCopy           the file which is to be moved to the trash folder
+   * @param trashFolderWithTimestamp timestamp, partition folder (if any) and segment number
+   */
+  public static void copyFileToTrashFolder(String filePathToCopy,
+      String trashFolderWithTimestamp) throws IOException {
+    CarbonFile carbonFileToCopy = FileFactory.getCarbonFile(filePathToCopy);
+    try {
+      if (carbonFileToCopy.exists()) {
+        if (!FileFactory.isFileExist(trashFolderWithTimestamp)) {
+          FileFactory.mkdirs(trashFolderWithTimestamp);
+        }
+        if (!FileFactory.isFileExist(trashFolderWithTimestamp + CarbonCommonConstants
+            .FILE_SEPARATOR + carbonFileToCopy.getName())) {
+          copyToTrashFolder(filePathToCopy, trashFolderWithTimestamp + CarbonCommonConstants
+              .FILE_SEPARATOR + carbonFileToCopy.getName());
+        }

Review comment:
   done
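
   For reference, the stream-copy logic quoted above can also be written with try-with-resources, which closes both streams automatically whether the copy succeeds or throws. This is a minimal sketch using plain java.nio in place of CarbonData's FileFactory and Hadoop's IOUtils; the class name and buffer size are illustrative assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public final class TrashCopy {
  private static final int BUFFER_SIZE = 8192; // assumed buffer size

  // Copy one file into the trash location, creating parent folders as needed.
  public static void copyToTrash(Path source, Path destination) throws IOException {
    Files.createDirectories(destination.getParent());
    // try-with-resources closes both streams even if the copy throws
    try (InputStream in = Files.newInputStream(source);
         OutputStream out = Files.newOutputStream(destination)) {
      byte[] buffer = new byte[BUFFER_SIZE];
      int read;
      while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
      }
    }
  }
}
```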





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] nihal0107 commented on a change in pull request #4000: [CARBONDATA-4020] Fixed drop index when multiple index exists

2020-11-29 Thread GitBox


nihal0107 commented on a change in pull request #4000:
URL: https://github.com/apache/carbondata/pull/4000#discussion_r532377244



##
File path: integration/spark/src/test/scala/org/apache/carbondata/index/bloom/BloomCoarseGrainIndexFunctionSuite.scala
##
@@ -660,6 +660,20 @@ class BloomCoarseGrainIndexFunctionSuite
   sql(s"SELECT * FROM $normalTable WHERE salary='1040'"))
   }
 
+  test("test drop index when more than one bloom index exists") {
+    sql(s"CREATE TABLE $bloomSampleTable " +
+      "(id int,name string,salary int) STORED as carbondata TBLPROPERTIES('SORT_COLUMNS'='id')")
+    sql(s"CREATE index index1 ON TABLE $bloomSampleTable(id) as 'bloomfilter' " +
+      "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+    sql(s"CREATE index index2 ON TABLE $bloomSampleTable (name) as 'bloomfilter' " +
+      "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+    sql(s"insert into $bloomSampleTable values(1,'nihal',20)")
+    sql(s"SHOW INDEXES ON TABLE $bloomSampleTable").collect()
+    checkExistence(sql(s"SHOW INDEXES ON TABLE $bloomSampleTable"), true, "index1", "index2")
+    sql(s"drop index index1 on $bloomSampleTable")
+    checkExistence(sql(s"SHOW INDEXES ON TABLE $bloomSampleTable"), true, "index2")

Review comment:
   done

##
File path: integration/spark/src/test/scala/org/apache/carbondata/index/bloom/BloomCoarseGrainIndexFunctionSuite.scala
##
@@ -660,6 +660,20 @@ class BloomCoarseGrainIndexFunctionSuite
   sql(s"SELECT * FROM $normalTable WHERE salary='1040'"))
   }
 
+  test("test drop index when more than one bloom index exists") {
+    sql(s"CREATE TABLE $bloomSampleTable " +
+      "(id int,name string,salary int) STORED as carbondata TBLPROPERTIES('SORT_COLUMNS'='id')")
+    sql(s"CREATE index index1 ON TABLE $bloomSampleTable(id) as 'bloomfilter' " +
+      "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+    sql(s"CREATE index index2 ON TABLE $bloomSampleTable (name) as 'bloomfilter' " +
+      "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+    sql(s"insert into $bloomSampleTable values(1,'nihal',20)")
+    sql(s"SHOW INDEXES ON TABLE $bloomSampleTable").collect()

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532370448



##
File path: core/src/main/java/org/apache/carbondata/core/util/TrashUtil.java
##
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.DataInputStream;
+import java.io.DataOutputStream;
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.io.IOUtils;
+import org.apache.log4j.Logger;
+
+/**
+ * Mantains the trash folder in carbondata. This class has methods to copy 
data to the trash and
+ * remove data from the trash.
+ */
+public final class TrashUtil {
+
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(TrashUtil.class.getName());
+
+  /**
+   * Base method to copy the data to the trash folder.
+   *
+   * @param sourcePath the path from which to copy the file
+   * @param destinationPath  the path where the file will be copied
+   * @return
+   */
+  private static void copyToTrashFolder(String sourcePath, String 
destinationPath)
+  throws IOException {
+DataOutputStream dataOutputStream = null;
+DataInputStream dataInputStream = null;
+try {
+  dataOutputStream = FileFactory.getDataOutputStream(destinationPath);
+  dataInputStream = FileFactory.getDataInputStream(sourcePath);
+  IOUtils.copyBytes(dataInputStream, dataOutputStream, 
CarbonCommonConstants.BYTEBUFFER_SIZE);
+} catch (IOException exception) {
+  LOGGER.error("Unable to copy " + sourcePath + " to the trash folder", 
exception);
+  throw exception;
+} finally {
+  CarbonUtil.closeStreams(dataInputStream, dataOutputStream);
+}
+  }
+
+  /**
+   * The below method copies the complete a file to the trash folder.
+   *
+   * @param filePathToCopy the files which are to be moved to the trash folder
+   * @param trashFolderWithTimestamptimestamp, partition folder(if any) 
and segment number
+   * @return
+   */
+  public static void copyFileToTrashFolder(String filePathToCopy,
+  String trashFolderWithTimestamp) throws IOException {
+CarbonFile carbonFileToCopy = FileFactory.getCarbonFile(filePathToCopy);
+try {
+  if (carbonFileToCopy.exists()) {
+if (!FileFactory.isFileExist(trashFolderWithTimestamp)) {

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #4000: [CARBONDATA-4020] Fixed drop index when multiple index exists

2020-11-29 Thread GitBox


VenuReddy2103 commented on a change in pull request #4000:
URL: https://github.com/apache/carbondata/pull/4000#discussion_r532368311



##
File path: integration/spark/src/test/scala/org/apache/carbondata/index/bloom/BloomCoarseGrainIndexFunctionSuite.scala
##
@@ -660,6 +660,20 @@ class BloomCoarseGrainIndexFunctionSuite
   sql(s"SELECT * FROM $normalTable WHERE salary='1040'"))
   }
 
+  test("test drop index when more than one bloom index exists") {
+    sql(s"CREATE TABLE $bloomSampleTable " +
+      "(id int,name string,salary int) STORED as carbondata TBLPROPERTIES('SORT_COLUMNS'='id')")
+    sql(s"CREATE index index1 ON TABLE $bloomSampleTable(id) as 'bloomfilter' " +
+      "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+    sql(s"CREATE index index2 ON TABLE $bloomSampleTable (name) as 'bloomfilter' " +
+      "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+    sql(s"insert into $bloomSampleTable values(1,'nihal',20)")
+    sql(s"SHOW INDEXES ON TABLE $bloomSampleTable").collect()

Review comment:
   This line can be removed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4030: [WIP][CARBONDATA-4064] Fix tpcds query failure with SI

2020-11-29 Thread GitBox


ajantha-bhat commented on a change in pull request #4030:
URL: https://github.com/apache/carbondata/pull/4030#discussion_r532368011



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/optimizer/CarbonSecondaryIndexOptimizer.scala
##
@@ -943,7 +943,11 @@ class CarbonSecondaryIndexOptimizer(sparkSession: SparkSession) {
     val filterAttributes = filter.condition collect {
       case attr: AttributeReference => attr.name.toLowerCase
     }
-    val parentTableRelation = MatchIndexableRelation.unapply(filter.child).get
+    val parentRelation = MatchIndexableRelation.unapply(filter.child)
+    if (parentRelation.isEmpty) {
+      return false
+    }
+    val parentTableRelation = parentRelation.get

Review comment:
   OK, keep the PR in WIP to avoid comments on an in-progress PR.









[GitHub] [carbondata] ajantha-bhat commented on pull request #4027: [WIP]added compression and range column based FT for SI

2020-11-29 Thread GitBox


ajantha-bhat commented on pull request #4027:
URL: https://github.com/apache/carbondata/pull/4027#issuecomment-735573280


   @nihal0107: I think you can add all the SI-related testcases in your existing #4023 itself.







[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4030: [CARBONDATA-4064] Fix tpcds query failure with SI

2020-11-29 Thread GitBox


Indhumathi27 commented on a change in pull request #4030:
URL: https://github.com/apache/carbondata/pull/4030#discussion_r532367206



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/optimizer/CarbonSecondaryIndexOptimizer.scala
##
@@ -943,7 +943,11 @@ class CarbonSecondaryIndexOptimizer(sparkSession: SparkSession) {
     val filterAttributes = filter.condition collect {
      case attr: AttributeReference => attr.name.toLowerCase
     }
-    val parentTableRelation = MatchIndexableRelation.unapply(filter.child).get
+    val parentRelation = MatchIndexableRelation.unapply(filter.child)
+    if (parentRelation.isEmpty) {
+      return false
+    }
+    val parentTableRelation = parentRelation.get

Review comment:
   yes.. in progress









[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4000: [CARBONDATA-4020] Fixed drop index when multiple index exists

2020-11-29 Thread GitBox


Indhumathi27 commented on a change in pull request #4000:
URL: https://github.com/apache/carbondata/pull/4000#discussion_r532367120



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/index/bloom/BloomCoarseGrainIndexFunctionSuite.scala
##
@@ -660,6 +660,20 @@ class BloomCoarseGrainIndexFunctionSuite
   sql(s"SELECT * FROM $normalTable WHERE salary='1040'"))
   }
 
+  test("test drop index when more than one bloom index exists") {
+    sql(s"CREATE TABLE $bloomSampleTable " +
+      "(id int,name string,salary int)STORED as carbondata TBLPROPERTIES('SORT_COLUMNS'='id')")
+    sql(s"CREATE index index1 ON TABLE $bloomSampleTable(id) as 'bloomfilter' " +
+      "PROPERTIES ( 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+    sql(s"CREATE index index2 ON TABLE $bloomSampleTable (name) as 'bloomfilter' " +
+      "PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true')")
+    sql(s"insert into $bloomSampleTable values(1,'nihal',20)")
+    sql(s"SHOW INDEXES ON TABLE $bloomSampleTable").collect()
+    checkExistence(sql(s"SHOW INDEXES ON TABLE $bloomSampleTable"), true, "index1", "index2")
+    sql(s"drop index index1 on $bloomSampleTable")
+    checkExistence(sql(s"SHOW INDEXES ON TABLE $bloomSampleTable"), true, "index2")

Review comment:
   Add a case for dropping all CG/FG indexes also.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4030: [CARBONDATA-4064] Fix tpcds query failure with SI

2020-11-29 Thread GitBox


ajantha-bhat commented on a change in pull request #4030:
URL: https://github.com/apache/carbondata/pull/4030#discussion_r532366408



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/optimizer/CarbonSecondaryIndexOptimizer.scala
##
@@ -943,7 +943,11 @@ class CarbonSecondaryIndexOptimizer(sparkSession: SparkSession) {
     val filterAttributes = filter.condition collect {
       case attr: AttributeReference => attr.name.toLowerCase
     }
-    val parentTableRelation = MatchIndexableRelation.unapply(filter.child).get
+    val parentRelation = MatchIndexableRelation.unapply(filter.child)
+    if (parentRelation.isEmpty) {
+      return false
+    }
+    val parentTableRelation = parentRelation.get

Review comment:
   Can you please add a small testcase for this, and update the PR description to explain when the parent relation can be empty?









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction

2020-11-29 Thread GitBox


ajantha-bhat commented on a change in pull request #4020:
URL: https://github.com/apache/carbondata/pull/4020#discussion_r532364698



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/datacompaction/MajorCompactionIgnoreInMinorTest.scala
##
@@ -186,6 +187,206 @@ class MajorCompactionIgnoreInMinorTest extends QueryTest with BeforeAndAfterAll
 
   }
 
+  def generateData(numOrders: Int = 10): DataFrame = {
+    import sqlContext.implicits._
+    sqlContext.sparkContext.parallelize(1 to numOrders, 4)
+      .map { x => ("country" + x, x, "07/23/2015", "name" + x, "phonetype" + x % 10,
+        "serialname" + x, x + 1)
+      }.toDF("country", "ID", "date", "name", "phonetype", "serialname", "salary")
+  }
+
+  test("test skip segment whose data size exceed threshold in minor compaction " +
+    "in system level control") {
+    CarbonProperties.getInstance().addProperty("carbon.compaction.level.threshold", "2,2")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "mm/dd/")
+    // set threshold to 1MB in this test case
+    CarbonProperties.getInstance().addProperty("carbon.minor.compaction.size", "1")
+
+    sql("drop table if exists  minor_threshold")
+    sql("drop table if exists  tmp")
+
+    sql(
+      "CREATE TABLE IF NOT EXISTS minor_threshold (country String, ID Int, date Timestamp," +
+        " name String, phonetype String, serialname String, salary Int) STORED AS carbondata"
+    )
+    sql(
+      "CREATE TABLE IF NOT EXISTS tmp (country String, ID Int, date Timestamp," +
+        " name String, phonetype String, serialname String, salary Int) STORED AS carbondata"
+    )
+
+    val initframe = generateData(10)
+    initframe.write
+      .format("carbondata")
+      .option("tablename", "tmp")
+      .mode(SaveMode.Overwrite)
+      .save()
+    // load 3 segments
+    sql("LOAD DATA LOCAL INPATH '" + csvFilePath1 + "' INTO TABLE minor_threshold OPTIONS" +
+      "('DELIMITER'= ',', 'QUOTECHAR'= '\"')"
+    )
+    sql("LOAD DATA LOCAL INPATH '" + csvFilePath2 + "' INTO TABLE minor_threshold  OPTIONS" +
+      "('DELIMITER'= ',', 'QUOTECHAR'= '\"')"
+    )
+    sql("LOAD DATA LOCAL INPATH '" + csvFilePath1 + "' INTO TABLE minor_threshold OPTIONS" +
+      "('DELIMITER'= ',', 'QUOTECHAR'= '\"')"
+    )
+
+    // insert a new segment(id is 3) data size exceed 1 MB
+    sql("insert into minor_threshold select * from tmp")
+
+    // load another 3 segments
+    sql("LOAD DATA LOCAL INPATH '" + csvFilePath1 + "' INTO TABLE minor_threshold OPTIONS" +
+      "('DELIMITER'= ',', 'QUOTECHAR'= '\"')"
+    )
+    sql("LOAD DATA LOCAL INPATH '" + csvFilePath2 + "' INTO TABLE minor_threshold  OPTIONS" +
+      "('DELIMITER'= ',', 'QUOTECHAR'= '\"')"
+    )
+    sql("LOAD DATA LOCAL INPATH '" + csvFilePath1 + "' INTO TABLE minor_threshold OPTIONS" +
+      "('DELIMITER'= ',', 'QUOTECHAR'= '\"')"
+    )
+
+    sql("show segments for table minor_threshold").show(100, false)
+    // do minor compaction
+    sql("alter table minor_threshold compact 'minor'")
+    // check segment 3 whose size exceed the limit should not be compacted
+    val carbonTable = CarbonMetadata.getInstance().getCarbonTable(
+      CarbonCommonConstants.DATABASE_DEFAULT_NAME, "minor_threshold")
+    val carbonTablePath = carbonTable.getMetadataPath
+    val segments = SegmentStatusManager.readLoadMetadata(carbonTablePath);
+    assertResult(SegmentStatus.SUCCESS)(segments(3).getSegmentStatus)
+    assertResult(100030)(sql("select count(*) from minor_threshold").collect().head.get(0))
+    // reset to 0
+    CarbonProperties.getInstance().addProperty("carbon.minor.compaction.size", "0")
+  }

Review comment:
   Support dynamically changing the table property as well, via the ALTER TABLE SET/UNSET TBLPROPERTIES command. With that, in the same testcase you can test the table level by loading some more data. There is no need to create new tables again to test it.

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532364299



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.IOException;
+import java.util.*;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.SegmentFileStore;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
+import org.apache.carbondata.core.statusmanager.SegmentStatus;
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the clean files command in carbondata. This class has methods for clean files
+ * operation.
+ */
+public class CleanFilesUtil {
+
+  private static final Logger LOGGER =
+      LogServiceFactory.getLogService(CleanFilesUtil.class.getName());
+
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+        SegmentFileStore fileStore = new SegmentFileStore(carbonTable.getTablePath(),
+            staleSegment);
+        Map<String, SegmentFileStore.FolderDetails> locationMap = fileStore.getSegmentFile()
+            .getLocationMap();
+        if (locationMap != null) {
+          CarbonFile segmentLocation = FileFactory.getCarbonFile(carbonTable.getTablePath() +
+              CarbonCommonConstants.FILE_SEPARATOR + fileStore.getSegmentFile().getLocationMap()
+              .entrySet().iterator().next().getKey());
+          // copy the complete segment to the trash folder
+          TrashUtil.copySegmentToTrash(segmentLocation, CarbonTablePath.getTrashFolderPath(
+              carbonTable.getTablePath()) + CarbonCommonConstants.FILE_SEPARATOR +
+              timeStampForTrashFolder + CarbonCommonConstants.FILE_SEPARATOR + CarbonTablePath
+              .SEGMENT_PREFIX + segmentNumber);
+          // Deleting the stale Segment folders.
+          try {
+            CarbonUtil.deleteFoldersAndFiles(segmentLocation);
+          } catch (IOException | InterruptedException e) {
+            LOGGER.error("Unable to delete the segment: " + segmentNumber + " from after moving" +
+                " it to the trash folder. Please delete them manually : " + e.getMessage(), e);
+          }
+          // delete the segment file as well
+          FileFactory.deleteFile(CarbonTablePath.getSegmentFilePath(carbonTable.getTablePath(),
+              staleSegment));
+        }
+      }
+      staleSegments.clear();
+    }
+  }
+
+  /**
+   * This method will clean all the stale segments for partition table, delete the source folders
+   * after copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegmentsForPartitionTable(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532364095



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.IOException;
+import java.util.*;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.SegmentFileStore;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
+import org.apache.carbondata.core.statusmanager.SegmentStatus;
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the clean files command in carbondata. This class has methods for clean files
+ * operation.
+ */
+public class CleanFilesUtil {
+
+  private static final Logger LOGGER =
+      LogServiceFactory.getLogService(CleanFilesUtil.class.getName());
+
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+        SegmentFileStore fileStore = new SegmentFileStore(carbonTable.getTablePath(),
+            staleSegment);
+        Map<String, SegmentFileStore.FolderDetails> locationMap = fileStore.getSegmentFile()
+            .getLocationMap();
+        if (locationMap != null) {
+          CarbonFile segmentLocation = FileFactory.getCarbonFile(carbonTable.getTablePath() +
+              CarbonCommonConstants.FILE_SEPARATOR + fileStore.getSegmentFile().getLocationMap()
+              .entrySet().iterator().next().getKey());
+          // copy the complete segment to the trash folder
+          TrashUtil.copySegmentToTrash(segmentLocation, CarbonTablePath.getTrashFolderPath(
+              carbonTable.getTablePath()) + CarbonCommonConstants.FILE_SEPARATOR +
+              timeStampForTrashFolder + CarbonCommonConstants.FILE_SEPARATOR + CarbonTablePath
+              .SEGMENT_PREFIX + segmentNumber);
+          // Deleting the stale Segment folders.
+          try {
+            CarbonUtil.deleteFoldersAndFiles(segmentLocation);
+          } catch (IOException | InterruptedException e) {
+            LOGGER.error("Unable to delete the segment: " + segmentNumber + " from after moving" +
+                " it to the trash folder. Please delete them manually : " + e.getMessage(), e);
+          }
+          // delete the segment file as well
+          FileFactory.deleteFile(CarbonTablePath.getSegmentFilePath(carbonTable.getTablePath(),
+              staleSegment));
+        }
+      }
+      staleSegments.clear();
+    }
+  }
+
+  /**
+   * This method will clean all the stale segments for partition table, delete the source folders
+   * after copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegmentsForPartitionTable(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];

[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4000: [CARBONDATA-4020] Fixed drop index when multiple index exists

2020-11-29 Thread GitBox


Indhumathi27 commented on a change in pull request #4000:
URL: https://github.com/apache/carbondata/pull/4000#discussion_r532363660



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/index/DropIndexCommand.scala
##
@@ -183,17 +183,20 @@ private[sql] case class DropIndexCommand(
       parentCarbonTable)
     parentCarbonTable = getRefreshedParentTable(sparkSession, dbName)
     val indexMetadata = parentCarbonTable.getIndexMetadata
+    var hasCgFgIndexes = false

Review comment:
   I think you can keep the old code, as hasCgFgIndexes is not used in multiple places; just change the hasCgFgIndexes logic.









[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532363133



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.IOException;
+import java.util.*;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.SegmentFileStore;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
+import org.apache.carbondata.core.statusmanager.SegmentStatus;
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the clean files command in carbondata. This class has methods for clean files
+ * operation.
+ */
+public class CleanFilesUtil {
+
+  private static final Logger LOGGER =
+      LogServiceFactory.getLogService(CleanFilesUtil.class.getName());
+
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+        SegmentFileStore fileStore = new SegmentFileStore(carbonTable.getTablePath(),
+            staleSegment);
+        Map<String, SegmentFileStore.FolderDetails> locationMap = fileStore.getSegmentFile()
+            .getLocationMap();
+        if (locationMap != null) {
+          CarbonFile segmentLocation = FileFactory.getCarbonFile(carbonTable.getTablePath() +
+              CarbonCommonConstants.FILE_SEPARATOR + fileStore.getSegmentFile().getLocationMap()
+              .entrySet().iterator().next().getKey());
+          // copy the complete segment to the trash folder
+          TrashUtil.copySegmentToTrash(segmentLocation, CarbonTablePath.getTrashFolderPath(
+              carbonTable.getTablePath()) + CarbonCommonConstants.FILE_SEPARATOR +
+              timeStampForTrashFolder + CarbonCommonConstants.FILE_SEPARATOR + CarbonTablePath
+              .SEGMENT_PREFIX + segmentNumber);
+          // Deleting the stale Segment folders.
+          try {
+            CarbonUtil.deleteFoldersAndFiles(segmentLocation);
+          } catch (IOException | InterruptedException e) {
+            LOGGER.error("Unable to delete the segment: " + segmentNumber + " from after moving" +
+                " it to the trash folder. Please delete them manually : " + e.getMessage(), e);

Review comment:
   done
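For readers following the trash logic above: the destination the quoted code assembles amounts to `<tablePath>/<trashDir>/<timestamp>/Segment_<segmentNumber>`. A minimal sketch of that path assembly follows; the `.Trash` directory name here is an assumption (the real value comes from CarbonTablePath/CarbonCommonConstants), and this is illustrative, not CarbonData's actual implementation.

```java
public class TrashPathSketch {

    // Assumed trash directory name; the real constant lives in CarbonTablePath.
    static final String TRASH_DIR = ".Trash";
    static final String SEGMENT_PREFIX = "Segment_";
    static final char SEP = '/';

    // Build the trash destination for a stale segment, mirroring the
    // string concatenation in the quoted cleanStaleSegments method.
    static String trashSegmentPath(String tablePath, long timestamp, String segmentNumber) {
        return tablePath + SEP + TRASH_DIR + SEP + timestamp + SEP + SEGMENT_PREFIX + segmentNumber;
    }

    public static void main(String[] args) {
        // prints /warehouse/db/t1/.Trash/1606600000000/Segment_2
        System.out.println(trashSegmentPath("/warehouse/db/t1", 1606600000000L, "2"));
    }
}
```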


[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction

2020-11-29 Thread GitBox


ajantha-bhat commented on a change in pull request #4020:
URL: https://github.com/apache/carbondata/pull/4020#discussion_r532363057



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
##
@@ -191,6 +191,10 @@ private[sql] case class CarbonDescribeFormattedCommand(
       CarbonProperties.getInstance()
         .getProperty(CarbonCommonConstants.CARBON_MAJOR_COMPACTION_SIZE,
           CarbonCommonConstants.DEFAULT_CARBON_MAJOR_COMPACTION_SIZE)), ""),
+      (CarbonCommonConstants.TABLE_MINOR_COMPACTION_SIZE.toUpperCase,
+        tblProps.getOrElse(CarbonCommonConstants.TABLE_MINOR_COMPACTION_SIZE,
+          CarbonProperties.getInstance()
+            .getProperty(CarbonCommonConstants.CARBON_MINOR_COMPACTION_SIZE, "0")), ""),

Review comment:
   OK, let's keep the minimum configurable value at 1 MB, and if the user has not configured it, we can keep -1, because 0 would mean, as per the property, skipping all segments that are greater than 0 MB.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction

2020-11-29 Thread GitBox


ajantha-bhat commented on a change in pull request #4020:
URL: https://github.com/apache/carbondata/pull/4020#discussion_r532362600



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##
@@ -292,6 +293,33 @@ object CommonUtil {
 }
   }
 
+  /**
+   * This method will validate the minor compaction size specified by the user
+   * the property is used while doing minor compaction
+   *
+   * @param tableProperties
+   */
+  def validateMinorCompactionSize(tableProperties: Map[String, String]): Unit = {
+    var minorCompactionSize: Integer = 0
+    val tblPropName = CarbonCommonConstants.TABLE_MINOR_COMPACTION_SIZE
+    if (tableProperties.get(tblPropName).isDefined) {
+      val minorCompactionSizeStr: String =
+        parsePropertyValueStringInMB(tableProperties(tblPropName))
+      try {
+        minorCompactionSize = Integer.parseInt(minorCompactionSizeStr)
+      } catch {
+        case e: NumberFormatException =>
+          throw new MalformedCarbonCommandException(s"Invalid $tblPropName value found: " +
+            s"$minorCompactionSizeStr, only int value greater than 0 is supported.")
+      }
+      if (minorCompactionSize < 0) {

Review comment:
   if zero is not supported, then <= 0









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction

2020-11-29 Thread GitBox


ajantha-bhat commented on a change in pull request #4020:
URL: https://github.com/apache/carbondata/pull/4020#discussion_r532362493



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
##
@@ -292,6 +293,33 @@ object CommonUtil {
 }
   }
 
+  /**
+   * This method will validate the minor compaction size specified by the user
+   * the property is used while doing minor compaction
+   *
+   * @param tableProperties
+   */
+  def validateMinorCompactionSize(tableProperties: Map[String, String]): Unit = {
+    var minorCompactionSize: Integer = 0
+    val tblPropName = CarbonCommonConstants.TABLE_MINOR_COMPACTION_SIZE
+    if (tableProperties.get(tblPropName).isDefined) {
+      val minorCompactionSizeStr: String =
+        parsePropertyValueStringInMB(tableProperties(tblPropName))
+      try {
+        minorCompactionSize = Integer.parseInt(minorCompactionSizeStr)
+      } catch {
+        case e: NumberFormatException =>
+          throw new MalformedCarbonCommandException(s"Invalid $tblPropName value found: " +
+            s"$minorCompactionSizeStr, only int value greater than 0 is supported.")

Review comment:
   0 is also supported, right?









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction

2020-11-29 Thread GitBox


ajantha-bhat commented on a change in pull request #4020:
URL: https://github.com/apache/carbondata/pull/4020#discussion_r532362299



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
##
@@ -998,6 +998,23 @@ public long getMajorCompactionSize() {
 return compactionSize;
   }
 
+  /**
+   * returns minor compaction size value from carbon properties or 0 if it is not valid or
+   * not configured
+   *
+   * @return compactionSize
+   */
+  public long getMinorCompactionSize() {
+    long compactionSize = 0;
+    try {
+      compactionSize = Long.parseLong(getProperty(
Review comment:
   Please handle the case where the user sets a negative value. The user can set anything from 0 to LONG_MAX, I think.
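A defensive parse along the lines the reviewer suggests might look like the following sketch. This is a hypothetical helper, not CarbonData's actual CarbonProperties code; the class name and the choice of 0 as the "not configured / no size limit" default are assumptions.

```java
public class CompactionSizeParser {

    // Assumed default: 0 is treated as "not configured / no size limit".
    static final long DEFAULT_MINOR_COMPACTION_SIZE = 0L;

    // Parse a size-in-MB property value; negative or malformed input
    // falls back to the default, as the reviewer suggests.
    static long parseMinorCompactionSize(String raw) {
        if (raw == null) {
            return DEFAULT_MINOR_COMPACTION_SIZE;
        }
        try {
            long size = Long.parseLong(raw.trim());
            // Valid range is 0 .. Long.MAX_VALUE; reject negatives.
            return size >= 0 ? size : DEFAULT_MINOR_COMPACTION_SIZE;
        } catch (NumberFormatException e) {
            return DEFAULT_MINOR_COMPACTION_SIZE;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseMinorCompactionSize("1024")); // prints 1024
        System.out.println(parseMinorCompactionSize("-5"));   // prints 0
        System.out.println(parseMinorCompactionSize("abc"));  // prints 0
    }
}
```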









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4020: [CARBONDATA-4054] Support data size control for minor compaction

2020-11-29 Thread GitBox


ajantha-bhat commented on a change in pull request #4020:
URL: https://github.com/apache/carbondata/pull/4020#discussion_r532361537



##
File path: docs/dml-of-carbondata.md
##
@@ -529,6 +529,10 @@ CarbonData DML statements are documented here,which includes:
   * Level 1: Merging of the segments which are not yet compacted.
   * Level 2: Merging of the compacted segments again to form a larger segment.
 
+  The segment whose data size exceed limit of carbon.minor.compaction.size will not be included in
+  minor compaction. If user want to control the size of segment included in minor compaction,
+  configure the property with appropriate value in MB, if not configure, will merge segments only
+  based on num of segments.

Review comment:
   ```suggestion
 based on number of segments.
   ```









[jira] [Created] (CARBONDATA-4064) TPCDS queries are failing with None.get exception when table has SI configured

2020-11-29 Thread Indhumathi Muthu Murugesh (Jira)
Indhumathi Muthu Murugesh created CARBONDATA-4064:
-

 Summary: TPCDS queries are failing with None.get exception when table has SI configured
 Key: CARBONDATA-4064
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4064
 Project: CarbonData
  Issue Type: Bug
Reporter: Indhumathi Muthu Murugesh






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532360958



##
File path: core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.IOException;
+import java.util.*;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.metadata.SegmentFileStore;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
+import org.apache.carbondata.core.statusmanager.LoadMetadataDetails;
+import org.apache.carbondata.core.statusmanager.SegmentStatus;
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.Logger;
+
+/**
+ * Maintains the clean files command in carbondata. This class has methods for clean files
+ * operation.
+ */
+public class CleanFilesUtil {
+
+  private static final Logger LOGGER =
+      LogServiceFactory.getLogService(CleanFilesUtil.class.getName());
+
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+        SegmentFileStore fileStore = new SegmentFileStore(carbonTable.getTablePath(),
+            staleSegment);
+        Map<String, SegmentFileStore.FolderDetails> locationMap = fileStore.getSegmentFile()
+            .getLocationMap();
+        if (locationMap != null) {
+          CarbonFile segmentLocation = FileFactory.getCarbonFile(carbonTable.getTablePath() +
+              CarbonCommonConstants.FILE_SEPARATOR + fileStore.getSegmentFile().getLocationMap()
+              .entrySet().iterator().next().getKey());
+          // copy the complete segment to the trash folder
+          TrashUtil.copySegmentToTrash(segmentLocation, CarbonTablePath.getTrashFolderPath(
+              carbonTable.getTablePath()) + CarbonCommonConstants.FILE_SEPARATOR +
+              timeStampForTrashFolder + CarbonCommonConstants.FILE_SEPARATOR + CarbonTablePath
+              .SEGMENT_PREFIX + segmentNumber);
+          // Deleting the stale segment folders.
+          try {
+            CarbonUtil.deleteFoldersAndFiles(segmentLocation);
+          } catch (IOException | InterruptedException e) {
+            LOGGER.error("Unable to delete the segment: " + segmentNumber + " after moving" +
+                " it to the trash folder. Please delete it manually: " + e.getMessage(), e);
+          }
+          // delete the segment file as well
+          FileFactory.deleteFile(CarbonTablePath.getSegmentFilePath(carbonTable.getTablePath(),
+              staleSegment));
+        }
+      }
+      staleSegments.clear();
+    }
+  }
+
+  /**
+   * This method will clean all the stale segments for a partition table, delete the source
+   * folders after copying the data to the trash and also remove the .segment files of the
+   * stale segments
+   */
+  public static void cleanStaleSegmentsForPartitionTable(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+

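The destination that cleanStaleSegments passes to TrashUtil.copySegmentToTrash in the hunk above is plain string concatenation: the table's trash folder path, a timestamp, and a Segment_&lt;n&gt; suffix. A standalone sketch of just that path layout (the class and method names are illustrative, and the ".Trash" folder name in the usage example is an assumption, not taken from the quoted code):

```java
// Illustrative sketch of the trash-destination path assembled inline in
// cleanStaleSegments: <trashFolderPath>/<timestamp>/Segment_<segmentNumber>.
// Class and method names are made up for this sketch; only the path layout
// mirrors the quoted code.
public class TrashPathSketch {

  static final String FILE_SEPARATOR = "/";
  static final String SEGMENT_PREFIX = "Segment_";

  static String trashSegmentPath(String trashFolderPath, long timestamp, String segmentNumber) {
    return trashFolderPath + FILE_SEPARATOR + timestamp
        + FILE_SEPARATOR + SEGMENT_PREFIX + segmentNumber;
  }

  public static void main(String[] args) {
    // The ".Trash" folder name below is an assumed example value.
    System.out.println(trashSegmentPath("/warehouse/t1/.Trash", 1606640000000L, "2"));
  }
}
```

Grouping trashed segments under a per-operation timestamp directory lets a later cleanup pass expire whole trash generations at once.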
[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532359426



##
File path: core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/**
+ * Maintains the clean files command in carbondata. This class has methods for clean files
+ * operation.
+ */
+public class CleanFilesUtil {
+
+  private static final Logger LOGGER =
+      LogServiceFactory.getLogService(CleanFilesUtil.class.getName());
+
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {
+      for (String staleSegment : staleSegments) {
+        String segmentNumber = staleSegment.split(CarbonCommonConstants.UNDERSCORE)[0];
+        SegmentFileStore fileStore = new SegmentFileStore(carbonTable.getTablePath(),
+            staleSegment);
+        Map<String, SegmentFileStore.FolderDetails> locationMap = fileStore.getSegmentFile()
+            .getLocationMap();
+        if (locationMap != null) {
+          CarbonFile segmentLocation = FileFactory.getCarbonFile(carbonTable.getTablePath() +
+              CarbonCommonConstants.FILE_SEPARATOR + fileStore.getSegmentFile().getLocationMap()
+              .entrySet().iterator().next().getKey());
+          // copy the complete segment to the trash folder
+          TrashUtil.copySegmentToTrash(segmentLocation, CarbonTablePath.getTrashFolderPath(
Review comment:
   done









[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532357875



##
File path: core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/**
+ * Maintains the clean files command in carbondata. This class has methods for clean files
+ * operation.
+ */
+public class CleanFilesUtil {
+
+  private static final Logger LOGGER =
+      LogServiceFactory.getLogService(CleanFilesUtil.class.getName());
+
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);

Review comment:
   done


[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532357801



##
File path: core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+          // Deleting the stale segment folders.
+          try {
+            CarbonUtil.deleteFoldersAndFiles(segmentLocation);
+          } catch (IOException | InterruptedException e) {
+            LOGGER.error("Unable to delete the segment: " + segmentNumber + " after moving" +
+                " it to the trash folder. Please delete it manually: " + e.getMessage(), e);
+          }
+          // delete the segment file as well
+          FileFactory.deleteFile(CarbonTablePath.getSegmentFilePath(carbonTable.getTablePath(),
+              staleSegment));
+        }
+      }
+      staleSegments.clear();

Review comment:
   yes, can be removed


[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #4005: [CARBONDATA-3978] Trash Folder support in carbondata

2020-11-29 Thread GitBox


vikramahuja1001 commented on a change in pull request #4005:
URL: https://github.com/apache/carbondata/pull/4005#discussion_r532357687



##
File path: core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+/**
+ * Mantains the clean files command in carbondata. This class has methods for clean files
+ * operation.

Review comment:
   done

##
File path: core/src/main/java/org/apache/carbondata/core/util/CleanFilesUtil.java
##
@@ -0,0 +1,180 @@
+  /**
+   * This method will clean all the stale segments for a table, delete the source folder after
+   * copying the data to the trash and also remove the .segment files of the stale segments
+   */
+  public static void cleanStaleSegments(CarbonTable carbonTable)
+      throws IOException {
+    long timeStampForTrashFolder = CarbonUpdateUtil.readCurrentTime();
+    List<String> staleSegments = getStaleSegments(carbonTable);
+    if (staleSegments.size() > 0) {

Review comment:
   done






[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4013: [CARBONDATA-4062] Make clean files become data trash manager

2020-11-29 Thread GitBox


CarbonDataQA2 commented on pull request #4013:
URL: https://github.com/apache/carbondata/pull/4013#issuecomment-735523234


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/3213/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4013: [CARBONDATA-4062] Make clean files become data trash manager

2020-11-29 Thread GitBox


CarbonDataQA2 commented on pull request #4013:
URL: https://github.com/apache/carbondata/pull/4013#issuecomment-735521831


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4968/
   







[jira] [Created] (CARBONDATA-4063) Refactor getBlockId and getShortBlockId function

2020-11-29 Thread Xingjun Hao (Jira)
Xingjun Hao created CARBONDATA-4063:
---

 Summary: Refactor getBlockId and getShortBlockId function
 Key: CARBONDATA-4063
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4063
 Project: CarbonData
  Issue Type: Improvement
Reporter: Xingjun Hao


Now, the getBlockId and getShortBlockId functions are too complex and unreadable.

They need to be made simpler and more readable.


