date:20200119

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-576119537
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1697/
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [carbondata] CarbonDataQA1 commented on issue #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3581: [CARBONDATA-3666] Avoided listing of 
table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-576118520
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1696/
   



[GitHub] [carbondata] CarbonDataQA1 commented on issue #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3581: [CARBONDATA-3666] Avoided listing of 
table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#issuecomment-576114924
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1695/
   



[GitHub] [carbondata] kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command

2020-01-19 Thread GitBox

kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided 
listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368375838
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
 ##
 @@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
 // then do the below steps
 // 2.2.1 validate that all the aggregate tables are copied at the store 
location.
 // 2.2.2 Register the aggregate tables
-val tablePath = CarbonEnv.getTablePath(databaseNameOp, 
tableName.toLowerCase)(sparkSession)
-val identifier = AbsoluteTableIdentifier.from(tablePath, databaseName, 
tableName.toLowerCase)
 // 2.1 check if the table already register with hive then ignore and 
continue with the next
 // schema
-if (!sparkSession.sessionState.catalog.listTables(databaseName)
-  .exists(_.table.equalsIgnoreCase(tableName))) {
+val provider = try {
+  sparkSession.sessionState.catalog
+.getTableMetadata(TableIdentifier(tableName, databaseNameOp)).provider
+} catch {
+  case _: NoSuchTableException =>
+None
+}
+if (provider.isEmpty ||
+provider.get.equalsIgnoreCase("org.apache.spark.sql.CarbonSource") ||
 
 Review comment:
   done



[GitHub] [carbondata] kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command

2020-01-19 Thread GitBox

kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided 
listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368375832
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
 ##
 @@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
 // then do the below steps
 // 2.2.1 validate that all the aggregate tables are copied at the store 
location.
 // 2.2.2 Register the aggregate tables
-val tablePath = CarbonEnv.getTablePath(databaseNameOp, 
tableName.toLowerCase)(sparkSession)
 
 Review comment:
   done



[GitHub] [carbondata] CarbonDataQA1 commented on issue #3538: [WIP] Separate Insert and load to later optimize insert.

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3538: [WIP] Separate Insert and load to later 
optimize insert.
URL: https://github.com/apache/carbondata/pull/3538#issuecomment-576102784
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1694/
   



[jira] [Resolved] (CARBONDATA-3503) Adapt to SparkSessionExtensions

2020-01-19 Thread Jacky Li (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-3503.
--
Fix Version/s: 2.0.0
   Resolution: Fixed

> Adapt to SparkSessionExtensions
> ---
>
> Key: CARBONDATA-3503
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3503
> Project: CarbonData
>  Issue Type: New Feature
>Affects Versions: 1.5.4
>Reporter: Ajith S
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 41h 20m
>  Remaining Estimate: 0h
>
> From https://issues.apache.org/jira/browse/SPARK-18127, Spark provides 
> SparkSessionExtensions to extend the capabilities of Spark. Carbon can 
> use this to avoid the tight coupling caused by CarbonSession in the Spark 
> environment. 
> https://spark.apache.org/docs/2.4.3/api/java/org/apache/spark/sql/SparkSessionExtensions.html
> This JIRA proposes dropping the currently used getOrCreateCarbonSession, 
> which makes this an incompatible change relative to Carbon 1.x.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[GitHub] [carbondata] jackylk commented on issue #3479: [CARBONDATA-3271] Integrating deep learning framework TensorFlow

2020-01-19 Thread GitBox

jackylk commented on issue #3479: [CARBONDATA-3271] Integrating deep learning 
framework TensorFlow
URL: https://github.com/apache/carbondata/pull/3479#issuecomment-576024535
 
 
   LGTM



[GitHub] [carbondata] asfgit closed pull request #3574: [CARBONDATA-3503] Optimize Carbon SparkExtensions

2020-01-19 Thread GitBox

asfgit closed pull request #3574: [CARBONDATA-3503] Optimize Carbon 
SparkExtensions
URL: https://github.com/apache/carbondata/pull/3574
 
 
   



[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-576023794
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1693/
   



[GitHub] [carbondata] jackylk commented on issue #3574: [CARBONDATA-3503] Optimize Carbon SparkExtensions

2020-01-19 Thread GitBox

jackylk commented on issue #3574: [CARBONDATA-3503] Optimize Carbon 
SparkExtensions
URL: https://github.com/apache/carbondata/pull/3574#issuecomment-576021770
 
 
   LGTM



[GitHub] [carbondata] CarbonDataQA1 commented on issue #3574: [CARBONDATA-3503] Optimize Carbon SparkExtensions

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3574: [CARBONDATA-3503] Optimize Carbon 
SparkExtensions
URL: https://github.com/apache/carbondata/pull/3574#issuecomment-576008017
 
 
   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1692/
   



[GitHub] [carbondata] QiangCai commented on a change in pull request #3574: [CARBONDATA-3503] Optimize Carbon SparkExtensions

2020-01-19 Thread GitBox

QiangCai commented on a change in pull request #3574: [CARBONDATA-3503] 
Optimize Carbon SparkExtensions
URL: https://github.com/apache/carbondata/pull/3574#discussion_r368291356
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
 ##
 @@ -1100,6 +1101,62 @@ public static void 
deleteLoadsAndUpdateMetadata(CarbonTable carbonTable, boolean
 }
   }
 
+  public static void truncateTable(CarbonTable carbonTable)
+  throws ConcurrentOperationException, IOException {
+ICarbonLock carbonTableStatusLock = CarbonLockFactory.getCarbonLockObj(
+carbonTable.getAbsoluteTableIdentifier(), LockUsage.TABLE_STATUS_LOCK);
+boolean locked = false;
+try {
+  // Update load metadate file after cleaning deleted nodes
+  locked = carbonTableStatusLock.lockWithRetries();
+  if (locked) {
+LOG.info("Table status lock has been successfully acquired.");
+LoadMetadataDetails[] listOfLoadFolderDetailsArray =
+SegmentStatusManager.readLoadMetadata(
+CarbonTablePath.getMetadataPath(carbonTable.getTablePath()));
+for (LoadMetadataDetails listOfLoadFolderDetails : 
listOfLoadFolderDetailsArray) {
+  boolean writing;
+  switch (listOfLoadFolderDetails.getSegmentStatus()) {
+case INSERT_IN_PROGRESS:
+  writing = true;
+  break;
+case INSERT_OVERWRITE_IN_PROGRESS:
+  writing = true;
+  break;
+case STREAMING:
+  writing = true;
+  break;
+default:
+  writing = false;
+  }
+  if (writing) {
+throw new ConcurrentOperationException(carbonTable, "insert", 
"truncate");
+  }
+}
 
 Review comment:
   fixed



[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command

2020-01-19 Thread GitBox

xuchuanyin commented on a change in pull request #3581: [CARBONDATA-3666] 
Avoided listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368288557
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
 ##
 @@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
 // then do the below steps
 // 2.2.1 validate that all the aggregate tables are copied at the store 
location.
 // 2.2.2 Register the aggregate tables
-val tablePath = CarbonEnv.getTablePath(databaseNameOp, 
tableName.toLowerCase)(sparkSession)
 
 Review comment:
   The comments above are outdated and should be updated to match your 
modification.



[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3584: [WIP] Support SegmentLevel MinMax for better Pruning and less driver memory usage for cache

2020-01-19 Thread GitBox

xuchuanyin commented on a change in pull request #3584: [WIP] Support 
SegmentLevel MinMax for better Pruning and less driver memory usage for cache
URL: https://github.com/apache/carbondata/pull/3584#discussion_r368287969
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/util/SegmentMinMax.java
 ##
 @@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util;
+
+import java.io.Serializable;
+import java.util.Map;
+
+/**
+ * Holds Min, Max and columnCardinality values for each segment block
+ */
+public class SegmentMinMax implements Serializable {
 
 Review comment:
   It is a bean; why place it under the **util** package?



[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3574: [CARBONDATA-3503] Optimize Carbon SparkExtensions

2020-01-19 Thread GitBox

xuchuanyin commented on a change in pull request #3574: [CARBONDATA-3503] 
Optimize Carbon SparkExtensions
URL: https://github.com/apache/carbondata/pull/3574#discussion_r368287460
 
 

 ##
 File path: 
core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
 ##
 @@ -1100,6 +1101,62 @@ public static void 
deleteLoadsAndUpdateMetadata(CarbonTable carbonTable, boolean
 }
   }
 
+  public static void truncateTable(CarbonTable carbonTable)
+  throws ConcurrentOperationException, IOException {
+ICarbonLock carbonTableStatusLock = CarbonLockFactory.getCarbonLockObj(
+carbonTable.getAbsoluteTableIdentifier(), LockUsage.TABLE_STATUS_LOCK);
+boolean locked = false;
+try {
+  // Update load metadate file after cleaning deleted nodes
+  locked = carbonTableStatusLock.lockWithRetries();
+  if (locked) {
+LOG.info("Table status lock has been successfully acquired.");
+LoadMetadataDetails[] listOfLoadFolderDetailsArray =
+SegmentStatusManager.readLoadMetadata(
+CarbonTablePath.getMetadataPath(carbonTable.getTablePath()));
+for (LoadMetadataDetails listOfLoadFolderDetails : 
listOfLoadFolderDetailsArray) {
+  boolean writing;
+  switch (listOfLoadFolderDetails.getSegmentStatus()) {
+case INSERT_IN_PROGRESS:
+  writing = true;
+  break;
+case INSERT_OVERWRITE_IN_PROGRESS:
+  writing = true;
+  break;
+case STREAMING:
+  writing = true;
+  break;
+default:
+  writing = false;
+  }
+  if (writing) {
+throw new ConcurrentOperationException(carbonTable, "insert", 
"truncate");
+  }
+}
 
 Review comment:
   can be optimized to reduce lines of code
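
   One way this suggestion could be realized is to group the three case labels so they share a single branch. The following is only a minimal sketch, not the change that was actually made in the PR: the `SegmentStatus` enum here is a stand-in with an illustrative subset of values, and the class name `TruncateCheckSketch` and helper `isWriting` are hypothetical names introduced for the example.

```java
public class TruncateCheckSketch {

  // Stand-in for the project's SegmentStatus enum (assumption: illustrative subset only).
  enum SegmentStatus {
    SUCCESS,
    MARKED_FOR_DELETE,
    INSERT_IN_PROGRESS,
    INSERT_OVERWRITE_IN_PROGRESS,
    STREAMING
  }

  // Hypothetical helper: the three "writing" states fall through to one return.
  static boolean isWriting(SegmentStatus status) {
    switch (status) {
      case INSERT_IN_PROGRESS:
      case INSERT_OVERWRITE_IN_PROGRESS:
      case STREAMING:
        return true;
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    // Print the classification of every status in the stand-in enum.
    for (SegmentStatus status : SegmentStatus.values()) {
      System.out.println(status + " -> writing=" + isWriting(status));
    }
  }
}
```

   With a helper of this shape, the loop body in the diff would reduce to a single `if (isWriting(...)) { throw ... }` check.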



[GitHub] [carbondata] CarbonDataQA1 commented on issue #3550: [CARBONDATA-3605] Remove global dictionary in query

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3550: [CARBONDATA-3605] Remove global 
dictionary in query
URL: https://github.com/apache/carbondata/pull/3550#issuecomment-575993819
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1691/
   



[GitHub] [carbondata] CarbonDataQA1 commented on issue #3574: [CARBONDATA-3503] Optimize Carbon SparkExtensions

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3574: [CARBONDATA-3503] Optimize Carbon 
SparkExtensions
URL: https://github.com/apache/carbondata/pull/3574#issuecomment-575988147
 
 
   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1690/
   



[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] Support spark 2.4 integration

2020-01-19 Thread GitBox

xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] 
Support spark 2.4 integration
URL: https://github.com/apache/carbondata/pull/3576#discussion_r368281422
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDropTableCommand.scala
 ##
 @@ -134,6 +134,7 @@ case class CarbonDropTableCommand(
   }
   val indexDatamapSchemas =
 DataMapStoreManager.getInstance().getDataMapSchemasOfTable(carbonTable)
+  LOGGER.info(s"Dropping DataMaps in table $tableName, size: " + 
indexDatamapSchemas.size())
 
 Review comment:
   Why not use string interpolation for the whole message?
   
   LOGGER.info(s"Dropping DataMaps in table $tableName, size: ${indexDatamapSchemas.size()}")



[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] Support spark 2.4 integration

2020-01-19 Thread GitBox

xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] 
Support spark 2.4 integration
URL: https://github.com/apache/carbondata/pull/3576#discussion_r368280039
 
 

 ##
 File path: README.md
 ##
 @@ -28,8 +28,8 @@ Visit count: 
[![HitCount](http://hits.dwyl.io/jackylk/apache/carbondata.svg)](ht
 
 
 ## Status
-Spark2.2:
-[![Build 
Status](https://builds.apache.org/buildStatus/icon?job=carbondata-master-spark-2.2)](https://builds.apache.org/view/A-D/view/CarbonData/job/carbondata-master-spark-2.2/lastBuild/testReport)
+Spark2.3:
+[![Build 
Status](https://builds.apache.org/buildStatus/icon?job=carbondata-master-spark-2.3)](https://builds.apache.org/view/A-D/view/CarbonData/job/carbondata-master-spark-2.2/lastBuild/testReport)
 
 Review comment:
   2.2 is still in the URL. Is that a mistake?



[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] Support spark 2.4 integration

2020-01-19 Thread GitBox

xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] 
Support spark 2.4 integration
URL: https://github.com/apache/carbondata/pull/3576#discussion_r368280213
 
 

 ##
 File path: build/README.md
 ##
 @@ -25,11 +25,9 @@
 * [Apache Thrift 0.9.3](http://archive.apache.org/dist/thrift/0.9.3/)
 
 Review comment:
   Since we are using JDK 8, maybe we can make use of Java 8 features in the 
code later, such as streams and lambdas.



[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] Support spark 2.4 integration

2020-01-19 Thread GitBox

xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] 
Support spark 2.4 integration
URL: https://github.com/apache/carbondata/pull/3576#discussion_r368281590
 
 

 ##
 File path: pom.xml
 ##
 @@ -575,12 +530,14 @@
 
${basedir}/processing/src/main/java
 
${basedir}/hadoop/src/main/java
 
${basedir}/integration/spark2/src/main/scala
-
${basedir}/integration/spark2/src/main/spark2.2
-
${basedir}/integration/spark2/src/main/commonTo2.1And2.2
-
${basedir}/integration/spark2/src/main/commonTo2.2And2.3
+
${basedir}/integration/spark2/src/main/commonTo2.2AndAbove
 
 Review comment:
   commonTo2.2AndAbove? Does it still have 2.2 compatibility?



[GitHub] [carbondata] xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] Support spark 2.4 integration

2020-01-19 Thread GitBox

xuchuanyin commented on a change in pull request #3576: [CARBONDATA-3514] 
Support spark 2.4 integration
URL: https://github.com/apache/carbondata/pull/3576#discussion_r368281363
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateDataSourceTableCommand.scala
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.table
+
+import org.apache.spark.sql.{CarbonEnv, CarbonSource, Row, SparkSession}
+import org.apache.spark.sql.catalyst.catalog.CatalogTable
+import org.apache.spark.sql.execution.command.{CreateDataSourceTableCommand, 
MetadataCommand}
+
+/**
+ * Command to create table in case of 'USING CARBONDATA' DDL
+ *
+ * @param catalogTable catalog table created by spark
+ * @param ignoreIfExists ignore if table exists
+ * @param sparkSession spark session
+ */
+case class CarbonCreateDataSourceTableCommand(
+catalogTable: CatalogTable,
+ignoreIfExists: Boolean,
+sparkSession: SparkSession)
+  extends MetadataCommand {
+
+  override def processMetadata(session: SparkSession): Seq[Row] = {
+// Run the spark command to create table in metastore before saving carbon 
schema
+// in table path.
+// This is required for spark 2.4, because spark 2.4 will fail to create 
table
+// if table path is created before hand
 
 Review comment:
   'before hand' --> 'beforehand'



[GitHub] [carbondata] CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in Hive

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3583: [WIP] Support CarbonOutputFormat in 
Hive 
URL: https://github.com/apache/carbondata/pull/3583#issuecomment-575986699
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1689/
   



[jira] [Resolved] (CARBONDATA-3645) BadRecords are inserted as NULL when column is of complex data type and BAD_RECORDS_ACTION is IGNORE

2020-01-19 Thread Jacky Li (Jira)



 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-3645.
--
Fix Version/s: 2.0.0
   Resolution: Fixed

> BadRecords are inserted as NULL when column is of complex data type and 
> BAD_RECORDS_ACTION is IGNORE
> 
>
> Key: CARBONDATA-3645
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3645
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>





[jira] [Created] (CARBONDATA-3667) Insert stage recover processing of the partition table throw exception “the unexpected 0 segment found”

2020-01-19 Thread Xingjun Hao (Jira)

Xingjun Hao created CARBONDATA-3667:
---

 Summary: Insert stage recover processing of the partition table 
throw exception “the unexpected 0 segment found”
 Key: CARBONDATA-3667
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3667
 Project: CarbonData
  Issue Type: Bug
  Components: core
Affects Versions: 2.0.0
Reporter: Xingjun Hao


Insert stage recover processing of the partition table throw exception “the 
unexpected 0 segment found”




[GitHub] [carbondata] CarbonDataQA1 commented on issue #3574: [CARBONDATA-3503] Optimize Carbon SparkExtensions

2020-01-19 Thread GitBox

CarbonDataQA1 commented on issue #3574: [CARBONDATA-3503] Optimize Carbon 
SparkExtensions
URL: https://github.com/apache/carbondata/pull/3574#issuecomment-575978216
 
 
   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/1688/
   



[GitHub] [carbondata] kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided listing of table dir in refresh command

2020-01-19 Thread GitBox

kunal642 commented on a change in pull request #3581: [CARBONDATA-3666] Avoided 
listing of table dir in refresh command
URL: https://github.com/apache/carbondata/pull/3581#discussion_r368274021
 
 

 ##
 File path: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/RefreshCarbonTableCommand.scala
 ##
 @@ -63,12 +63,20 @@ case class RefreshCarbonTableCommand(
 // then do the below steps
 // 2.2.1 validate that all the aggregate tables are copied at the store 
location.
 // 2.2.2 Register the aggregate tables
-val tablePath = CarbonEnv.getTablePath(databaseNameOp, 
tableName.toLowerCase)(sparkSession)
-val identifier = AbsoluteTableIdentifier.from(tablePath, databaseName, 
tableName.toLowerCase)
 // 2.1 check if the table already register with hive then ignore and 
continue with the next
 // schema
-if (!sparkSession.sessionState.catalog.listTables(databaseName)
-  .exists(_.table.equalsIgnoreCase(tableName))) {
+val provider = try {
+  sparkSession.sessionState.catalog
+.getTableMetadata(TableIdentifier(tableName, databaseNameOp)).provider
+} catch {
+  case _: NoSuchTableException =>
+None
+}
+if (provider.isEmpty ||
+provider.get.equalsIgnoreCase("org.apache.spark.sql.CarbonSource") ||
 
 Review comment:
   ok


