[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user cenyuhai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r126469617

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala ---
@@ -195,11 +195,43 @@ case class CreateTable(cm: TableModel, createDSTable: Boolean = true) extends Ru
           val fields = new Array[Field](cm.dimCols.size + cm.msrCols.size)
           cm.dimCols.foreach(f => fields(f.schemaOrdinal) = f)
           cm.msrCols.foreach(f => fields(f.schemaOrdinal) = f)
-          sparkSession.sql(
-            s"""CREATE TABLE $dbName.$tbName
-               |(${ fields.map(f => f.rawSchema).mkString(",") })
-               |USING org.apache.spark.sql.CarbonSource""".stripMargin +
-            s""" OPTIONS (tableName "$tbName", dbName "$dbName", tablePath "$tablePath") """)
+          val useCompatibleSchema = sparkSession.sparkContext.conf
+            .getBoolean(CarbonCommonConstants.SPARK_SCHEMA_HIVE_COMPATIBILITY_ENABLE, false)
+          if (useCompatibleSchema) {
+            val tableIdentifier = TableIdentifier(tbName, Some(dbName))
+            val tableSchema = CarbonEnv.getInstance(sparkSession).carbonMetastore
+              .lookupRelation(tableIdentifier)(sparkSession).schema.json
+            val schemaParts = AlterTableUtil.prepareSchemaJsonForAlterTable(
+              sparkSession.sparkContext.conf, tableSchema)
+            sparkSession.sharedState.externalCatalog.asInstanceOf[HiveExternalCatalog].client
--- End diff --

@anubhav100 This bug was introduced when I merged all the commits into one. I will fix it tomorrow.

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user anubhav100 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r126410819

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala ---
@@ -195,11 +195,43 @@ case class CreateTable(cm: TableModel, createDSTable: Boolean = true) extends Ru
           val fields = new Array[Field](cm.dimCols.size + cm.msrCols.size)
           cm.dimCols.foreach(f => fields(f.schemaOrdinal) = f)
           cm.msrCols.foreach(f => fields(f.schemaOrdinal) = f)
-          sparkSession.sql(
-            s"""CREATE TABLE $dbName.$tbName
-               |(${ fields.map(f => f.rawSchema).mkString(",") })
-               |USING org.apache.spark.sql.CarbonSource""".stripMargin +
-            s""" OPTIONS (tableName "$tbName", dbName "$dbName", tablePath "$tablePath") """)
+          val useCompatibleSchema = sparkSession.sparkContext.conf
+            .getBoolean(CarbonCommonConstants.SPARK_SCHEMA_HIVE_COMPATIBILITY_ENABLE, false)
+          if (useCompatibleSchema) {
+            val tableIdentifier = TableIdentifier(tbName, Some(dbName))
+            val tableSchema = CarbonEnv.getInstance(sparkSession).carbonMetastore
+              .lookupRelation(tableIdentifier)(sparkSession).schema.json
+            val schemaParts = AlterTableUtil.prepareSchemaJsonForAlterTable(
+              sparkSession.sparkContext.conf, tableSchema)
+            sparkSession.sharedState.externalCatalog.asInstanceOf[HiveExternalCatalog].client
--- End diff --

@cenyuhai Change this code to

    sparkSession.sessionState.asInstanceOf[CarbonSessionState].metadataHive

because you cannot alter a Hive table: this table does not exist in the Carbon metastore. See below; I tried it and got an exception:

    scala> carbon.sql("alter table idsr226617624713 add columns(name string) ")
    org.apache.carbondata.spark.exception.MalformedCarbonCommandException: Unsupported alter operation on hive table

After changing the code, it works fine.
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user anubhav100 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r126336311

--- Diff: integration/hive/hive-guide.md ---
@@ -41,16 +41,19 @@ mvn -DskipTests -Pspark-2.1 -Phadoop-2.7.2 clean package
 $HADOOP_HOME/bin/hadoop fs -put sample.csv /sample.csv
 ```
+
+Please set spark.carbon.hive.schema.compatibility.enable=true in spark-defaults.conf
 * Start Spark shell by running the following command in the Spark directory
+
 ```
-./bin/spark-shell --jars
+./bin/spark-shell --jars
--- End diff --

@cenyuhai I guess the carbon hive jar won't be required for running the Spark shell.
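The switch added to the guide above can be illustrated without Spark. This is a minimal sketch, assuming the property key quoted in the guide diff and a default of false (which matches the `getBoolean(..., false)` calls in the other diffs of this pull request). `CompatFlag` and `isEnabled` are hypothetical names, and `java.util.Properties` is only a stand-in for however Spark actually loads `spark-defaults.conf`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of CarbonData or Spark. */
final class CompatFlag {
  // Key as quoted in the hive-guide.md diff.
  static final String KEY = "spark.carbon.hive.schema.compatibility.enable";

  /** Returns true only if the config text sets KEY to "true"; defaults to false. */
  static boolean isEnabled(String confText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(confText));
    } catch (IOException e) {
      // A StringReader should not fail, but load() declares IOException.
      throw new UncheckedIOException(e);
    }
    return Boolean.parseBoolean(props.getProperty(KEY, "false").trim());
  }

  public static void main(String[] args) {
    System.out.println(isEnabled(KEY + "=true")); // flag set explicitly
    System.out.println(isEnabled(""));            // flag absent, falls back to the default
  }
}
```

With a default of false, the existing CREATE TABLE path stays in effect unless the user opts in.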
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user cenyuhai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r125816477

--- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ---
@@ -269,9 +270,9 @@ public ICarbonLock getTableUpdateStatusLock() {
    * @throws Exception
    */
   public String[] getDeleteDeltaFilePath(String blockFilePath) throws Exception {
-    int tableFactPathLength = CarbonStorePath
-        .getCarbonTablePath(absoluteTableIdentifier.getStorePath(),
-            absoluteTableIdentifier.getCarbonTableIdentifier()).getFactDir().length() + 1;
+    String factTableDir =
+        absoluteTableIdentifier.getCarbonTableIdentifier().getTableName() + File.separator + "Fact";
+    int tableFactPathLength = blockFilePath.indexOf(factTableDir) + factTableDir.length() + 1;
--- End diff --

Because when I create a table, the location in the table properties is 'hdfs:user/x/fact', but the fact dir is 'hdfs://cluster/user/x/fact', so getFactDir().length() + 1 is wrong.
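The mismatch described above can be reproduced in isolation. The following is a minimal, hypothetical Java sketch, not CarbonData code: the class `FactPathOffsets`, its method names, and the sample paths are assumptions made for illustration. It contrasts taking the offset from the length of a separately built fact directory string with locating `<tableName>/Fact` inside the block file path itself, as the diff does.

```java
/** Hypothetical helper for illustration; not part of CarbonData. */
final class FactPathOffsets {

  /** Old approach: assume the block path starts with factDir and skip its length plus one separator. */
  static String relativeByPrefixLength(String blockFilePath, String factDir) {
    return blockFilePath.substring(factDir.length() + 1);
  }

  /** Approach from the diff: find "<tableName><separator>Fact" inside the path and skip past it. */
  static String relativeByIndexOf(String blockFilePath, String tableName, String separator) {
    String factTableDir = tableName + separator + "Fact";
    int offset = blockFilePath.indexOf(factTableDir) + factTableDir.length() + 1;
    return blockFilePath.substring(offset);
  }

  public static void main(String[] args) {
    // Sample paths are made up; the two strings name the same directory in different forms.
    String block = "hdfs://cluster/user/x/t1/Fact/Part0/Segment_0/part-0-0.carbondata";
    String factDirFromProperties = "hdfs:user/x/t1/Fact";

    // The shorter prefix yields an offset that lands in the middle of the path.
    System.out.println(relativeByPrefixLength(block, factDirFromProperties));
    // Searching inside the block path is independent of how the prefix is written.
    System.out.println(relativeByIndexOf(block, "t1", "/"));
  }
}
```

Note that `indexOf` returns -1 when the marker is absent, and that it matches the first occurrence of the table name followed by `Fact`; this sketch guards against neither case.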
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user cenyuhai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r125815870

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommands.scala ---
@@ -438,11 +457,24 @@ private[sql] case class AlterTableDataTypeChange(
       schemaEvolutionEntry.setRemoved(List(deletedColumnSchema).asJava)
       tableInfo.getFact_table.getSchema_evolution.getSchema_evolution_history.get(0)
         .setTime_stamp(System.currentTimeMillis)
+      val sessionState = sparkSession.sessionState.asInstanceOf[CarbonSessionState]
       AlterTableUtil
         .updateSchemaInfo(carbonTable,
           schemaEvolutionEntry,
-          tableInfo)(sparkSession,
-          sparkSession.sessionState.asInstanceOf[CarbonSessionState])
+          tableInfo)(sparkSession, sessionState)
+      val useCompatibleSchema = sparkSession.sparkContext.conf
+        .getBoolean(CarbonCommonConstants.SPARK_SCHEMA_HIVE_COMPATIBILITY_ENABLE, false)
+      if (useCompatibleSchema) {
+        val dataTypeInfo = alterTableDataTypeChangeModel.dataTypeInfo
+        val colSchema = if (dataTypeInfo.dataType == "decimal") {
--- End diff --

ok
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r125279885

--- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java ---
@@ -269,9 +270,9 @@ public ICarbonLock getTableUpdateStatusLock() {
    * @throws Exception
    */
   public String[] getDeleteDeltaFilePath(String blockFilePath) throws Exception {
-    int tableFactPathLength = CarbonStorePath
-        .getCarbonTablePath(absoluteTableIdentifier.getStorePath(),
-            absoluteTableIdentifier.getCarbonTableIdentifier()).getFactDir().length() + 1;
+    String factTableDir =
+        absoluteTableIdentifier.getCarbonTableIdentifier().getTableName() + File.separator + "Fact";
+    int tableFactPathLength = blockFilePath.indexOf(factTableDir) + factTableDir.length() + 1;
--- End diff --

Why is this change needed?
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user bhavya411 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r125282321

--- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/AlterTableCommands.scala ---
@@ -438,11 +457,24 @@ private[sql] case class AlterTableDataTypeChange(
       schemaEvolutionEntry.setRemoved(List(deletedColumnSchema).asJava)
       tableInfo.getFact_table.getSchema_evolution.getSchema_evolution_history.get(0)
         .setTime_stamp(System.currentTimeMillis)
+      val sessionState = sparkSession.sessionState.asInstanceOf[CarbonSessionState]
       AlterTableUtil
         .updateSchemaInfo(carbonTable,
           schemaEvolutionEntry,
-          tableInfo)(sparkSession,
-          sparkSession.sessionState.asInstanceOf[CarbonSessionState])
+          tableInfo)(sparkSession, sessionState)
+      val useCompatibleSchema = sparkSession.sparkContext.conf
+        .getBoolean(CarbonCommonConstants.SPARK_SCHEMA_HIVE_COMPATIBILITY_ENABLE, false)
+      if (useCompatibleSchema) {
+        val dataTypeInfo = alterTableDataTypeChangeModel.dataTypeInfo
+        val colSchema = if (dataTypeInfo.dataType == "decimal") {
--- End diff --

Use a constant for "decimal".
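One way to act on the review comment above is sketched here in Java, for illustration only. The class `ColumnTypes`, the constant `DECIMAL`, and the method `hiveTypeString` are hypothetical names, and the precision/scale handling is an assumption about what the truncated `if` branch might do; the diff does not show it.

```java
/** Hypothetical helper for illustration; not part of CarbonData. */
final class ColumnTypes {
  // Single definition of the type name, replacing the inline "decimal" literal.
  static final String DECIMAL = "decimal";

  /** Builds a column type string; decimal columns carry precision and scale (assumed behavior). */
  static String hiveTypeString(String dataType, int precision, int scale) {
    if (DECIMAL.equals(dataType)) {
      return DECIMAL + "(" + precision + "," + scale + ")";
    }
    // Other types are passed through unchanged in this sketch.
    return dataType;
  }

  public static void main(String[] args) {
    System.out.println(hiveTypeString("decimal", 10, 2));
    System.out.println(hiveTypeString("int", 0, 0));
  }
}
```

Keeping the name in one constant means a later rename or a case-handling change touches a single definition rather than every comparison site.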
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user anubhav100 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r119525572

--- Diff: integration/hive/src/main/scala/org/apache/carbondata/hiveexample/HiveExample.scala ---
@@ -54,15 +54,15 @@ object HiveExample {
       .getOrCreateCarbonSession(
         store, metaStore_Db)
 
-    val carbonHadoopJarPath = s"$rootPath/assembly/target/scala-2.11/carbondata_2.11-1.1" +
-                              ".0-incubating-SNAPSHOT-shade-hadoop2.7.2.jar"
+    val carbonHadoopJarPath = s"$rootPath/assembly/target/scala-2.11/carbondata_2.11-1.2" +
--- End diff --

Same here.
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
Github user anubhav100 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/984#discussion_r119342650

--- Diff: integration/hive/src/main/scala/org/apache/carbondata/hiveexample/HiveExample.scala ---
@@ -54,15 +54,15 @@ object HiveExample {
       .getOrCreateCarbonSession(
         store, metaStore_Db)
 
-    val carbonHadoopJarPath = s"$rootPath/assembly/target/scala-2.11/carbondata_2.11-1.1" +
--- End diff --

@cenyuhai It would be better to remove this add-jar step. The example won't require any jar to run, and this step will create a problem with every new release. I corrected this in PR https://github.com/apache/carbondata/pull/979
[GitHub] carbondata pull request #984: [CARBONDATA-1008] Make Caron table schema comp...
GitHub user cenyuhai opened a pull request:

    https://github.com/apache/carbondata/pull/984

    [CARBONDATA-1008] Make Caron table schema compatible with HIVE

    Make Caron table schema compatible with HIVE

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cenyuhai/incubator-carbondata CARBONDATA-1008

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/984.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #984

commit 64404855e809baa07b31726b7a9fbd24d4555a75
Author: cenyuhai <261810...@qq.com>
Date:   2017-05-30T10:23:04Z

    make carbon schema compatible with hive

commit 312b472358597633b91a84008bb46be389d9583a
Author: cenyuhai <261810...@qq.com>
Date:   2017-05-30T15:44:20Z

    sync alter table command to hive metastore