Ohad Raviv created SPARK-19230: ---------------------------------- Summary: View creation in Derby gets SQLDataException because definition gets very big Key: SPARK-19230 URL: https://issues.apache.org/jira/browse/SPARK-19230 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.1.0 Reporter: Ohad Raviv
somewhat related to SPARK-6024. In our tests mockups we have a process that creates a pretty big table definition: {quote} create table t1 ( field_name_1 string, field_name_2 string, field_name_3 string, . . . field_name_1000 string ) {quote} which succeeds. But then we add some calculated fields on top of it with a view: {quote} create view v1 as select *, some_udf(field_name_1) as field_calc1, some_udf(field_name_2) as field_calc2, . . some_udf(field_name_10) as field_calc10 from t1 {quote} And we get this exception: {quote} java.sql.SQLDataException: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `field_name_1`, `gen_attr_1` AS `field_name_2&' to length 32700. at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeLargeUpdate(Unknown Source) at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source) at com.jolbox.bonecp.PreparedStatementHandle.executeUpdate(PreparedStatementHandle.java:205) at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeUpdate(ParamLoggingPreparedStatement.java:399) at org.datanucleus.store.rdbms.SQLController.executeStatementUpdate(SQLController.java:439) at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:410) at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertTable(RDBMSPersistenceHandler.java:167) at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:143) at org.datanucleus.state.JDOStateManager.internalMakePersistent(JDOStateManager.java:3784) at org.datanucleus.state.JDOStateManager.makePersistent(JDOStateManager.java:3760) at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2219) at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:2065) at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1913) at org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217) at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:727) at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752) at org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:814) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy17.createTable(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1416) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1449) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy19.create_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2050) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:97) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:669) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:657) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156) at com.sun.proxy.$Proxy20.createTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:714) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:425) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:425) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:425) at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:283) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:230) at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:229) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:272) at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:424) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply$mcV$sp(HiveExternalCatalog.scala:203) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:191) at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:191) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:95) at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:191) at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:248) at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:176) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56) at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114) at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132) at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113) at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87) at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699) at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.SparkHcatUtils.execute(SparkHcatUtils.java:24) at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.executeHcatSql(HcatUtils.java:48) at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.executeHcatSql(HcatUtils.java:38) at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.handleHcatTableView(HcatUtils.java:110) at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.handleHcatTableView(HcatUtils.java:92) at com.paypal.risk.ars.bigdata.test.FlowTestUtils.prepareHcatTable(FlowTestUtils.java:77) at com.paypal.risk.ars.bigdata.test.TestPigBase.innerSetup(TestPigBase.java:104) at com.paypal.risk.ars.bigdata.hadoop.spark.TestSparkBase.innerSetup(TestSparkBase.scala:65) at com.paypal.risk.ars.bigdata.test.AbstractDataDrivenUnitTest.setup(AbstractDataDrivenUnitTest.java:73) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51) at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237) at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147) Caused by: java.sql.SQLException: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `field_name_1`, `gen_attr_1` AS `field_name_2&' to length 32700. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source) ... 118 more Caused by: ERROR 22001: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `cust_id`, `gen_attr_1` AS `txn_lmt_a&' to length 32700. at org.apache.derby.iapi.error.StandardException.newException(Unknown Source) at org.apache.derby.iapi.types.SQLLongvarchar.normalize(Unknown Source) at org.apache.derby.iapi.types.SQLVarchar.normalize(Unknown Source) at org.apache.derby.iapi.types.DataTypeDescriptor.normalize(Unknown Source) at org.apache.derby.impl.sql.execute.NormalizeResultSet.normalizeColumn(Unknown Source) at org.apache.derby.impl.sql.execute.NormalizeResultSet.normalizeRow(Unknown Source) at org.apache.derby.impl.sql.execute.NormalizeResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore(Unknown Source) at org.apache.derby.impl.sql.execute.InsertResultSet.open(Unknown Source) at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source) at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source) ... 112 more {quote} trying to debug, it looks like the view definition gets multiplied about 3 times: {quote} SELECT `gen_attr_0` as `field_name_1`,...`gen_attr_1000` as `field_name_1000` FROM ( SELECT `gen_attr_0`, `gen_attr_1`,..,`gen_attr_1000` FROM ( SELECT `field_name_1` AS `gen_attr_0`,...,`field_name_1000` AS `gen_attr_1000`) as gen_subquery_0 ) as t1 {quote} Is there any known work-around? -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org