[ https://issues.apache.org/jira/browse/SPARK-19230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823644#comment-15823644 ]
Ohad Raviv commented on SPARK-19230:
------------------------------------

It looks like this happens since SPARK-13827 - it just caused the generated attribute names to get much longer. As a not-so-bullet-proof workaround, what I am now doing is changing the attribute normalization prefix from "gen_attr_" to "ga_" in normalizedName:
{quote}
private def normalizedName(n: NamedExpression): String = synchronized {
  "gen_attr_" + exprIdMap.getOrElseUpdate(n.exprId.id, nextGenAttrId.getAndIncrement())
}
{quote}
->
{quote}
private def normalizedName(n: NamedExpression): String = synchronized {
  "ga_" + exprIdMap.getOrElseUpdate(n.exprId.id, nextGenAttrId.getAndIncrement())
}
{quote}
I guess a better solution would be to not always invoke the Canonicalizer's NormalizedAttribute rule, but only when there is an ambiguity - rough sketches of both a reproduction and this idea are appended at the end of this message.

> View creation in Derby gets SQLDataException because definition gets very big
> -----------------------------------------------------------------------------
>
>                  Key: SPARK-19230
>                  URL: https://issues.apache.org/jira/browse/SPARK-19230
>              Project: Spark
>           Issue Type: Bug
>           Components: SQL
>     Affects Versions: 2.1.0
>             Reporter: Ohad Raviv
>
> Somewhat related to SPARK-6024.
> In our test mockups we have a process that creates a pretty big table definition:
> {quote}
> create table t1 (
>   field_name_1 string,
>   field_name_2 string,
>   field_name_3 string,
>   ...
>   field_name_1000 string
> )
> {quote}
> which succeeds. But then we add some calculated fields on top of it with a view:
> {quote}
> create view v1 as
> select *,
>   some_udf(field_name_1) as field_calc1,
>   some_udf(field_name_2) as field_calc2,
>   ...
>   some_udf(field_name_10) as field_calc10
> from t1
> {quote}
> And we get this exception:
> {quote}
> java.sql.SQLDataException: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `field_name_1`, `gen_attr_1` AS `field_name_2&' to length 32700.
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
>       at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeLargeUpdate(Unknown Source)
>       at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)
>       at com.jolbox.bonecp.PreparedStatementHandle.executeUpdate(PreparedStatementHandle.java:205)
>       at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeUpdate(ParamLoggingPreparedStatement.java:399)
>       at org.datanucleus.store.rdbms.SQLController.executeStatementUpdate(SQLController.java:439)
>       at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:410)
>       at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertTable(RDBMSPersistenceHandler.java:167)
>       at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:143)
>       at org.datanucleus.state.JDOStateManager.internalMakePersistent(JDOStateManager.java:3784)
>       at org.datanucleus.state.JDOStateManager.makePersistent(JDOStateManager.java:3760)
>       at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2219)
>       at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:2065)
>       at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1913)
>       at org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
>       at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:727)
>       at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)
>       at org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:814)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
>       at com.sun.proxy.$Proxy17.createTable(Unknown Source)
>       at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1416)
>       at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1449)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>       at com.sun.proxy.$Proxy19.create_table_with_environment_context(Unknown Source)
>       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.create_table_with_environment_context(HiveMetaStoreClient.java:2050)
>       at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.create_table_with_environment_context(SessionHiveMetaStoreClient.java:97)
>       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:669)
>       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:657)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
>       at com.sun.proxy.$Proxy20.createTable(Unknown Source)
>       at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:714)
>       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply$mcV$sp(HiveClientImpl.scala:425)
>       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:425)
>       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createTable$1.apply(HiveClientImpl.scala:425)
>       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:283)
>       at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:230)
>       at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:229)
>       at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:272)
>       at org.apache.spark.sql.hive.client.HiveClientImpl.createTable(HiveClientImpl.scala:424)
>       at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply$mcV$sp(HiveExternalCatalog.scala:203)
>       at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:191)
>       at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:191)
>       at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:95)
>       at org.apache.spark.sql.hive.HiveExternalCatalog.createTable(HiveExternalCatalog.scala:191)
>       at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createTable(SessionCatalog.scala:248)
>       at org.apache.spark.sql.execution.command.CreateViewCommand.run(views.scala:176)
>       at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>       at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>       at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
>       at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>       at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
>       at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
>       at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>       at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
>       at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
>       at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
>       at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
>       at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
>       at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
>       at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
>       at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
>       at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.SparkHcatUtils.execute(SparkHcatUtils.java:24)
>       at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.executeHcatSql(HcatUtils.java:48)
>       at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.executeHcatSql(HcatUtils.java:38)
>       at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.handleHcatTableView(HcatUtils.java:110)
>       at com.paypal.risk.ars.bigdata.execution_fw.hadoop.hive.HcatUtils.handleHcatTableView(HcatUtils.java:92)
>       at com.paypal.risk.ars.bigdata.test.FlowTestUtils.prepareHcatTable(FlowTestUtils.java:77)
>       at com.paypal.risk.ars.bigdata.test.TestPigBase.innerSetup(TestPigBase.java:104)
>       at com.paypal.risk.ars.bigdata.hadoop.spark.TestSparkBase.innerSetup(TestSparkBase.scala:65)
>       at com.paypal.risk.ars.bigdata.test.AbstractDataDrivenUnitTest.setup(AbstractDataDrivenUnitTest.java:73)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>       at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>       at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>       at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237)
>       at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
> Caused by: java.sql.SQLException: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `field_name_1`, `gen_attr_1` AS `field_name_2&' to length 32700.
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
>       at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
>       ... 118 more
> Caused by: ERROR 22001: A truncation error was encountered trying to shrink LONG VARCHAR 'SELECT `gen_attr_0` AS `cust_id`, `gen_attr_1` AS `txn_lmt_a&' to length 32700.
>       at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>       at org.apache.derby.iapi.types.SQLLongvarchar.normalize(Unknown Source)
>       at org.apache.derby.iapi.types.SQLVarchar.normalize(Unknown Source)
>       at org.apache.derby.iapi.types.DataTypeDescriptor.normalize(Unknown Source)
>       at org.apache.derby.impl.sql.execute.NormalizeResultSet.normalizeColumn(Unknown Source)
>       at org.apache.derby.impl.sql.execute.NormalizeResultSet.normalizeRow(Unknown Source)
>       at org.apache.derby.impl.sql.execute.NormalizeResultSet.getNextRowCore(Unknown Source)
>       at org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore(Unknown Source)
>       at org.apache.derby.impl.sql.execute.InsertResultSet.open(Unknown Source)
>       at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown Source)
>       at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown Source)
>       ... 112 more
> {quote}
> Trying to debug, it looks like the view definition gets multiplied about 3 times:
> {quote}
> SELECT `gen_attr_0` as `field_name_1`, ..., `gen_attr_1000` as `field_name_1000`
> FROM (
>   SELECT `gen_attr_0`, `gen_attr_1`, ..., `gen_attr_1000` FROM (
>     SELECT `field_name_1` AS `gen_attr_0`, ..., `field_name_1000` AS `gen_attr_1000`) as gen_subquery_0 ) as t1
> {quote}
> With 1000 columns repeated across three nested projections, at a few tens of characters per aliased column, the stored view text easily exceeds Derby's 32700-character LONG VARCHAR limit.
> Is there any known work-around?
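
As promised above, here is a minimal way to reproduce the failure along the lines of the quoted issue. This is only a sketch, not our exact test setup: it assumes a Hive-enabled Spark 2.1 session backed by the default embedded Derby metastore, and it registers a trivial identity UDF as a stand-in for the some_udf from the issue (the object name Spark19230Repro is just for illustration):
{quote}
import org.apache.spark.sql.SparkSession

object Spark19230Repro {
  def main(args: Array[String]): Unit = {
    // Hive support is needed so that CREATE VIEW persists its text through the
    // metastore, where Derby's 32700-character LONG VARCHAR limit applies.
    val spark = SparkSession.builder()
      .master("local[*]")
      .enableHiveSupport()
      .getOrCreate()

    // Identity UDF as a placeholder for the real some_udf.
    spark.udf.register("some_udf", (s: String) => s)

    // 1000-column table, mirroring the quoted CREATE TABLE.
    val cols = (1 to 1000).map(i => s"field_name_$i string").mkString(", ")
    spark.sql(s"create table t1 ($cols)")

    // View with 10 calculated fields; the canonicalized view text Spark
    // generates here repeats the wide projection about 3 times.
    val calcs = (1 to 10).map(i => s"some_udf(field_name_$i) as field_calc$i").mkString(", ")
    spark.sql(s"create view v1 as select *, $calcs from t1")
  }
}
{quote}
Running this should end with the same SQLDataException from Derby while it persists the view definition.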
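
And to make the "rename only on ambiguity" idea concrete, here is a rough standalone sketch. This is plain Scala, not the actual SQLBuilder code - the Attr case class and the gen_attr_ naming are purely illustrative. The point is the policy: only attributes whose names collide get generated names; everything else keeps its short original name, which would keep the canonicalized view text small in the common case:
{quote}
object ConditionalNormalization {
  // Stand-in for a named expression: a user-visible name plus a unique expression id.
  final case class Attr(name: String, exprId: Long)

  // Rename only the attributes whose names are ambiguous within the plan.
  def normalize(attrs: Seq[Attr]): Map[Long, String] = {
    val ambiguous: Set[String] = attrs
      .groupBy(_.name)
      .collect { case (name, as) if as.size > 1 => name }
      .toSet
    attrs.map { a =>
      a.exprId -> (if (ambiguous(a.name)) s"gen_attr_${a.exprId}" else a.name)
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val attrs = Seq(Attr("id", 0), Attr("id", 1), Attr("field_name_1", 2))
    // Only the two colliding "id" attributes get generated names:
    // Map(0 -> gen_attr_0, 1 -> gen_attr_1, 2 -> field_name_1)
    println(normalize(attrs))
  }
}
{quote}
For the 1000-column table above, where no names collide, this policy would leave the view text at roughly its original size instead of tripling it with long generated aliases.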