[jira] [Commented] (SPARK-15348) Hive ACID
[ https://issues.apache.org/jira/browse/SPARK-15348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475265#comment-16475265 ] Harleen Singh Mann commented on SPARK-15348:

Agreed with Arvind. This means I must either not use Hive ACID tables at all, or break my pipeline out from Spark and use HQL directly.

> Hive ACID
> ---------
>
> Key: SPARK-15348
> URL: https://issues.apache.org/jira/browse/SPARK-15348
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 1.6.3, 2.0.2, 2.1.2, 2.2.0, 2.3.0
> Reporter: Ran Haim
> Priority: Major
>
> Spark does not support any feature of Hive's transactional (ACID) tables:
> you cannot use Spark to delete from or update a table, and Spark also has
> problems reading the aggregated data when no compaction has been done.
> Compaction also appears to be unsupported: alter table ... partition COMPACT 'major'
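For concreteness, here is a minimal sketch of the gap being described, assuming a Hive table already declared transactional (the table name, column name, and partition spec below are hypothetical). The statements are valid HQL on an ACID table but are rejected or unsupported when issued through Spark SQL:

{code:scala}
// Minimal sketch, not a definitive reproduction. Assumes a transactional
// Hive table named "events" with an "id" column and a "dt" partition
// (all hypothetical). Run against Spark 2.x with Hive support enabled.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hive-acid-gap")
  .enableHiveSupport()
  .getOrCreate()

// Reading an ACID table that has not yet been compacted is where the
// issue reports problems with the aggregated data.
spark.sql("SELECT COUNT(*) FROM events").show()

// Each of the following works from Hive itself but not from Spark:
// spark.sql("UPDATE events SET id = 0 WHERE id = 42")
// spark.sql("DELETE FROM events WHERE id = 42")
// spark.sql("ALTER TABLE events PARTITION (dt = '2018-05-15') COMPACT 'major'")
{code}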
[jira] [Commented] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360257#comment-16360257 ] Harleen Singh Mann commented on SPARK-23370:

1. Yes, querying the table would mean a non-trivial performance impact.
2. It works for all tables that the JDBC user has access to. For more information, refer to [https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_2094.htm]. This is very similar to the INFORMATION_SCHEMA.COLUMNS table in MySQL.

> Spark receives a size of 0 for an Oracle Number field and defaults the field
> type to be BigDecimal(30,10) instead of the actual precision and scale
> -----------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Minor
> Attachments: Oracle KB Document 1266785.pdf
>
> Currently, on a JDBC read Spark obtains the schema of a table using
> resultSet.getMetaData.getColumnType
> This works 99.99% of the time, except when a column of Number type is added
> to an Oracle table using an alter statement. This is essentially an Oracle
> DB + JDBC bug that has been documented in the Oracle KB, and patches exist.
> [oracle KB|https://support.oracle.com/knowledge/Oracle%20Database%20Products/1266785_1.html]
> As a result of the above-mentioned issue, Spark receives a size of 0 for the
> field and defaults the field type to BigDecimal(30,10) instead of what it
> actually should be. This is done in OracleDialect.scala. This may cause
> issues in the downstream application, where relevant information may be lost
> due to the changed precision and scale.
> The versions that are affected are:
> JDBC - Version: 11.2.0.1 and later [Release: 11.2 and later]
> Oracle Server - Enterprise Edition - Version: 11.1.0.6 to 11.2.0.1 [Release: 11.1 to 11.2]
> Proposed approach:
> There is another way of fetching the schema information in Oracle: through
> the all_tab_columns view. If we use this view to fetch the precision and
> scale of Number types, the above issue is mitigated.
> I can implement the changes, but require some inputs on the approach from
> the gatekeepers here.
> PS. This is also my first Jira issue and my first fork of Spark, so I will
> need some guidance along the way. (Yes, I am a newbie to this.) Thanks...
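For reference, a minimal sketch of the kind of all_tab_columns lookup being discussed. The connection details and table name are placeholders; DATA_PRECISION and DATA_SCALE are documented columns of Oracle's ALL_TAB_COLUMNS view:

{code:scala}
// Sketch only: query ALL_TAB_COLUMNS for the precision/scale of NUMBER
// columns, with the predicate pushed down to the database.
import java.sql.DriverManager

val conn = DriverManager.getConnection(
  "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password") // placeholders
try {
  val ps = conn.prepareStatement(
    """SELECT column_name, data_precision, data_scale
      |FROM all_tab_columns
      |WHERE table_name = ? AND data_type = 'NUMBER'""".stripMargin)
  ps.setString(1, "MY_TABLE") // placeholder table name
  val rs = ps.executeQuery()
  while (rs.next()) {
    // data_precision/data_scale can be NULL for an unconstrained NUMBER;
    // getInt then returns 0 and rs.wasNull() reports it.
    println(s"${rs.getString(1)}: precision=${rs.getInt(2)} scale=${rs.getInt(3)}")
  }
} finally conn.close()
{code}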
[jira] [Commented] (SPARK-23372) Writing empty struct in parquet fails during execution. It should fail earlier during analysis.
[ https://issues.apache.org/jira/browse/SPARK-23372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360252#comment-16360252 ] Harleen Singh Mann commented on SPARK-23372:

Perhaps a dumb question: why are you editing [OrcUtils.scala|https://github.com/apache/spark/pull/20579/files#diff-3fb8426b690ab771c4f67f9cad336498]?

> Writing empty struct in parquet fails during execution. It should fail
> earlier during analysis.
> ----------------------------------------------------------------------
>
> Key: SPARK-23372
> URL: https://issues.apache.org/jira/browse/SPARK-23372
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Dilip Biswal
> Priority: Minor
>
> *Running*
> spark.emptyDataFrame.write.format("parquet").mode("overwrite").save(path)
> *Results in*
> {code:java}
> org.apache.parquet.schema.InvalidSchemaException: Cannot write a schema with an empty group: message spark_schema {
> }
> at org.apache.parquet.schema.TypeUtil$1.visit(TypeUtil.java:27)
> at org.apache.parquet.schema.TypeUtil$1.visit(TypeUtil.java:37)
> at org.apache.parquet.schema.MessageType.accept(MessageType.java:58)
> at org.apache.parquet.schema.TypeUtil.checkValidWriteSchema(TypeUtil.java:23)
> at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:225)
> at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:342)
> at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:302)
> at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37)
> at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:151)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:376)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:387)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:278)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:276)
> at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:281)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:206)
> at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:205)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> at org.apache.spark.scheduler.Task.run(Task.scala:109)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.
> {code}
> We should detect this earlier in the processing and raise the error.
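On the substance of the issue, a minimal sketch of the kind of up-front schema validation being proposed. This is an illustration of the idea only, not the actual change in the PR, and the helper name is hypothetical:

{code:scala}
// Sketch of an analysis-time guard: reject an empty (or nested-empty)
// schema before any write job is launched, instead of letting parquet-mr
// throw InvalidSchemaException inside a running task.
import org.apache.spark.sql.types.{StructField, StructType}

def checkWriteSchema(schema: StructType): Unit = {
  require(schema.nonEmpty,
    "cannot write an empty schema: at least one column is required")
  schema.fields.foreach {
    case StructField(_, nested: StructType, _, _) =>
      checkWriteSchema(nested) // recurse into nested structs
    case _ => () // leaf types are fine
  }
}

// checkWriteSchema(spark.emptyDataFrame.schema)
// => throws IllegalArgumentException on the driver, before any task runs
{code}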
[jira] [Commented] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359788#comment-16359788 ] Harleen Singh Mann commented on SPARK-23370:

This goes as far as I understand:
* JDBC driver: Once we create the result set object using the JDBC driver, it contains all the actual data as well as the metadata for the concerned DB table.
* Query an additional table (all_tab_columns): This would entail creating another result set that captures the metadata for the concerned DB table as data (rows). Overhead:
** Connection: None, since it will use pooling.
** Retrieving the result: Low impact, since we will push the predicate down to the DB to filter data only for the concerned table.

I believe the all_tab_columns table should be queried on the driver and broadcast to the executors. Does this make sense? Can we get input from someone else as well?

> Spark receives a size of 0 for an Oracle Number field and defaults the field
> type to be BigDecimal(30,10) instead of the actual precision and scale
> -----------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Minor
> Attachments: Oracle KB Document 1266785.pdf
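To make the driver-side shape concrete, a minimal sketch of the query-once-then-broadcast pattern described above. All names here are hypothetical; fetchPrecisionScale stands in for the all_tab_columns query shown in an earlier comment:

{code:scala}
// Sketch: fetch the (small) precision/scale map once on the driver,
// then broadcast it so executors can consult it when decoding NUMBER columns.
import org.apache.spark.sql.SparkSession

case class PrecScale(precision: Int, scale: Int)

// Hypothetical helper: would issue
//   SELECT column_name, data_precision, data_scale
//   FROM all_tab_columns WHERE table_name = ?
def fetchPrecisionScale(table: String): Map[String, PrecScale] = Map.empty

val spark = SparkSession.builder().getOrCreate()
val meta = fetchPrecisionScale("MY_TABLE")       // runs once, on the driver
val metaBc = spark.sparkContext.broadcast(meta)  // shipped to executors

// Executor-side code can then read metaBc.value to resolve the true
// precision/scale for each NUMBER column.
{code}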
[jira] [Commented] (SPARK-23372) Writing empty struct in parquet fails during execution. It should fail earlier during analysis.
[ https://issues.apache.org/jira/browse/SPARK-23372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359202#comment-16359202 ] Harleen Singh Mann commented on SPARK-23372:

[~dkbiswal] How would it throw the error at compile time? With reference to your statement: _"We should detect this earlier and failed during compilation of the query."_ The use of "compilation" in that sentence is probably incorrect; I would suggest changing it to "while preparing/executing the query".

> Writing empty struct in parquet fails during execution. It should fail
> earlier during analysis.
> ----------------------------------------------------------------------
>
> Key: SPARK-23372
> URL: https://issues.apache.org/jira/browse/SPARK-23372
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Dilip Biswal
> Priority: Minor
[jira] [Commented] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359200#comment-16359200 ] Harleen Singh Mann commented on SPARK-23370:

[~srowen] Yes, this should be implementable in the Oracle JDBC dialect. I want to start working on it once we agree it adds value. Do you mean overhead for Spark, for the Oracle DB, or for the developer? haha

> Spark receives a size of 0 for an Oracle Number field and defaults the field
> type to be BigDecimal(30,10) instead of the actual precision and scale
> -----------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Minor
> Attachments: Oracle KB Document 1266785.pdf
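For discussion, a minimal sketch of where such a fix could live in a JDBC dialect. It uses Spark's existing JdbcDialect API (getCatalystType and canHandle are real methods); the dialect object and the lookupPrecisionScale helper are hypothetical, not the actual patch:

{code:scala}
// Sketch only: intercept the buggy size == 0 case for Oracle NUMBER
// columns instead of falling back to DecimalType(30, 10).
import java.sql.Types
import org.apache.spark.sql.jdbc.JdbcDialect
import org.apache.spark.sql.types.{DataType, DecimalType, MetadataBuilder}

object PatchedOracleDialect extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:oracle")

  override def getCatalystType(
      sqlType: Int, typeName: String, size: Int,
      md: MetadataBuilder): Option[DataType] = {
    if (sqlType == Types.NUMERIC && size == 0) {
      // size == 0 is the buggy case described in the issue: consult
      // ALL_TAB_COLUMNS (hypothetical helper) for the true precision/scale.
      lookupPrecisionScale(typeName).map { case (p, s) => DecimalType(p, s) }
    } else {
      None // fall through to the existing dialect rules
    }
  }

  // Hypothetical: would query ALL_TAB_COLUMNS for DATA_PRECISION/DATA_SCALE.
  private def lookupPrecisionScale(col: String): Option[(Int, Int)] = None
}
{code}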
[jira] [Commented] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358495#comment-16358495 ] Harleen Singh Mann commented on SPARK-23370:

[~q79969786] Your suggestion would work, but only if one knows in advance that a column of Number type was added to the Oracle table using an alter table statement. This information is seldom available to developers.

[~srowen] True, it is an Oracle issue. If everyone agrees that Spark has nothing to do with it, we may close this issue as is. However, I feel there may be merit in evaluating the way Spark fetches schema information over JDBC, i.e. resultSet.getMetaData.getColumnType vs. all_tab_columns. Thanks.

> Spark receives a size of 0 for an Oracle Number field and defaults the field
> type to be BigDecimal(30,10) instead of the actual precision and scale
> -----------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Minor
> Attachments: Oracle KB Document 1266785.pdf
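To show the current path being questioned, a minimal sketch of schema discovery via ResultSetMetaData (connection details and table name are placeholders). This is where the bogus precision of 0 surfaces on an affected Oracle/JDBC combination:

{code:scala}
// Sketch only: inspect what the JDBC driver reports for each column.
import java.sql.DriverManager

val conn = DriverManager.getConnection(
  "jdbc:oracle:thin:@//dbhost:1521/ORCL", "user", "password") // placeholders
try {
  val rs = conn.createStatement().executeQuery(
    "SELECT * FROM some_table WHERE 1 = 0") // metadata only, no rows
  val md = rs.getMetaData
  for (i <- 1 to md.getColumnCount) {
    // For a NUMBER column added via ALTER TABLE on an unpatched 11g,
    // getPrecision can report 0 here, which is what trips Spark up.
    println(s"${md.getColumnName(i)}: type=${md.getColumnType(i)} " +
      s"precision=${md.getPrecision(i)} scale=${md.getScale(i)}")
  }
} finally conn.close()
{code}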
[jira] [Commented] (SPARK-23372) Writing empty struct in parquet fails during execution. It should fail earlier during analysis.
[ https://issues.apache.org/jira/browse/SPARK-23372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358294#comment-16358294 ] Harleen Singh Mann commented on SPARK-23372:

What is your proposal for fixing this?

> Writing empty struct in parquet fails during execution. It should fail
> earlier during analysis.
> ----------------------------------------------------------------------
>
> Key: SPARK-23372
> URL: https://issues.apache.org/jira/browse/SPARK-23372
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Dilip Biswal
> Priority: Minor
[jira] [Updated] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harleen Singh Mann updated SPARK-23370:
---------------------------------------
Shepherd: Sean Owen (was: Xiangrui Meng)

> Spark receives a size of 0 for an Oracle Number field and defaults the field
> type to be BigDecimal(30,10) instead of the actual precision and scale
> -----------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Major
> Attachments: Oracle KB Document 1266785.pdf
[jira] [Updated] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harleen Singh Mann updated SPARK-23370:
---------------------------------------
Shepherd: Xiangrui Meng

> Spark receives a size of 0 for an Oracle Number field and defaults the field
> type to be BigDecimal(30,10) instead of the actual precision and scale
> -----------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Major
> Attachments: Oracle KB Document 1266785.pdf
[jira] [Updated] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harleen Singh Mann updated SPARK-23370:
---------------------------------------
Description: edited (formatting-only changes; the text of the description is otherwise unchanged)

> Spark receives a size of 0 for an Oracle Number field and defaults the field
> type to be BigDecimal(30,10) instead of the actual precision and scale
> -----------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Major
> Attachments: Oracle KB Document 1266785.pdf
[jira] [Updated] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harleen Singh Mann updated SPARK-23370:
---------------------------------------
Summary: Spark receives a size of 0 for an Oracle Number field and defaults the field type to be BigDecimal(30,10) instead of the actual precision and scale (was: Spark receives a size of 0 for an Oracle Number field defaults the field type to be BigDecimal(30,10))

> Spark receives a size of 0 for an Oracle Number field and defaults the field
> type to be BigDecimal(30,10) instead of the actual precision and scale
> -----------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Major
> Attachments: Oracle KB Document 1266785.pdf
[jira] [Updated] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field defaults the field type to be BigDecimal(30,10)
[ https://issues.apache.org/jira/browse/SPARK-23370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harleen Singh Mann updated SPARK-23370:
---------------------------------------
Attachment: Oracle KB Document 1266785.pdf

> Spark receives a size of 0 for an Oracle Number field defaults the field type
> to be BigDecimal(30,10)
> ------------------------------------------------------------------------------
>
> Key: SPARK-23370
> URL: https://issues.apache.org/jira/browse/SPARK-23370
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.1
> Environment: Spark 2.2
> Oracle 11g
> JDBC ojdbc6.jar
> Reporter: Harleen Singh Mann
> Priority: Major
> Attachments: Oracle KB Document 1266785.pdf
[jira] [Created] (SPARK-23370) Spark receives a size of 0 for an Oracle Number field defaults the field type to be BigDecimal(30,10)
Harleen Singh Mann created SPARK-23370:
---------------------------------------

Summary: Spark receives a size of 0 for an Oracle Number field defaults the field type to be BigDecimal(30,10)
Key: SPARK-23370
URL: https://issues.apache.org/jira/browse/SPARK-23370
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.2.1
Environment: Spark 2.2
Oracle 11g
JDBC ojdbc6.jar
Reporter: Harleen Singh Mann

Currently, on a JDBC read Spark obtains the schema of a table using resultSet.getMetaData.getColumnType

This works 99.99% of the time, except when a column of Number type is added to an Oracle table using an alter statement. This is essentially an Oracle DB + JDBC bug that has been documented in the Oracle KB, and patches exist. [oracle KB|https://support.oracle.com/knowledge/Oracle%20Database%20Products/1266785_1.html]

As a result of the above-mentioned issue, Spark receives a size of 0 for the field and defaults the field type to BigDecimal(30,10) instead of what it actually should be. This is done in OracleDialect.scala. This may cause issues in the downstream application, where relevant information may be lost due to the changed precision and scale.

The versions that are affected are:
JDBC - Version: 11.2.0.1 and later [Release: 11.2 and later]
Oracle Server - Enterprise Edition - Version: 11.1.0.6 to 11.2.0.1 [Release: 11.1 to 11.2]

Proposed approach:
There is another way of fetching the schema information in Oracle: through the all_tab_columns view. If we use this view to fetch the precision and scale of Number types, the above issue is mitigated.

I can implement the changes, but require some inputs on the approach from the gatekeepers here.

PS. This is also my first Jira issue and my first fork of Spark, so I will need some guidance along the way. (Yes, I am a newbie to this.) Thanks...
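Regarding the proposed approach, a minimal sketch of how a corrected dialect would be wired in, assuming the hypothetical PatchedOracleDialect sketched in a comment further above. JdbcDialects.registerDialect is an existing Spark API; the URL, table, and credentials are placeholders:

{code:scala}
// Sketch only: register the custom dialect before any JDBC reads.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.jdbc.JdbcDialects

val spark = SparkSession.builder().getOrCreate()
JdbcDialects.registerDialect(PatchedOracleDialect)

val df = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
  .option("dbtable", "MY_TABLE")
  .option("user", "user")
  .option("password", "password")
  .load()

// With a fix along these lines, NUMBER columns added via ALTER TABLE would
// show their true precision/scale here instead of decimal(30,10).
df.printSchema()
{code}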