[ https://issues.apache.org/jira/browse/SPARK-49529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17879956#comment-17879956 ]

Bruce Robbins commented on SPARK-49529:
---------------------------------------

This actually matches Java 17's behavior.

Try this code in the REPL:
{noformat}
import java.time._

val utcZid = ZoneId.of("UTC")
val istZid = ZoneId.of("Asia/Calcutta")

// 0001-01-01 00:00:00 in UTC, the same input the issue reports.
val utcZdt = ZonedDateTime.of(LocalDateTime.of(1, 1, 1, 0, 0, 0, 0), utcZid)

// Convert to the same instant in Asia/Calcutta.
val istZdt = utcZdt.withZoneSameInstant(istZid)
println(istZdt)
{noformat}
The code prints the following:
{noformat}
0001-01-01T05:53:28+05:53:28[Asia/Calcutta]
{noformat}
Note that Java reports a timezone offset of +05:53:28 for that instant. For 
dates before a zone's first recorded transition in the tz database, java.time 
falls back to the zone's local mean time (LMT); India's offsets were not 
standardized until the second half of the 19th century, so 0001-01-01 gets 
Calcutta's LMT of +05:53:28.
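
A quick way to see where that offset comes from is to ask the zone's rules 
directly. This is a minimal Java sketch (the class and variable names are 
mine, not from the issue) that prints the offset in effect at that instant 
and the first transition the tz database records for the zone:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneRules;

public class LmtOffsetDemo {
    public static void main(String[] args) {
        ZoneRules rules = ZoneId.of("Asia/Calcutta").getRules();

        // Offset in effect at 0001-01-01T00:00:00Z: the zone's local mean time.
        Instant ancient = Instant.parse("0001-01-01T00:00:00Z");
        System.out.println(rules.getOffset(ancient));  // +05:53:28

        // First transition recorded in the tz database for this zone.
        System.out.println(rules.getTransitions().get(0));
    }
}
```

Running the same lookup for Asia/Tokyo (the zone the short ID JST typically 
resolves to) shows an LMT offset of +09:18:59, which is consistent with the 
09:18:59 values for pre-1888 dates in the issue's reproduction below.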


> Incorrect results from from_utc_timestamp function
> --------------------------------------------------
>
>                 Key: SPARK-49529
>                 URL: https://issues.apache.org/jira/browse/SPARK-49529
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0, 3.4.1, 4.0.0, 3.5.2, 3.4.4, 3.5.3
>            Reporter: Ankit Prakash Gupta
>            Priority: Major
>
> The values returned by from_utc_timestamp are erratic and inconsistent for 
> timestamps before the year 1850.
>  
> {code:java}
> ❯ JAVA_HOME=/Library/Java/JavaVirtualMachines/openjdk-17.jdk/Contents/Home/ 
> bin/spark-shell --master local --conf spark.sql.session.timeZone=UTC
> WARNING: Using incubator modules: jdk.incubator.vector
> ... (driver startup logs elided) ...
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 4.0.0-preview1
>       /_/
>
> Using Scala version 2.13.14 (OpenJDK 64-Bit Server VM, Java 17.0.11)
> Type in expressions to have them evaluated.
> Type :help for more information.
> ... (SparkContext, executor, and BlockManager startup logs elided; 
> OS: Mac OS X 14.5, aarch64) ...
> Spark context Web UI available at http://192.168.28.3:4040
> Spark context available as 'sc' (master = local, app id = 
> local-1725590569043).
> Spark session available as 'spark'.
>
> scala> :paste
> // Entering paste mode (ctrl-D to finish)
>
> java.util.TimeZone.setDefault(java.util.TimeZone.getTimeZone("UTC"))
> val df = Seq(java.sql.Timestamp.valueOf("0001-01-01 00:00:00"),
>   java.sql.Timestamp.valueOf("1900-01-01 00:00:00"),
>   java.sql.Timestamp.valueOf("1799-12-31 00:00:00"),
>   java.sql.Timestamp.valueOf("1850-12-31 00:00:00"),
>   new java.sql.Timestamp(0)).toDF("ts")
> df.withColumn("ts_trans", from_utc_timestamp($"ts", "JST")).show
>
> // Exiting paste mode... now interpreting.
> ... (deprecation warning and CodeGenerator INFO logs elided) ...
> +-------------------+-------------------+
> |                 ts|           ts_trans|
> +-------------------+-------------------+
> |0001-01-01 00:00:00|0001-01-01 09:18:59|
> |1900-01-01 00:00:00|1900-01-01 09:00:00|
> |1799-12-31 00:00:00|1799-12-31 09:18:59|
> |1850-12-31 00:00:00|1850-12-31 09:18:59|
> |1970-01-01 00:00:00|1970-01-01 09:00:00|
> +-------------------+-------------------+
> val df: org.apache.spark.sql.DataFrame = [ts: timestamp]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
