hong-yangzhao opened a new issue, #1480:
URL: https://github.com/apache/sedona/issues/1480
## Expected behavior
Chinese characters in the shapefile attributes are read correctly.
## Actual behavior
Chinese text comes out garbled, and every attribute column is typed as string.
## Steps to reproduce the problem
Partial code (Java):
```java
public static void main(String[] args) {
    // Build a local Sedona-enabled SparkSession that uses Kryo serialization.
    SparkSession sedona = SedonaContext.builder()
            .master("local")
            .appName("test")
            .config("spark.serializer",
                    "org.apache.spark.serializer.KryoSerializer")
            .config("spark.kryo.registrator",
                    SedonaVizKryoRegistrator.class.getName())
            // Charset passed as a JVM option for the driver and the executors.
            .config("spark.driver.extraJavaOptions",
                    "-Dsedona.global.charset=utf8")
            .config("spark.executor.extraJavaOptions",
                    "-Dsedona.global.charset=utf8")
            .getOrCreate();
    SedonaContext.create(sedona);

    // Read the shapefile directory into a SpatialRDD, then convert it to a DataFrame.
    SpatialRDD rdd = ShapefileReader.readToGeometryRDD(
            JavaSparkContext.fromSparkContext(sedona.sparkContext()),
            "F:\\test_file\\test_shp");
    Dataset df = Adapter.toDf(rdd, sedona);
    df.show(1);
}
```
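
A quick check that may help narrow this down: the snippet above passes `sedona.global.charset` as a JVM option through the Spark config. With `master("local")` the driver runs inside the JVM that is already executing `main`, and the Spark configuration docs note that `spark.driver.extraJavaOptions` set from application code in client mode does not take effect because the driver JVM has already started. The sketch below only prints what the current JVM sees; the class name `CharsetPropertyCheck` and the `<unset>` placeholder are made up for illustration. If it prints `<unset>`, passing `-Dsedona.global.charset=utf8` on the `java` command line or calling `System.setProperty` before building the session would be things to try, though whether that fixes the garbling is not confirmed here.

```java
class CharsetPropertyCheck {
    // System property name taken from the -D option used in the snippet above.
    static final String KEY = "sedona.global.charset";

    // Returns the value visible in the current JVM, or "<unset>" if the
    // property was never set (the placeholder string is an illustrative choice).
    static String visibleCharset() {
        return System.getProperty(KEY, "<unset>");
    }

    public static void main(String[] args) {
        // Run this in the same JVM as the Spark driver to see what it received.
        System.out.println(KEY + " = " + visibleCharset());
    }
}
```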
I followed the example on the official website ('API Docs -> Sedona with
Apache Spark -> Vector data -> Constructor -> Read ESRI Shapefile -> SparkSQL
example') to read a shp file. The file displays Chinese characters correctly
in GIS software, but when Sedona reads it, the Chinese text is garbled whether
I print it to the console or write it to a file. In addition, `printSchema()`
reports every column as string, even though my shp file contains data types
other than string.
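
To illustrate the kind of garbling described here, the following is a minimal sketch of how text turns unreadable when bytes are decoded with a different charset than the one they were written in. It assumes, purely for illustration, UTF-8 bytes decoded as ISO-8859-1; the actual encoding of the shp file's attribute table and the charset Sedona falls back to are not known from this report. `MojibakeDemo` and `decode` are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Decode raw attribute bytes with the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file's
        // own encoding does not affect the demo.
        String original = "\u5317\u4eac";
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        // Same bytes, two decoders: only the matching one restores the text.
        String matching = decode(raw, StandardCharsets.UTF_8);
        String mismatched = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println("matching charset restores text:   " + matching.equals(original));
        System.out.println("mismatched charset restores text: " + mismatched.equals(original));
        System.out.println("mismatched length: " + mismatched.length());
    }
}
```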

Because every column is read as string, the decimal values show up in
scientific notation. Am I doing something wrong, or is there another cause?
Please advise, thank you!
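
As a stopgap for the scientific-notation symptom, a string value can be converted back to plain decimal text with `java.math.BigDecimal`. This is only a sketch of the conversion, not a fix for the column typing; the sample values and the names `PlainDecimalDemo` and `toPlain` are made up for illustration.

```java
import java.math.BigDecimal;

class PlainDecimalDemo {
    // Parse a numeric string (plain or scientific notation) and render it
    // without an exponent.
    static String toPlain(String value) {
        return new BigDecimal(value).toPlainString();
    }

    public static void main(String[] args) {
        // Hypothetical attribute values as they might appear in a string column.
        System.out.println(toPlain("1.2345678E7"));
        System.out.println(toPlain("1.5E-4"));
    }
}
```

`BigDecimal` is used rather than `double` so that the digits in the string are preserved exactly instead of being rounded to binary floating point.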
## Settings
Sedona version = ?
1.5.1
Apache Spark version = ?
3.5.0
Apache Flink version = ?
API type = Scala, Java, Python?
Java
Scala version = 2.11, 2.12, 2.13?
2.13
JRE version = 1.8, 1.11?
1.8
Python version = ?
Environment = Standalone, AWS EC2, EMR, Azure, Databricks?
Standalone, local