merca opened a new issue, #1065:
URL: https://github.com/apache/sedona/issues/1065
## Expected behavior
`st_makeline(collect_list(st_astext(st_point(lon,lat))))` should return a line.
## Actual behavior
Error: `IllegalArgumentException: function ST_MakeLine takes at least 2 argument(s)`
## Steps to reproduce the problem
```python
# Dummy data: (timestamp string, MMSI, hex-encoded WKB point)
ais_data = [
    ('2023-01-01 00:00:01', 259322000,
     '0101000020E6100000105839B4C8F62D40E9263108AC145140'),
    ('2023-01-01 00:00:07', 259322000,
     '0101000020E61000008104C58F31F72D404D840D4FAF145140'),
    ('2023-01-01 00:00:13', 259322000,
     '0101000020E61000008126C286A7F72D4062105839B4145140'),
    ('2023-01-01 00:00:19', 259322000,
     '0101000020E6100000D5E76A2BF6F72D40C66D3480B7145140'),
    ('2023-01-01 00:00:26', 259322000,
     '0101000020E6100000637FD93D79F82D40DBF97E6ABC145140')
]

# Create DataFrame with schema
from pyspark.sql.types import (
    StructType, StructField, StringType, IntegerType, TimestampType
)

schema = StructType([
    StructField("date_time_utc_s", StringType(), True),
    StructField("mmsi", IntegerType(), True),
    StructField("geo_wkb", StringType(), True)
])

# `spark` is the SparkSession provided by the Databricks notebook
ais_df = spark.createDataFrame(ais_data, schema)

# Cast the timestamp string to a real timestamp column
from pyspark.sql.functions import col

ais_df = ais_df.withColumn(
    "date_time_utc", col("date_time_utc_s").cast("timestamp")
).drop("date_time_utc_s")

# Register DataFrame as temporary table
ais_df.createOrReplaceTempView("temp_ais")
```
I have tried passing several variants of the collected array (geometry, WKB hex string, WKT string):
```sql
%sql
with ais as (
  select
    date_time_utc,
    mmsi,
    geo_wkb, -- so this is the binary, but stored as string (0101000020E61000000ABABDA4317A1B405E4BC8073D734F40)
    st_astext(st_geomfromwkb(geo_wkb)) as geo_wkt -- human readable point string (POINT (6.86933 62.9003))
  from temp_ais
  order by mmsi, date_time_utc
),
all_geo as (
  select
    date_time_utc,
    mmsi,
    geo_wkb,
    st_geomfromwkb(geo_wkb) as geo_wkb_point,
    geo_wkt
  from ais
),
point_list as (
  select
    date_time_utc::DATE as date_utc,
    mmsi,
    collect_list(geo_wkb_point) as geo_wkb_point_list, -- binary
    collect_list(geo_wkb) as geo_wkb_list, -- string
    collect_list(geo_wkt) as geo_wkt_list -- string
  from all_geo
  group by 1, 2
)
select date_utc, mmsi, st_MakeLine(geo_wkb_list) from point_list
-- select date_utc, mmsi, st_MakeLine(geo_wkb_list) from point_list -- IllegalArgumentException: function ST_MakeLine takes at least 2 argument(s)
-- select date_utc, mmsi, st_MakeLine(geo_wkb_point_list) from point_list -- IllegalArgumentException: function ST_MakeLine takes at least 2 argument(s)
-- select date_utc, mmsi, st_MakeLine(geo_wkt_list) from point_list -- IllegalArgumentException: function ST_MakeLine takes at least 2 argument(s)
```
Databricks Mosaic works fine with both `collect_list` and `array_agg`, but
Sedona will not acknowledge the array as an array.
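
To make the expected result concrete without depending on Sedona, here is a minimal standard-library sketch that decodes the five dummy points and joins them, in timestamp order, into a WKT `LINESTRING`. It assumes each `geo_wkb` string is a little-endian hex EWKB 2D point carrying an SRID, which is what the dummy values look like. The helper names `decode_ewkb_point` and `make_linestring_wkt` are hypothetical and are not part of Sedona; this is only an illustration of the output I would expect, not how `ST_MakeLine` is implemented.

```python
import struct

# Hex-encoded EWKB points copied from the dummy data, in timestamp order.
HEX_POINTS = [
    "0101000020E6100000105839B4C8F62D40E9263108AC145140",
    "0101000020E61000008104C58F31F72D404D840D4FAF145140",
    "0101000020E61000008126C286A7F72D4062105839B4145140",
    "0101000020E6100000D5E76A2BF6F72D40C66D3480B7145140",
    "0101000020E6100000637FD93D79F82D40DBF97E6ABC145140",
]

EWKB_SRID_FLAG = 0x20000000  # set in the type word when an SRID follows
WKB_POINT = 1                # WKB geometry type code for a point


def decode_ewkb_point(hex_str):
    """Return (srid, lon, lat) for a little-endian hex EWKB 2D point (assumed layout)."""
    raw = bytes.fromhex(hex_str)
    if raw[0] != 1:
        raise ValueError("this sketch only handles little-endian input")
    (type_word,) = struct.unpack_from("<I", raw, 1)
    if type_word != (EWKB_SRID_FLAG | WKB_POINT):
        raise ValueError("this sketch only handles 2D points with an SRID")
    # SRID (uint32) followed by x and y (float64), starting at byte offset 5.
    srid, lon, lat = struct.unpack_from("<Idd", raw, 5)
    return srid, lon, lat


def make_linestring_wkt(points):
    """Join (srid, lon, lat) tuples into a WKT LINESTRING, keeping input order."""
    coords = ", ".join(f"{lon:.5f} {lat:.5f}" for _, lon, lat in points)
    return f"LINESTRING ({coords})"


decoded_points = [decode_ewkb_point(h) for h in HEX_POINTS]
line_wkt = make_linestring_wkt(decoded_points)
print(line_wkt)
```

The output is a single `LINESTRING` with five vertices, which is the shape of result I expect from `st_MakeLine` on the collected array.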
## Settings
Sedona version = 1.5.0
Apache Spark version = 3.4.1
Apache Flink version = ?
API type = SQL, Python
Scala version = 2.12
JRE version = ?
Python version = ?
Environment = Databricks