I'm trying to generate multiple rows from a single row. I have the schema
Name Id Date 0100 0200 0300 0400
and would like to turn it into a vertical format with the schema
Name Id Date Time
I have the code below and get the error
Caused by: java.lang.RuntimeException:
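In Spark this kind of wide-to-long reshape is usually done with the `stack()` SQL expression or an array plus `explode()`. As a minimal sketch of the per-row logic in plain Python (assuming the time-bucket columns are named "0100".."0400" as in the schema above, and that each non-empty bucket should become one output row whose Time is the bucket label):

```python
# Sketch: turn one wide row into one narrow row per time-bucket column.
# Column names below are taken from the schema in the question; the
# "keep only non-empty buckets" rule is an assumption.
TIME_COLS = ["0100", "0200", "0300", "0400"]

def unpivot(row, time_cols=TIME_COLS):
    """Expand a wide row dict into a list of narrow row dicts."""
    return [
        {"Name": row["Name"], "Id": row["Id"], "Date": row["Date"], "Time": t}
        for t in time_cols
        if row.get(t)  # skip empty/missing buckets (assumption)
    ]
```

On a DataFrame the same effect comes from `selectExpr("Name", "Id", "Date", "stack(4, '0100', `0100`, '0200', `0200`, ...) as (Time, value)")` or an equivalent `explode` over an array of structs.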
Hey Holden,
Thanks for your reply,
We are currently using a Python function that produces a Row(TS=LongType(),
bin=BinaryType()).
We use this function like this:
dataframe.rdd.map(my_function).toDF().write.parquet()
To reuse it in pandas_udf, we changed the return type to
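For context, a function of the shape described might look like this sketch; the input field names ("timestamp", "payload") are illustrative, not from the original message:

```python
# Hypothetical sketch of a map function whose return value toDF() can
# infer as Row(TS=LongType(), bin=BinaryType()).
def my_function(record):
    ts = int(record["timestamp"])    # becomes the LongType TS column
    blob = bytes(record["payload"])  # becomes the BinaryType bin column
    return ts, blob
```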
Not currently. What's the problem with pandas_udf for your use case?
On Wed, Jul 25, 2018 at 1:27 PM, Hichame El Khalfi wrote:
> Hi There,
>
>
> Is there a way to use Arrow format instead of Pickle but without using
> pandas_udf ?
>
>
> Thanks for your help,
>
>
> Hichame
>
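On the Arrow-without-pandas_udf question: in Spark 2.3/2.4, Arrow is only used on the pandas interchange paths (`toPandas()`, `createDataFrame(pandas_df)`, and pandas_udf); plain Python UDFs and RDD operations still serialize with Pickle. The pandas paths are switched on with this configuration (a spark-defaults.conf sketch):

```
spark.sql.execution.arrow.enabled  true
```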
I have enabled the spark.streaming.backpressure.enabled setting and also
set spark.streaming.backpressure.initialRate to 15000, but my Spark job
is not respecting these settings when reading from Kafka after a failure.
In my Kafka topic, around 500k records are waiting to be processed, and
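One thing worth checking (an assumption about this setup): `spark.streaming.backpressure.initialRate` only limits the very first batch, and with the Kafka direct stream the hard per-partition ceiling comes from `spark.streaming.kafka.maxRatePerPartition`. Without that cap, a restart against a large backlog can pull everything in one batch. A spark-defaults.conf sketch (the 10000 value is illustrative):

```
spark.streaming.backpressure.enabled        true
spark.streaming.backpressure.initialRate    15000
spark.streaming.kafka.maxRatePerPartition   10000
```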
We're experiencing the exact same issue while running load tests on Spark
2.3.1 with Structured Streaming and `mapGroupsWithState`.
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
Hi Elior,
Could you show the query that led to the exception?
Regards,
Jacek Laskowski
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams
Exception in thread "main" org.apache.spark.sql.AnalysisException:
collect_set(named_struct(value, country#123 AS value#346, count,
(cast(count(country#123) windowspecdefinition(campaign_id#104, app_id#93,
country#123, ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as
double) /
I am trying to build a real-time system with Spark (written in Scala), but
some of the algorithm files are written in Python. How can I call the
algorithm files?
Any idea how to make it work?
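One common pattern for calling Python from a Scala Spark job is `RDD.pipe`, which streams each partition's records through an external process over stdin/stdout. A minimal sketch of the Python side (the algorithm body and file name are placeholders):

```python
#!/usr/bin/env python
# Hypothetical stdin/stdout wrapper around a Python algorithm, so a Scala
# Spark job can invoke it via rdd.pipe("python algo.py"). The score()
# body is a placeholder standing in for the real algorithm.
import sys

def score(line):
    # placeholder algorithm: here, just the length of the input line
    return len(line)

def main(lines):
    # one output line per input line, stripped of the trailing newline
    return [str(score(line.rstrip("\n"))) for line in lines]

if __name__ == "__main__":
    for out in main(sys.stdin):
        sys.stdout.write(out + "\n")
```

The Scala side would then look like `rdd.pipe("python algo.py")`, with each element serialized to one line of text; for anything richer than line-per-record, a JSON encoding on both sides is a reasonable choice.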