Hi,
This adds a column with value "1" (string) *in all rows*:
    df = df.withColumn("uniqueID", lit("1"))
This counts the rows for all rows that have the same |uniqueID|,
*which are all rows*. The window does not make much sense.
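
To make the point concrete, here is a plain-Python sketch (not Spark code) of what a count over a window partitioned by a constant column ends up computing. The sample rows and the column name "count" are made up for illustration, and the assumption is that the window in question is partitioned by |uniqueID|.

```python
from collections import Counter

# Hypothetical sample rows standing in for the DataFrame.
rows = [{"name": "a"}, {"name": "b"}, {"name": "c"}]

# Equivalent of withColumn("uniqueID", lit("1")): the same constant in every row.
rows = [{**r, "uniqueID": "1"} for r in rows]

# Equivalent of a count over a window partitioned by uniqueID:
# first size each partition, then attach that size to every row in it.
partition_sizes = Counter(r["uniqueID"] for r in rows)
rows = [{**r, "count": partition_sizes[r["uniqueID"]]} for r in rows]

# There is only one partition, so every row carries the total row count.
print(partition_sizes)
print(rows)
```

Since there is a single partition, the windowed count is just the size of the whole dataset repeated on every row.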
And it orders all rows that have the same |uniqueID |by |uniqu
Hi Enrico,
Thanks for your time. Much appreciated.
I am expecting the payload to be a JSON string representing a record, like below:
{"A":"some_value","B":"some_value"}
Where A and B are the columns in my dataset.
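
As a rough illustration of that payload shape, the plain-Python sketch below serializes the selected columns of one row into a compact JSON object. The helper name row_to_payload and the sample row are hypothetical; in Spark, to_json(struct(...)) over the same columns should give a string of the same form.

```python
import json


def row_to_payload(row, columns):
    """Serialize the selected columns of one row as a compact JSON object."""
    return json.dumps({c: row[c] for c in columns}, separators=(",", ":"))


# Hypothetical row; column "C" is present but not part of the payload.
row = {"A": "some_value", "B": "some_value", "C": "ignored"}
print(row_to_payload(row, ["A", "B"]))
# {"A":"some_value","B":"some_value"}
```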
On Fri, Jun 10, 2022 at 6:09 PM Enrico Minack
wrote:
> Sid,
>
> just recognized yo
Sid,
I just recognized that you are using the Python API here. Then
|struct(*colsListToBePassed)| should be correct, given it takes a
list of strings.
Your method |call_to_cust_bulk_api| takes the argument |payload|, which is a
|Column|. This is then used in |custRequestBody|. That is pretty
strange
It is generally not a good idea to run processes as root on any
production machine. The main problem is security vulnerabilities that have
not yet been found or disclosed, so if someone malicious takes advantage of
a vulnerability like the ones described below, they can first get in, and
little by little
escalate pri
Hi Sid,
    finalDF = finalDF.repartition(finalDF.rdd.getNumPartitions()).withColumn(
        "status_for_batch",
        call_to_cust_bulk_api(policyUrl, to_json(struct(*colsListToBePassed))))
You are calling |withColumn| with the result of
|call_to_cust_bulk_api| as the second argument. That result
Hi Stelios,
Thank you so much for your help.
If I use |lit|, it fails with a "column not iterable" error.
Can you suggest a simple way of achieving my use case? I need to send the
entire column record by record to the API in JSON format.
TIA,
Sid
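
One possible way to send each record to the API is to wrap the HTTP call in a UDF, so that Spark passes the JSON string of each row to the function instead of a |Column| object. The sketch below is only an assumption about what is intended: the body of call_to_cust_bulk_api is not shown in this thread, and the names send_record, add_status_column, and the opener parameter are hypothetical. The pyspark imports are done inside add_status_column so that the HTTP helper can be tried without a Spark installation.

```python
import urllib.request


def send_record(url, payload_json, opener=urllib.request.urlopen):
    """POST one JSON string and return the HTTP status code as a string.

    `opener` is a hypothetical hook so the call can be faked in tests.
    """
    req = urllib.request.Request(
        url,
        data=payload_json.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener(req) as resp:
        return str(resp.status)


def add_status_column(df, url, cols):
    """Add a status column by calling the API once per row (hypothetical helper)."""
    # Imported lazily so send_record stays usable without pyspark installed.
    from pyspark.sql.functions import struct, to_json, udf
    from pyspark.sql.types import StringType

    # The UDF receives a plain Python string for each row, not a Column.
    call_api = udf(lambda payload: send_record(url, payload), StringType())
    return df.withColumn("status_for_batch", call_api(to_json(struct(*cols))))
```

With this shape, something like add_status_column(finalDF, policyUrl, colsListToBePassed) would build the column. Note that it issues one HTTP request per row, which may or may not be acceptable for a bulk API.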
On Fri, Jun 10, 2022 at 2:51 PM Stelios Philippou
w
Hi Everyone,
My Security team has raised concerns about the requirement for root group
membership for Spark running on Kubernetes. Does anyone know the reasons
for that requirement, how insecure it is, and any alternatives if at all?
Thanks,
Rodrigo
Sid
Then the issue is with the data, in the way you are creating it for that
specific column.
call_to_cust_bulk_api(policyUrl,to_json(struct(*colsListToBePassed)))
Perhaps wrap that in a
lit(call_to_cust_bulk_api(policyUrl,to_json(struct(*colsListToBePassed
else you will need to start sendin
Still, it is giving the same error.
On Fri, Jun 10, 2022 at 5:13 AM Sean Owen wrote:
> That repartition seems to do nothing? But yes, the key point is to use col()
>
> On Thu, Jun 9, 2022, 9:41 PM Stelios Philippou wrote:
>
>> Perhaps
>>
>>
>> finalDF.repartition(finalDF.rdd.getNumPartitions()).wi