s below, but here I am doing explode and distinct again. I need to perform
the action without doing this, since it will again impact performance for
the huge data.
Thanks,
On Thu, May 16, 2024 at 8:33 AM Karthick Nk wrote:
> Thanks Mich,
>
> I ha
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, quote "one test result is worth one-thousand
> expert opinions" (Wernher von Braun).
and "b" both exist in the
> array. So Spark is correctly performing the join. It looks like you need to
> find another way to model this data to get what you want to achieve.
>
> Are the values of "a" and "b" related to each other in any way?
>
> - Da
>
>
> On Thu, 2 May 2024 at 21:25, Karthick Nk wrote:
>
>> Hi All,
>>
>> Requirements:
Hi All,
Requirements:
I am working on a data flow that uses a view definition (the view
definition is already defined in the schema); multiple tables are used in
the view definition. We want to stream the view data into an Elastic
index whenever any of the tables (used in the view
Hi @all,
I am using a PySpark program to write data into an Elastic index using an
upsert operation (sample code snippet below).

def writeDataToES(final_df):
    write_options = {
        "es.nodes": elastic_host,
        "es.net.ssl": "false",
        "es.nodes.wan.only": "true",
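For comparison, below is a fuller sketch of what the upsert configuration usually looks like with the elasticsearch-hadoop connector. The value of elastic_host and the document-id column "id" are placeholders for illustration, not taken from the original snippet:

```python
# Sketch of es-hadoop write options for an upsert; host and id column
# are placeholder assumptions.
elastic_host = "localhost"

write_options = {
    "es.nodes": elastic_host,
    "es.port": "9200",
    "es.net.ssl": "false",
    "es.nodes.wan.only": "true",
    "es.write.operation": "upsert",  # insert new docs, update existing ones
    "es.mapping.id": "id",           # column used as the Elasticsearch _id
}

def write_data_to_es(final_df, index_name):
    # Requires the elasticsearch-spark connector jar on the classpath.
    (final_df.write
        .format("org.elasticsearch.spark.sql")
        .options(**write_options)
        .mode("append")
        .save(index_name))
```

Without "es.mapping.id", the connector generates new document ids on every write, so the upsert semantics depend on that option.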
Hi All,
I have two dataframes with the below structure, and I have to join them.
The scenario: the join column is a string in one dataframe, while in the
other dataframe the join column is an array of strings, so we have to
inner join the two dataframes and get the data if the string value is
present in any of the arrays of strings.
Hi Team,
I am using structured streaming in PySpark on Azure Databricks, and in it I
am creating a temp view from a dataframe
(df.createOrReplaceTempView('temp_view')) for performing Spark SQL query
transformations.
There I am facing the issue that the temp view is not found, so as a
workaround I have to perform the required action in an
optimistic way?
Note: Please feel free to ask, if you need further information.
Thanks & regards,
Karthick
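In case it is useful to later readers: when a temp view registered on the driver is not visible inside a streaming query, one common workaround (a sketch with placeholder names, not necessarily matching the original setup) is to register the view inside foreachBatch, on the micro-batch's own dataframe:

```python
# Sketch only: register the temp view per micro-batch so the SQL runs
# against the session that owns the batch dataframe. The view name and
# query are placeholders.
def process_batch(batch_df, batch_id):
    batch_df.createOrReplaceTempView("temp_view")
    # DataFrame.sparkSession is a public attribute from Spark 3.3 onwards
    result = batch_df.sparkSession.sql("SELECT COUNT(*) AS n FROM temp_view")
    result.show()

# Attached to a stream roughly as:
# query = streaming_df.writeStream.foreachBatch(process_batch).start()
```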
On Mon, Oct 2, 2023 at 10:53 PM Karthick Nk wrote:
> Hi community members,
>
> In databricks adls2 delta tables, I need to perform the below
Hi community members,
In Databricks ADLS2 delta tables, I need to perform the below operation;
could you help me with your thoughts?
I have delta tables with one column of data type string, which
contains JSON data as a string. I need to do the following:
1. I have to update one
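Since the column holds JSON as a plain string, one generic approach (a sketch using the standard json module; the field name "status" is made up for illustration) is a small helper that can later be wrapped in a UDF:

```python
import json

def update_json_field(json_str, field, value):
    """Parse a JSON string, set one field, and serialize it back."""
    obj = json.loads(json_str)
    obj[field] = value
    return json.dumps(obj)

# Example with the hypothetical field "status"
updated = update_json_field('{"status": "old", "id": 1}', "status", "new")
```

This could be applied per row via F.udf(...) and withColumn; if the JSON schema is known in advance, from_json/to_json with an explicit schema is usually the more efficient route.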
Hi All,
It would be helpful if anyone could give pointers on the problem described.
Thanks
Karthick.
On Wed, Sep 20, 2023 at 3:03 PM Gowtham S wrote:
> Hi Spark Community,
>
> Thank you for bringing up this issue. We've also encountered the same
> challenge and are actively workin
time and consideration.
Thanks & regards,
Karthick.
the
tables in a concurrent manner; is this the issue (do we have any
constraint on it)?
For this kind of runtime behaviour, how can we usually identify the root
cause?
On Thu, May 11, 2023 at 9:37 PM Farhan Misarwala
wrote:
> Hi Karthick,
>
> I think I have seen this before and this
Hi,
I am trying to merge a dataframe with a delta table in Databricks, but I am
getting an error. I have attached the code snippet and error message for
reference below,
code:
[image: image.png]
error:
[image: image.png]
Thanks
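Since the snippet and error arrived as screenshots rather than text, here is the general shape of a Delta merge for reference (a sketch: the join key "id" and the table path are placeholders, and it assumes the delta-spark package is installed):

```python
# Sketch of a Delta Lake MERGE from PySpark; "id" as the join key and
# the target path are placeholder assumptions.
def merge_into_delta(spark, source_df, target_path):
    from delta.tables import DeltaTable  # provided by the delta-spark package

    target = DeltaTable.forPath(spark, target_path)
    (target.alias("t")
        .merge(source_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())
```

Posting the actual error text (rather than an image) would make it easier for the list to diagnose.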
Hi @all,
I am using monotonically_increasing_id() in a PySpark function for
removing one field from a JSON field in one column of a delta table;
please refer to the code below:
df = spark.sql(f"SELECT * from {database}.{table}")
df1 = spark.read.json(df.rdd.map(lambda x: x.data), multiLine =
Yeachan Park wrote:
> Hi,
>
> There's a config option for this. Try setting this to false in your spark
> conf.
>
> spark.sql.jsonGenerator.ignoreNullFields
>
> On Tuesday, October 4, 2022, Karthick Nk wrote:
>
>> Hi all,
>>
>> I need to convert pyspark
Hi all,
I need to convert a PySpark dataframe into JSON.
While converting, if all row values are null/None for a particular
column, that column gets removed from the data.
Could you suggest a way to do this? I need to convert the dataframe into
JSON with all columns preserved.
Thanks
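For reference, the spark.sql.jsonGenerator.ignoreNullFields setting suggested earlier in the thread addresses exactly this; a config fragment (Spark 3.0+):

```python
spark.conf.set("spark.sql.jsonGenerator.ignoreNullFields", "false")
# After this, df.toJSON() and df.write.json(...) emit "col": null
# instead of dropping all-null columns from the output.
```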