Re: How use pattern matching in spark

Bjørn Jørgensen Thu, 14 Jul 2022 14:12:38 -0700

Use

from datetime import date

today = date.today()

day = today.strftime("%d/%m/%Y")
print(day)

to get today's date in the same dd/mm/yyyy format as the header.
strftime already returns a string, so casting it is optional: testday = str(day)

Compare the two with == (df_date being the date string taken from the header):

day == df_date

which gives True or False.
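
The comparison above can be sketched in plain Python as follows. This is only a rough sketch: it assumes the header always starts with a dd/mm/yyyy date, as in the sample "13/07/2022abc", and the helper names header_date and header_is_today are made up for illustration.

```python
import re
from datetime import date, datetime


def header_date(header_line):
    """Return the leading dd/mm/yyyy date of the header as a date, or None."""
    # ASSUMPTION: the date sits at the very start of the header line.
    m = re.match(r"(\d{2}/\d{2}/\d{4})", header_line)
    if m is None:
        return None
    try:
        return datetime.strptime(m.group(1), "%d/%m/%Y").date()
    except ValueError:
        # Looks like a date but is not a valid one, e.g. 99/99/2022.
        return None


def header_is_today(header_line, today=None):
    """Compare the header date with the system date (or a supplied date)."""
    if today is None:
        today = date.today()
    return header_date(header_line) == today


# Fixed "today" so the example is reproducible.
print(header_is_today("13/07/2022abc", today=date(2022, 7, 13)))  # True
```

Parsing into a date object instead of comparing strings also rejects impossible dates, which a plain string comparison would not notice.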

Use loc to get the text of a row:

test_str = test.loc[1][0]

A string can be indexed like a list in Python, so

test_str[2]

'1'
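
For the tail record, a regex may be less fragile than a fixed index. The sketch below is only an illustration: it assumes the row count is the first two digits after '#' (so "#01000count" gives 1), which is a guess from the single sample line, and the names tail_count and counts_match are hypothetical.

```python
import re


def tail_count(tail_line, width=2):
    """Return the row count stored in the tail record, or None."""
    # ASSUMPTION: the count is the first `width` digits right after '#'.
    m = re.match(r"#(\d{%d})" % width, tail_line)
    return int(m.group(1)) if m else None


def counts_match(lines):
    """Check the tail count against the number of rows between header and tail."""
    body = lines[1:-1]  # everything except the header and the tail record
    return tail_count(lines[-1]) == len(body)


lines = [
    "13/07/2022abc",
    "PWJ   PWJABC 513213217ABC GM20 05.0000 6/20/39",
    "#01000count",
]
print(tail_count(lines[-1]))  # 1
print(counts_match(lines))    # True
```

If the count field has a different width in the real files, only the width argument needs to change.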




On Wed, 13 Jul 2022 at 08:25, Sid <flinkbyhe...@gmail.com> wrote:

> Hi Team,
>
> I have a dataset like the below one in .dat file:
>
> 13/07/2022abc
> PWJ   PWJABC 513213217ABC GM20 05.0000 6/20/39
> #01000count
>
> Now I want to extract the header and tail records, which I was able to
> do. Next, from the header, I need to extract the date and match it with the
> current system date. Also, for the tail record, I need to match the number
> of actual rows (i.e. 1 in my case) with the value mentioned in the last row.
> That is a kind of pattern matching, so that I can find '1' in the last row
> and say that the actual record count and the value in the tail record match
> each other.
>
> How can I do this? Any links would be helpful. I think regex pattern
> matching should help.
>
> Also, I will be getting 3 formats for now, i.e. CSV, .DAT and .TXT
> files.
>
> So, in my view, I could do validation for all 3 file formats using
> spark.read.text().rdd and perform the intended operations on the RDDs. Just the
> validation part.
>
> Therefore, I wanted to understand: is there any better way to achieve this?
>
> Thanks,
> Sid
>


-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297

Untitled2.pdf
Description: Adobe PDF document

