Use

    from datetime import date
    today = date.today()
    day = today.strftime("%d/%m/%Y")
    print(day)

to get today's date.

Cast it to a string:

    testday = str(day)

Compare with ==:

    day == df_date

This gives True or False.

Use loc to get the row text:

    test_str = test.loc[1][0]

A string can be indexed like a list in Python, so

    test_str[2]

gives '1'.

On Wed, 13 Jul 2022 at 08:25, Sid <flinkbyhe...@gmail.com> wrote:

> Hi Team,
>
> I have a dataset like the one below in a .dat file:
>
> 13/07/2022abc
> PWJ PWJABC 513213217ABC GM20 05.0000 6/20/39
> #01000count
>
> I want to extract the header and tail records, which I was able to do.
> Now, from the header, I need to extract the date and match it with the
> current system date. Also, for the tail record, I need to match the number
> of actual rows, i.e. 1 in my case, with the value mentioned in the last row.
> That is a kind of pattern matching, so that I can find '1' in the last row
> and say that the actual record count and the value in the tail record match
> each other.
>
> How can I do this? Any links would be helpful. I think regex pattern
> matching should help.
>
> Also, I will be getting 3 formats for now, i.e. CSV, .DAT and .TXT
> files.
>
> So, as I see it, I could do validation for all 3 of these file formats using
> spark.read.text().rdd and performing the intended operations on RDDs. Just the
> validation part.
>
> Therefore, I wanted to understand: is there any better way to achieve this?
>
> Thanks,
> Sid

-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297
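The steps above can be put together in plain Python as a rough sketch. It assumes the layout of the single sample in this thread: the header starts with a dd/mm/YYYY date, the last line is the tail record, and the row count is the character at index 2 of that record (as in test_str[2] above). The function name validate and its parameters are made up for illustration, and the real position and width of the count field may well differ.

```python
from datetime import date, datetime


def validate(lines, today=None, count_slice=slice(2, 3)):
    """Return (date_ok, count_ok) for a header/body/trailer file.

    Assumptions (taken from the one sample in the thread, not a spec):
    - the first non-empty line is the header and begins with dd/mm/YYYY
    - the last non-empty line is the trailer
    - the declared row count sits at count_slice within the trailer
    """
    today = today or date.today()
    rows = [ln.strip() for ln in lines if ln.strip()]
    header, body, trailer = rows[0], rows[1:-1], rows[-1]

    # Parse the first 10 characters of the header as a date.
    header_date = datetime.strptime(header[:10], "%d/%m/%Y").date()

    # Read the declared count from the trailer and compare to actual rows.
    declared = int(trailer[count_slice])

    return header_date == today, declared == len(body)


sample = [
    "13/07/2022abc",
    "PWJ PWJABC 513213217ABC GM20 05.0000 6/20/39",
    "#01000count",
]

# The date is pinned here so the example is repeatable; omit `today`
# to compare against the current system date.
date_ok, count_ok = validate(sample, today=date(2022, 7, 13))
print(date_ok, count_ok)
```

Since only the first and last lines need special handling, the same function works for the CSV, .DAT and .TXT variants as long as they share this header/trailer layout; pass a different count_slice if the count field turns out to be wider than one digit.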