Re: Rename columns without manually setting them all

Bjørn Jørgensen Wed, 21 Jun 2023 07:14:37 -0700

data = {
    "Employee ID": [12345, 12346, 12347, 12348, 12349],
    "Name": ["Dummy x", "Dummy y", "Dummy z", "Dummy a", "Dummy b"],
    "Client": ["Dummy a", "Dummy b", "Dummy c", "Dummy d", "Dummy e"],
    "Project": ["abc", "def", "ghi", "jkl", "mno"],
    "Team": ["team a", "team b", "team c", "team d", "team e"],
    "01/01/2022": ["OFF", "WO", "WH", "WH", "OFF"],
    "02/01/2022": ["WO", "WO", "OFF", "WH", "WH"],
    "03/01/2022": ["WH", "WH", "WH", "OFF", "WO"],
    "04/01/2022": ["WH", "WO", "WO", "WH", "OFF"],
    "05/01/2022": ["WH", "WH", "OFF", "WO", "WO"],
}


df = ps.DataFrame(data)

# Define dates columns
dates_columns = df.columns[5:]

# Melt the dataframe and count the occurrences
df_melt = df.melt(id_vars=df.columns[:5], value_vars=dates_columns,
var_name="Date", value_name="Status")
df_counts = df_melt.groupby(["Date", "Status"]).size().unstack()
df_counts.sort_index(inplace=True)
df_counts

[image: image.png]

ons. 21. juni 2023 kl. 14:39 skrev Farshid Ashouri <
farsheed.asho...@gmail.com>:

> You can use selectExpr and stack to achieve the same effect in PySpark:
>
>
>
> df = spark.read.csv("your_file.csv", header=True, inferSchema=True)
>
> date_columns = [col for col in df.columns if '/' in col]
>
> df = df.selectExpr(["`Employee ID`", "`Name`", "`Client`", "`Project`",
> "`Team`”]
> + [f"stack({len(date_columns)}, {', '.join([f'`{col}`, `{col}` as
> `Status`' for col in date_columns])}) as (`Date`, `Status`)”])
>
> result = df.groupby("Date", "Status").count()
>
>
>
>
> On 21 Jun 2023, at 11:45, John Paul Jayme <john.ja...@tdcx.com.INVALID>
> wrote:
>
> Hi,
>
> This is currently my column definition :
> Employee ID Name Client Project Team 01/01/2022 02/01/2022 03/01/2022
> 04/01/2022 05/01/2022
> 12345     Dummy x Dummy a abc team a OFF WO  WH WH WH
> As you can see, the outer columns are just daily attendance dates. My goal
> is to count the employees who were OFF / WO / WH on said dates. I need to
> transpose them so it would look like this :
>
> <image.png>
>
> I am still new to pandas. Can you guide me on how to produce this? I am
> reading about melt() and set_index() but I am not sure if they are the
> correct functions to use.
>
>
>

-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297

Re: Rename columns without manually setting them all

Reply via email to