Re: [FRIAM] Sorting Algorithm? AI? Identifying "types" within data

Nicholas Thompson Mon, 09 Jan 2023 10:11:31 -0800

To my uneducated eye, this seemed like one of Jon’s problems.

Sent from my Dumb Phone

On Jan 7, 2023, at 6:23 AM, Frank Wimberly <wimber...@gmail.com> wrote:

This answer seems reasonable to me. In 1967 I worked on Project Talent, which had some similar goals and data. See

https://en.m.wikipedia.org/wiki/Project_Talent

Our data covered thousands of high school students, and our software was all written in Fortran.

---
Frank C. Wimberly
140 Calle Ojo Feliz,
Santa Fe, NM 87505

505 670-9918
Santa Fe, NM

On Fri, Jan 6, 2023, 11:32 PM Pieter Steenekamp <piet...@randcontrols.co.za> wrote:

I asked https://chat.openai.com/chat and here is the conversation:

Pieter Steenekamp
can you suggest a solution for the following problem "I'm hoping someone here could help out. Let's imagine I had some data where each row was a person's career. We could list major events every year. For example: 2004 they were hired, 2007 they get a promotion, 2010 they leave for a different company, 2012 they come back at a higher level, 2015 get a promotion, then no change until 2022. Let's say I had data like this for roughly 2 million people, and that there are around 10 different types of changes that could happen during any time period (could be yearly, quarterly, monthly, I can make it how I want). I was hoping we could ask a computer to tell us if there were "types of careers" that people had. We could say "put all these careers into 4 buckets" or "7 buckets" based on similarity. Then we could look at the piles the computer made and try to make sense of them. One type might be "company man" for people who tend to stay in place for 20 or more years, another type could be a "rotator", who leaves and returns every 3 years or so. Etc. The point is, I want a computer to make the piles for me, rather than trying to come up with potential piles a priori. Are there methods for doing this? I know it's a problem we've *talked* about a lot, but I don't know if there are solutions."
https://chat.openai.com/chat
There are several approaches that you could use to group the careers of these 2 million people into clusters based on similarity. One approach is to use unsupervised machine learning techniques, specifically clustering algorithms, to automatically group the data into clusters. Some popular clustering algorithms that you could use for this problem include k-means, hierarchical clustering, and DBSCAN (Density-Based Spatial Clustering of Applications with Noise). You could also use dimensionality reduction techniques, such as PCA (Principal Component Analysis) or t-SNE (t-Distributed Stochastic Neighbor Embedding), to reduce the dimensionality of the data and then use a clustering algorithm to group the data into clusters. Another approach is to use supervised machine learning techniques, where you would need to manually label a subset of the data with the cluster labels, and then use this labeled data to train a classifier to predict the cluster labels for the remaining data. It's worth noting that these approaches will require some feature engineering, where you will need to extract relevant features from the data that can be used to represent the careers of the people in the dataset. Once you have identified and extracted these features, you can then use them as input to the machine learning algorithms to group the data into clusters.
Pieter Steenekamp
would it then be possible to find out, for example, whether there were "types of careers" that people had?
https://chat.openai.com/chat
Yes, once you have grouped the careers of the 2 million people into clusters using one of the approaches I mentioned above, you can then analyze the characteristics of each cluster to identify common patterns and trends within each cluster. This will allow you to identify "types of careers" that people have, based on the similarity of their career events. For example, if one of the clusters is characterized by people who have had long tenures at a single company, you could label this cluster as "company man" careers. Similarly, if another cluster is characterized by people who frequently switch between companies, you could label this cluster as "rotator" careers. You can then use this information to gain insights into the types of careers that are represented in the dataset, and potentially draw conclusions about the factors that may influence career paths and success.
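The "analyze the characteristics of each cluster" step above can be sketched roughly as follows. This is only an illustration, not anything proposed in the thread: the helper summarize_clusters, the two feature names, and the toy rows are all made up for the example. It averages each feature within each cluster so the piles can be read and named by a person.

```python
from collections import defaultdict
from statistics import mean


def summarize_clusters(rows, labels, feature_names):
    """Return {cluster_label: {"size": n, feature_name: mean_value, ...}}."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    summary = {}
    for label, members in sorted(grouped.items()):
        stats = {"size": len(members)}
        for i, name in enumerate(feature_names):
            stats[name] = mean(row[i] for row in members)
        summary[label] = stats
    return summary


# Made-up toy data: [years at longest-held employer, number of employer changes]
rows = [[20, 0], [22, 1], [3, 6], [4, 5]]
labels = [0, 0, 1, 1]  # stand-in for the labels a clustering model would assign

summary = summarize_clusters(rows, labels, ["longest_tenure", "employer_changes"])
for label, stats in summary.items():
    print(label, stats)
```

In this toy case, a cluster with a long average tenure and few employer changes would be a candidate for the "company man" label, and the opposite pattern for "rotator". The naming itself remains a human judgment.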
Pieter Steenekamp
can you propose python code to do something like this please
https://chat.openai.com/chat
Sure, here is an example of how you could use the scikit-learn library in Python to group the careers of the 2 million people into clusters using the k-means algorithm:
from sklearn.cluster import KMeans
# Load the data into a NumPy array
data = ...  # placeholder: replace with an array of shape (n_samples, n_features)
# Create the k-means model
model = KMeans(n_clusters=4)
# Fit the model to the data
model.fit(data)
# Predict the cluster labels for each data point
labels = model.predict(data)
# Print the cluster labels
print(labels)
This code will group the data into 4 clusters, as specified by the n_clusters parameter. You can adjust this parameter to change the number of clusters that the data is grouped into.
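One hedged way to explore "4 buckets" versus "7 buckets" is to fit the model for several values of n_clusters and compare a quality score. The sketch below is not from the thread. It uses synthetic data (three well-separated blobs) as a stand-in for a real feature matrix, and compares k-means inertia and the silhouette score from scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for a real feature matrix: three well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
data = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

scores = {}
for k in range(2, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    labels = model.fit_predict(data)
    # Higher silhouette means tighter, better-separated clusters.
    scores[k] = silhouette_score(data, labels)
    print(k, round(model.inertia_, 1), round(scores[k], 3))

# Pick the k with the highest silhouette score.
best_k = max(scores, key=scores.get)
print("best k by silhouette:", best_k)
```

The silhouette score compares every pair of points, so on something like 2 million rows it would probably need the sample_size argument of silhouette_score, or a random subsample of the data. A score like this is only a guide; the piles still have to make sense to whoever reads them.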
To extract features from the data that can be used as input to the k-means model, you will need to preprocess the data and select relevant features. This will likely involve some feature engineering, which will depend on the specific characteristics of the data and the goals of your analysis.
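As a rough illustration of the feature engineering mentioned above, the sketch below turns one career (a list of dated events) into a fixed-length numeric vector of the kind k-means expects. Everything here is an assumption made for the example: the event names in EVENT_TYPES, the career_features helper, and the choice of features (one count per event type plus the longest stretch with no change). The thread only says there are around 10 event types and does not name them.

```python
from collections import Counter

# Hypothetical event codes; the real data would have its own (around 10) types.
EVENT_TYPES = ["hired", "promoted", "left", "returned"]


def career_features(events, end_year):
    """Turn one career, a list of (year, event_type) pairs, into a numeric vector.

    Features: one count per event type, followed by the longest run of
    years between consecutive events (the last run ends at end_year).
    """
    events = sorted(events)
    counts = Counter(kind for _, kind in events)
    years = [year for year, _ in events] + [end_year]
    longest_gap = max((b - a for a, b in zip(years, years[1:])), default=0)
    return [counts.get(kind, 0) for kind in EVENT_TYPES] + [longest_gap]


# The example career from the original question, observed through 2022.
example = [
    (2004, "hired"), (2007, "promoted"), (2010, "left"),
    (2012, "returned"), (2015, "promoted"),
]
features = career_features(example, end_year=2022)
print(features)
```

Stacking one such vector per person gives the (n_samples, n_features) array that the k-means code expects. Which features to use is a design choice, and different choices will produce different piles.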
I hope this helps! Let me know if you have any questions.

On Fri, 6 Jan 2023 at 19:34, Eric Charles <eric.phillip.char...@gmail.com> wrote:
Greetings all,
I'm hoping someone here could help out. Let's imagine I had some data where each row was a person's career. We could list major events every year.

For example: 2004 they were hired, 2007 they get a promotion, 2010 they leave for a different company, 2012 they come back at a higher level, 2015 get a promotion, then no change until 2022.

Let's say I had data like this for roughly 2 million people, and that there are around 10 different types of changes that could happen during any time period (could be yearly, quarterly, monthly, I can make it how I want).

I was hoping we could ask a computer to tell us if there were "types of careers" that people had. We could say "put all these careers into 4 buckets" or "7 buckets" based on similarity. Then we could look at the piles the computer made and try to make sense of them.

One type might be "company man" for people who tend to stay in place for 20 or more years, another type could be a "rotator", who leaves and returns every 3 years or so. Etc. The point is, I want a computer to make the piles for me, rather than trying to come up with potential piles a priori.

Are there methods for doing this? I know it's a problem we've *talked* about a lot, but I don't know if there are solutions.

Any help would be appreciated.

Best,
Eric

-. --- - / ...- .- .-.. .. -.. / -- --- .-. ... . / -.-. --- -.. .
FRIAM Applied Complexity Group listserv
Fridays 9a-12p Friday St. Johns Cafe / Thursdays 9a-12p Zoom https://bit.ly/virtualfriam
to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives: 5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
1/2003 thru 6/2021 http://friam.383.s1.nabble.com/
