This answer seems reasonable to me. I worked on Project Talent during 1967, which had some similar goals and data. See
https://en.m.wikipedia.org/wiki/Project_Talent

Our data was for thousands of high school students and our software was all written in Fortran.

---
Frank C. Wimberly
140 Calle Ojo Feliz,
Santa Fe, NM 87505

505 670-9918
Santa Fe, NM

On Fri, Jan 6, 2023, 11:32 PM Pieter Steenekamp <piet...@randcontrols.co.za> wrote:

> I asked https://chat.openai.com/chat and here is the conversation:
>
> *Pieter Steenekamp*
> can you suggest a solution for the following problem "I'm hoping someone
> here could help out. Let's imagine I had some data where each row was a
> person's career. We could list major events every year. For example: 2004
> they were hired, 2007 they get a promotion, 2010 they leave for a
> different company, 2012 they come back at a higher level, 2015 get a
> promotion, then no change until 2022. Let's say I had data like this for
> roughly 2 million people, and that there are around 10 different types of
> changes that could happen during any time period (could be yearly,
> quarterly, monthly, I can make it how I want). I was hoping we could ask a
> computer to tell us if there were "types of careers" that people had. We
> could say "put all these careers into 4 buckets" or "7 buckets" based on
> similarity. Then we could look at the piles the computer made and try to
> make sense of them. One type might be "company man" for people who tend to
> stay in place for 20 or more years, another type could be a "rotator", who
> leaves and returns every 3 years or so. Etc. The point is, I want a
> computer to make the piles for me, rather than trying to come up with
> potential piles a priori. Are there methods for doing this? I know it's a
> problem we've *talked* about a lot, but I don't know if there are
> solutions."
>
> *https://chat.openai.com/chat*
> There are several approaches that you could use to group the careers of
> these 2 million people into clusters based on similarity.
> One approach is to use unsupervised machine learning techniques,
> specifically clustering algorithms, to automatically group the data into
> clusters. Some popular clustering algorithms that you could use for this
> problem include k-means, hierarchical clustering, and DBSCAN
> (Density-Based Spatial Clustering of Applications with Noise). You could
> also use dimensionality reduction techniques, such as PCA (Principal
> Component Analysis) or t-SNE (t-Distributed Stochastic Neighbor
> Embedding), to reduce the dimensionality of the data and then use a
> clustering algorithm to group the data into clusters. Another approach is
> to use supervised machine learning techniques, where you would need to
> manually label a subset of the data with the cluster labels, and then use
> this labeled data to train a classifier to predict the cluster labels for
> the remaining data. It's worth noting that these approaches will require
> some feature engineering, where you will need to extract relevant features
> from the data that can be used to represent the careers of the people in
> the dataset. Once you have identified and extracted these features, you
> can then use them as input to the machine learning algorithms to group the
> data into clusters.
>
> *Pieter Steenekamp*
> would it then be possible to get for example if there were "types of
> careers" that people had?
>
> *https://chat.openai.com/chat*
> Yes, once you have grouped the careers of the 2 million people into
> clusters using one of the approaches I mentioned above, you can then
> analyze the characteristics of each cluster to identify common patterns and
> trends within each cluster. This will allow you to identify "types of
> careers" that people have, based on the similarity of their career events.
> For example, if one of the clusters is characterized by people who have had
> long tenures at a single company, you could label this cluster as "company
> man" careers.
> Similarly, if another cluster is characterized by people who
> frequently switch between companies, you could label this cluster as
> "rotator" careers. You can then use this information to gain insights into
> the types of careers that are represented in the dataset, and potentially
> draw conclusions about the factors that may influence career paths and
> success.
>
> *Pieter Steenekamp*
> can you propose python code to do something like this please
>
> *https://chat.openai.com/chat*
> Sure, here is an example of how you could use the scikit-learn library in
> Python to group the careers of the 2 million people into clusters using the
> k-means algorithm:
>
>     import numpy as np
>     from sklearn.cluster import KMeans
>
>     # Load the data into a NumPy array (one row per person, numeric columns)
>     data = np.loadtxt('career_data.txt')
>
>     # Create the k-means model
>     model = KMeans(n_clusters=4)
>
>     # Fit the model to the data
>     model.fit(data)
>
>     # Predict the cluster labels for each data point
>     labels = model.predict(data)
>
>     # Print the cluster labels
>     print(labels)
>
> This code will group the data into 4 clusters, as specified by the
> n_clusters parameter. You can adjust this parameter to change the number of
> clusters that the data is grouped into.
> To extract features from the data that can be used as input to the k-means
> model, you will need to preprocess the data and select relevant features.
> This will likely involve some feature engineering, which will depend on the
> specific characteristics of the data and the goals of your analysis.
> I hope this helps! Let me know if you have any questions.
>
> On Fri, 6 Jan 2023 at 19:34, Eric Charles <eric.phillip.char...@gmail.com>
> wrote:
>
>> Greetings all,
>> I'm hoping someone here could help out. Let's imagine I had some data
>> where each row was a person's career. We could list major events every
>> year.
>>
>> For example: 2004 they were hired, 2007 they get a promotion, 2010
>> they leave for a different company, 2012 they come back at a higher level,
>> 2015 get a promotion, then no change until 2022.
>>
>> Let's say I had data like this for roughly 2 million people, and that
>> there are around 10 different types of changes that could happen during any
>> time period (could be yearly, quarterly, monthly, I can make it how I
>> want).
>>
>> I was hoping we could ask a computer to tell us if there were "types of
>> careers" that people had. We could say "put all these careers into 4
>> buckets" or "7 buckets" based on similarity. Then we could look at the
>> piles the computer made and try to make sense of them.
>>
>> One type might be "company man" for people who tend to stay in place for
>> 20 or more years, another type could be a "rotator", who leaves and returns
>> every 3 years or so. Etc. The point is, I want a computer to make the piles
>> for me, rather than trying to come up with potential piles a priori.
>>
>> Are there methods for doing this? I know it's a problem we've *talked*
>> about a lot, but I don't know if there are solutions.
>>
>> Any help would be appreciated.
>>
>> Best,
>> Eric
>>
>> <echar...@american.edu>
>> -. --- - / ...- .- .-.. .. -.. / -- --- .-. ... . / -.-. --- -.. .
>> FRIAM Applied Complexity Group listserv
>> Fridays 9a-12p Friday St. Johns Cafe / Thursdays 9a-12p Zoom
>> https://bit.ly/virtualfriam
>> to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
>> FRIAM-COMIC http://friam-comic.blogspot.com/
>> archives: 5/2017 thru present
>> https://redfish.com/pipermail/friam_redfish.com/
>> 1/2003 thru 6/2021 http://friam.383.s1.nabble.com/
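
The quoted answer says some feature engineering is needed before clustering but does not show any. Here is a minimal sketch of what that step might look like, using only the Python standard library. Everything in it is an assumption made for illustration: the event names in EVENT_TYPES, the (year, event) pair layout, and the choice of features (counts per event type, total span, longest stretch with no change). The real data has around 10 event types and may call for quite different features.

```python
from collections import Counter

# Hypothetical event vocabulary; the thread only says there are ~10 types.
EVENT_TYPES = ["hired", "promotion", "left", "returned"]


def career_features(events, end_year):
    """Turn one career into a fixed-length numeric row.

    events is a non-empty list of (year, event_type) pairs (assumed layout);
    end_year is the last year covered by the data.
    """
    events = sorted(events)
    years = [year for year, _ in events]
    counts = Counter(kind for _, kind in events)

    # Gaps between consecutive events, plus the quiet stretch up to end_year.
    boundaries = years + [end_year]
    gaps = [later - earlier for earlier, later in zip(boundaries, boundaries[1:])]

    row = [counts[kind] for kind in EVENT_TYPES]  # how often each event occurred
    row.append(end_year - years[0])               # total span of the career
    row.append(max(gaps))                         # longest stretch with no change
    return row


# The example career from the original question.
example = [(2004, "hired"), (2007, "promotion"), (2010, "left"),
           (2012, "returned"), (2015, "promotion")]
print(career_features(example, end_year=2022))  # [1, 2, 1, 1, 18, 7]
```

Rows like this, one per person, could be stacked into the array that the quoted k-means code fits. Features on very different scales (counts versus years) would probably need to be standardized first, since k-means works on Euclidean distance.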