GitHub user SarahAsad23 closed a discussion: Improving the Usability of ML Operators
After teaching ML to a group of students using Texera, we noticed that having to split the data twice when creating an ML workflow is not very intuitive. The splits produce:

1. Training data - passed to the ML operator
2. Test data - passed to the ML operator
3. Prediction data - used to do the predictions

For reference, here is the workflow that was used:

<img width="1972" height="544" alt="IMG_9364" src="https://github.com/user-attachments/assets/473b12d9-9f6f-4e8f-90bc-51083b94c80e" />

I think it would be more intuitive for the train/test split to happen internally within the ML operator, which could remove the first split.

GitHub link: https://github.com/apache/texera/discussions/4202
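
To make the idea of an internal train/test split more concrete, here is a rough sketch in plain Python. It is not Texera's operator API: the class name `InternalSplitClassifier`, the `test_ratio` and `seed` parameters, and the majority-label "model" are all hypothetical placeholders chosen only to show the shape of the interface. The operator receives one labeled dataset, holds out a test portion itself, and exposes the test accuracy alongside `predict`.

```python
import random
from collections import Counter


class InternalSplitClassifier:
    """Hypothetical operator that splits its labeled input internally."""

    def __init__(self, test_ratio=0.2, seed=0):
        if not 0.0 < test_ratio < 1.0:
            raise ValueError("test_ratio must be strictly between 0 and 1")
        self.test_ratio = test_ratio
        self.seed = seed
        self.majority_label = None
        self.test_accuracy = None

    def fit(self, rows, label_key):
        # Shuffle a copy so the caller's data is left untouched.
        shuffled = list(rows)
        random.Random(self.seed).shuffle(shuffled)

        # Internal train/test split: this stands in for the first split operator.
        n_test = max(1, int(len(shuffled) * self.test_ratio))
        test, train = shuffled[:n_test], shuffled[n_test:]
        if not train:
            raise ValueError("not enough rows to train on")

        # Placeholder model (an assumption): predict the most common training label.
        counts = Counter(row[label_key] for row in train)
        self.majority_label = counts.most_common(1)[0][0]

        # Evaluate on the held-out rows and keep the score for reporting.
        correct = sum(1 for row in test if row[label_key] == self.majority_label)
        self.test_accuracy = correct / len(test)
        return self

    def predict(self, rows):
        # One prediction per input row.
        return [self.majority_label for _ in rows]


# Usage: one labeled input for training/testing, one separate input for prediction.
labeled = [{"x": i, "label": "a" if i % 4 else "b"} for i in range(20)]
op = InternalSplitClassifier(test_ratio=0.25, seed=42).fit(labeled, "label")
predictions = op.predict([{"x": 100}, {"x": 101}])
```

With this shape, the workflow would only need to separate the prediction data from the labeled data; the ratio of the internal split becomes an operator property rather than a separate operator.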
