GitHub user usbrandon added a comment to the discussion: Further LLM Support
I agree with you here. What an incredible response. What we may need to ask @hansva and @bamaer is which methods we should use to represent those feature vectors in Hop. I have seen them materialized as JSON with an array of floating-point numbers. We might need to provide options. For example, a present bug/limitation is that a JSON type was recently added to Hop, but even the JSON steps do not behave correctly with it because they were created before the base type existed, so we still have to use Strings for JSON.

Where I am going with that: if we introduce an output to a vector database, we need to be able to convert between formats so we can feed the vector database what it expects, but also retrieve from it and output to a format humans can deal with, which is probably JSON. At least all of the Python scripts out there seem to focus on loading JSON into lists or dictionaries for further inspection. Those lists and dictionaries can in turn be given to Pandas and scikit-learn for deeper inspection by the user.

I love your idea of splitting out the different activities into their own plugins/steps within a single plugin.

GitHub link: https://github.com/apache/hop/discussions/4732#discussioncomment-11719979
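
To make the format-conversion point concrete, here is a minimal Python sketch of the round trip between a JSON array of floats (the human-friendly form) and a packed binary form. Everything in it is an assumption for illustration: the function names are hypothetical, and little-endian float32 is only one possible target encoding. It is not Hop's API or the format of any particular vector database.

```python
import json
import struct


def json_to_vector(text):
    """Parse a JSON array string into a list of Python floats."""
    values = json.loads(text)
    if not isinstance(values, list):
        raise ValueError("expected a JSON array of numbers")
    return [float(v) for v in values]


def vector_to_float32_bytes(vector):
    """Pack the vector as little-endian float32 (assumed target encoding)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def float32_bytes_to_vector(blob):
    """Unpack little-endian float32 bytes back into a list of floats."""
    if len(blob) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


def vector_to_json(vector):
    """Serialize the vector back to a JSON array string for inspection."""
    return json.dumps(vector)


if __name__ == "__main__":
    embedding = json_to_vector("[0.5, -1.25, 3.0]")
    blob = vector_to_float32_bytes(embedding)
    print(len(blob), vector_to_json(float32_bytes_to_vector(blob)))
```

One thing a sketch like this surfaces: float32 cannot represent every value exactly (0.1, for example), so a vector that passes through the binary form may not come back as the identical JSON text. That is another reason to offer options for the representation.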
