GitHub user Yicong-Huang added a comment to the discussion: Task ideas for the dkNet-AI · Apache Texera Agent Hackathon
It would be great to have it! Also, consider supporting folders? (We have a lot of use cases that read a folder of images, etc.)

I actually had another direction in mind before: have an LLM operator that reads a sample (or a part) of a file and creates an operator on the fly to read/parse the entire file into table format, then runs that operator as a source. So it is not a pre-designed generic operator that supports all kinds of files, but a dynamically generated operator designed specifically for that single file (or a folder of similar files).

One use case: I have a business report in PDF that embeds some tables or other information, and reading it out would be very useful. But PDFs all have different structures, which we may not be able to know before seeing the file. PDFs from the same source may share a similar structure (e.g., business reports generated by the same company across different months).

GitHub link: https://github.com/apache/texera/discussions/5059#discussioncomment-16933200
