cyrilou242 commented on PR #15015: URL: https://github.com/apache/pinot/pull/15015#issuecomment-2656597420
> Hey @cyrilou242 👋 I was taking a second look at `PinotIngestionRestletResource` and I realized that this class already has this functionality 😅 This class contains 2 methods/endpoints `ingestFromFile` and `ingestFromURI`, so in order to download a file from URI, one would need to use `/ingestFromURI` endpoint instead of `ingestFromFile` 🙌 > > The test, you've mentioned covers `/ingestFromURI` as well 🙌 > > So, I guess we can close both the PR and the issue, can't we? Hey [DaniilRoman](https://github.com/DaniilRoman) thanks for looking into this! ingestFromURI has limitations, in particular: > `ingestfromURI` can ingest from S3, but does not support public buckets. > `ingestfromURI` is focused on object storage, with a lot of feature related to ingesting by blob pattern (ie by "logical folder" of objects), _it does not support public files over http_ I had a chat with someone from the Pinot team and my understanding was the issue was relevant. I don't think public file over http will ever be implemented in the `ingestfromURI` endpoint. (you can see this endpoint as "ingestFromBucket" instead). So I think the change you propose is still a good contribution. I'll ask again just in case. About the test: My understanding is the test is not testing your change, it is testing with the data coming from the request payload directly. I think we should test with an input that uses your change, namely do the same thing but with data coming from a public URL. We can use the same data as here: https://github.com/apache/pinot/blob/f3e8666273f50ada419a4c2a27b99333dca8513c/pinot-controller/src/test/java/org/apache/pinot/controller/api/PinotIngestionRestletResourceStatelessTest.java#L78 I put it in a public gist to begin with - eventually I think we will put it in the pinot codebase directly, we can review this after the test is implemented. https://gist.githubusercontent.com/cyrilou242/ee95e5c8735755b9453136715b9d330b/raw/ea52d9e5c45dcf003ebb0cca25e4f2057e0b2502/pinotIngestionRestletResourceTest_data.csv -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
