Hi Stephen,

On 07/05/2020 15:46, Stephen Finucane wrote:
> Welcome, Rohit.
>
> On Thu, 2020-05-07 at 09:00 +0530, Rohit Sarkar wrote:
>>> Message IDs and patch IDs should also be stable/immutable. Message IDs,
>>> being a property of _mails_, will be the same across different patchwork
>>> instances that consume the same mail. Patch IDs, being a property of the
>>> specific database that ingested the patch, will vary from patchwork
>>> instance to patchwork instance.
>>>
>>>> Daniel, I have in mind that there is already some kind of infrastructure
>>>> in patchwork for receiving raw patches... AFAIR, Mete implemented an
>>>> export routine that eases the first initial import. Is there a
>>>> possibility to reliably "receive all new patches since my last pull"?
>>>
>>> I struggle a little bit to follow the who's importing and exporting from
>>> whom, but:
>>>
>>>  - There is now code to extract patches in one go from a patchwork
>>>    instance. I'd caution you that there are gigabytes of patches in the
>>>    databases of production instances going back over a decade, so you
>>>    might find that a challenging data set to acquire and work with.
>>>
>>>  - In terms of 'catching up': I think you're asking if Patchwork will
>>>    let you _export_ all patches since your last pull, rather than asking
>>>    if patchwork will let you import patches? I think that makes the most
>>>    sense in context. If that's the case, then the way I would do that
>>>    is:
>>>
>>>    a) observe the highest patch ID in the project you are tracking, as
>>>       patch IDs are always increasing. Note that the same cannot be said
>>>       about dates - patchwork instances, due to the quirks of email,
>>>       often get mail out-of-order. You probably want something like:
>>>
>>>
>>> http://patchwork.ozlabs.org/api/patches/?order=-id&project=linuxppc-dev
>>>
>>>    b) Retrieve all email from your last pull to that patch ID.
>>>       Bear in mind that it is likely that more email will arrive while
>>>       you are doing this - hence why I suggest fetching the patch ID
>>>       first! Be careful also of pagination as that can also change if
>>>       new patches come in. One day we will fix this by adding
>>>       cursor-based pagination as well but we haven't done it yet. As
>>>       such you probably want to do this with a different query with the
>>>       opposite ordering, something like:
>>>
>>>
>>> http://patchwork.ozlabs.org/api/patches/?since=2020-05-01T00%3A00%3A00&project=linuxppc-dev
>>>
>>>       (order=id is implied but wouldn't hurt to specify it, and an API
>>>       version, in your final code)
>>
>> I might be missing something, but why does it matter if more patches
>> arrive while pulling? PaStA can pull all patches since its last pull as
>> you mentioned.
>
> I'll also point out the events API [1]. This would be a lighter way to
> probe for new patches. In particular, you probably care about the
> 'patch-created' event, which occurs every time we receive a new patch.
> You can poll for these like so:
>
>
> http://patchwork.ozlabs.org/api/events/?category=patch-created&since=2020-05-01T00%3A00%3A00&project=linuxppc-dev
>
> Also, this doesn't exist yet, but it would be quite easy to add the
> concept of webhooks. With a webhook infrastructure, you'd be able to
> configure Patchwork to POST a JSON payload to an arbitrary URL every
> time we e.g. receive a new patch. This would allow Patchwork to push
Uh, does that scale?

> things to you instead of having to poll. You would have to wait for a
> future 3.0 release for this though, assuming you wanted to run against
> a public instance.

Both approaches, webhooks and events, are synchronous methods that only
work if there are no interruptions. I'd prefer asynchronous methods.

Thanks
  Ralf

>
> Stephen
>
> [1]
> https://patchwork.readthedocs.io/en/latest/api/rest/schemas/v1.2/#get--api-1.2-events-
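
For reference, the catch-up procedure quoted above (note the highest patch
ID first, then fetch everything since the last pull in ascending ID order)
could be sketched roughly as follows. This is only an illustration under
some assumptions: build_url and select_new are made-up helper names, only
the query parameters that appear in the URLs quoted in this thread are
used, and the API response is assumed to be a list of objects with an
integer "id" field. No network access is done here.

```python
from urllib.parse import urlencode

# Base URL taken from the example URLs quoted in this thread.
API_BASE = "http://patchwork.ozlabs.org/api"


def build_url(endpoint, **params):
    """Build a query URL for an endpoint such as 'patches' or 'events'."""
    return f"{API_BASE}/{endpoint}/?{urlencode(params)}"


def select_new(patches, last_seen_id, ceiling_id):
    """Keep patches with last_seen_id < id <= ceiling_id, sorted by ID.

    `patches` is assumed to be an iterable of dicts with an integer 'id'.
    De-duplicating by ID guards against entries repeating when pages
    shift because new patches arrive during the fetch.
    """
    by_id = {p["id"]: p for p in patches
             if last_seen_id < p["id"] <= ceiling_id}
    return [by_id[i] for i in sorted(by_id)]


# Step a): query whose first result carries the highest patch ID.
newest_url = build_url("patches", order="-id", project="linuxppc-dev")

# Step b): query for everything since the previous pull, lowest ID first.
catchup_url = build_url("patches", since="2020-05-01T00:00:00",
                        order="id", project="linuxppc-dev")
```

Capping the result at the ID recorded in step a) means that patches
arriving during the fetch are simply left for the next pull.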