I have a patch available for the voted version at https://issues.apache.org/jira/browse/HIVE-19077. Let me know what you think.
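
For anyone skimming the thread, the idea behind the voted version (option 3 in the quoted messages below) can be sketched roughly as follows. This is only an illustration, not code from the patch: the class and method names are hypothetical, and it assumes the record of seen patches is a plain in-memory set, whereas a real ptest server would presumably need to persist it.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch of "only the first queued job runs" deduplication.
 * Not taken from HIVE-19077; names and in-memory storage are assumptions.
 */
class SeenPatchRegistry {
    // Keys are "<jira id>/<patch file name>" for every patch already accepted.
    private final Set<String> seen = new HashSet<>();

    /**
     * Returns true the first time a (jira, patch name) pair is offered,
     * and false for every later job that carries the same pair.
     */
    synchronized boolean shouldRun(String jiraId, String patchName) {
        // Set.add returns false when the key was already present.
        return seen.add(jiraId + "/" + patchName);
    }
}
```

With something like this, a second job queued for HIVE-XYZ.1.patch would be skipped, while HIVE-XYZ.2.patch would still get its own run.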
On 27 April 2018 at 15:55, Adam Szita <sz...@cloudera.com> wrote:
> Thanks to all for the responses.
> As I see it, option 3 is the winning one. Next week I'm going to start
> working on this one then (unless there are any objections, of course).
>
> Adam
>
> On 26 April 2018 at 05:48, Deepak Jaiswal <djais...@hortonworks.com> wrote:
>
>> +1 for option 3. Thanks Adam for taking this up again.
>>
>> Regards,
>> Deepak
>>
>> On 4/25/18, 4:54 PM, "Thejas Nair" <thejas.n...@gmail.com> wrote:
>>
>> Option 3 seems reasonable. I believe that used to be the state a while
>> back (maybe 12 months back or so).
>> When 2nd ptest for same jira runs, it checks if the latest patch has
>> already been run.
>>
>>
>> On Wed, Apr 25, 2018 at 7:37 AM, Peter Vary <pv...@cloudera.com> wrote:
>> > I would vote for version 3. It would solve the big patch problem,
>> > and remove the unnecessary test runs too.
>> >
>> > Thanks,
>> > Peter
>> >
>> >> On Apr 25, 2018, at 11:01 AM, Adam Szita <sz...@cloudera.com> wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I had a patch (HIVE-19077) committed with the original aim being the
>> >> prevention of wasting resources when running ptest on the same patch
>> >> multiple times:
>> >> It is supposed to manage scenarios where a developer uploads
>> >> HIVE-XYZ.1.patch, that gets queued in jenkins, then before execution
>> >> HIVE-XYZ.2.patch (for the same jira) is uploaded and that gets queued also.
>> >> When the first patch starts to execute ptest will see that patch2 is the
>> >> latest patch and will use that. After some time the second queued job will
>> >> also run on this very same patch.
>> >> This is just pointless and causes long queues to progress slowly.
>> >>
>> >> My idea was to remove these duplicates from the queue where I'd only keep
>> >> the latest queued element if I see more queued entries for the same jira
>> >> number.
>> >> It's like when you go grocery shopping and you're already in line
>> >> at the cashier but you realise you also need e.g. milk. You go grab it and join
>> >> the END of the queue. So I believe losing one's spot in the queue is a
>> >> fair punishment for amending one's patch.
>> >>
>> >> That said, Deepak made me realise that for big patches this will be very
>> >> cumbersome due to the need for constant rebasing to avoid conflicts on patch
>> >> application.
>> >> I have three proposals now:
>> >>
>> >> 1: Leave this as it currently is (with HIVE-19077 committed) - *only the
>> >> latest queued job will run of the same jira*
>> >> pros: no wasting resources to run the same patches more times, 'scheduling'
>> >> is fair: if you amend your patch you may lose your original spot in the
>> >> queue
>> >> cons: big patches that are prone to conflicts will be hard to get executed
>> >> in ptest, devs will have to wait more time for their ptest results if they
>> >> amend their patches
>> >>
>> >> 2: *Add a safety switch* to this queue checking feature (currently proposed
>> >> in HIVE-19077), deduplication can be switched off on request
>> >> pros: same as 1st, + ability to have more control on this mechanism i.e.
>> >> turn it off for big/urgent patches
>> >> cons: big patches that use the switch might still waste resources, also devs
>> >> might use safety switch inappropriately for their own evil benefit :)
>> >>
>> >> 3: Deduplication the other way around - *only the first queued job will run
>> >> of the same jira*, ptest server will keep record of patch names and won't
>> >> execute a patch with a seen name and jira number again
>> >> pros: same patches will not be executed more times accidentally, big
>> >> patches won't be a problem either, devs will get their ptest result back
>> >> earlier even if more jobs are triggered for same jira/patch name
>> >> cons: scheduling is less fair: devs can reserve their spots in the queue
>> >>
>> >>
>> >> (0: restore original: I'm strongly against this, ptest queue is already too
>> >> big as it is, we have to at least try and decrease its size by
>> >> deduplicating jiras in it)
>> >>
>> >> I'm personally fine with any of the 1,2,3 methods listed above, with my
>> >> favourites being 2 and 3.
>> >> Let me know which one you think is the right path to go down.
>> >>
>> >> Thanks,
>> >> Adam
>> >>
>> >> On 20 April 2018 at 20:14, Eugene Koifman <ekoif...@hortonworks.com> wrote:
>> >>
>> >>> Would it be possible to add patch name validation when it gets added to
>> >>> the queue?
>> >>> Currently I think it fails when the bot gets to the patch if it’s not
>> >>> named correctly.
>> >>> More common for branch patches
>> >>>
>> >>> On 4/20/18, 8:20 AM, "Zoltan Haindrich" <k...@rxd.hu> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> Some time ago the ptest queue worked the following way:
>> >>>
>> >>> * for some reason ATTACHMENT_ID was not set by the upstream jira scanner
>> >>> tool; this triggered a feature in Jenkins: if multiple patches were
>> >>> uploaded for the same ticket, they didn't trigger new runs (because
>> >>> the parameters were the same)
>> >>> * this was fixed at some point...around that time I started
>> >>> getting multiple ptest executions for the same ticket - because I've
>> >>> fixed a minor typo after submitting the first version of my patch...
>> >>> * currently we also have a jenkins queue reader inside the ptest
>> >>> job...which checks if the ticket is in the queue right now; and if it
>> >>> is, it just exits...this logic kinda restores the earlier behaviour;
>> >>> with the exception that if I upload a patch every day and the queue is
>> >>> longer than 1 day (like now); I will never get a ptest run :D
>> >>> * ...now here I come! I've just removed my patch from yesterday; because
>> >>> I want a ptest run with my newest patch; and the only way to force the
>> >>> above logic to do that....is by removing that attachment..
>> >>>
>> >>>
>> >>> So...could we go back to the state when the attachment_id was ignored?
>> >>> I would recommend removing the ATTACHMENT_ID from the jenkins
>> >>> parameters...
>> >>>
>> >>> cheers,
>> >>> Zoltan
>> >>>
>> >>> JenkinsQueueUtil.java:
>> >>> https://github.com/apache/hive/blob/f8a671d8cfe8a26d1d12c51f93207ec92577c796/testutils/ptest2/src/main/java/org/apache/hive/ptest/api/client/JenkinsQueueUtil.java#L82