Re: patchwork.ozlabs.org down, and e-mails not recorded

2018-02-13 Thread Andrew Donnellan

On 14/02/18 09:19, Thomas Petazzoni wrote:

Patches 1/3 and 2/3 in this series have been recorded, but not Patch
3/3.

This is really getting annoying for the Buildroot project, and we may
potentially "miss" contributions because of this: we entirely rely on
patchwork as our TODO-list, so if a patch is missing in patchwork, we
will forget about it. When only a few patches within a series are
missing, we obviously notice. But for single patches, when they are not
recorded, we simply miss them entirely.

Can we do something about this ? We're really happy otherwise by the
patchwork instance at ozlabs.org, and we would hate having to run our
own instance :-/


We've done some digging...

https://patchwork.ozlabs.org/patch/872845/
https://patchwork.ozlabs.org/patch/872846/

These are the other two patches in the series, and yet patchwork has 
assigned them two different series, so evidently something screwy is 
happening there.


sfr took a look at the mail logs for us, it looks like:

- all 3 emails were received and processed by our SMTP system
- patch 2 was received first, and patches 1 and 3 were received during 
the same second a couple of seconds later


Patch 3 has not hit the database at all, it's possible we've hit some 
race condition somewhere that prevented it from being saved at the same 
time as patch 1, or perhaps we ran into something completely 
different... we're going to add some extra logging to see if we can 
capture more info next time this happens.


Many apologies for the inconvenience!


Andrew

--
Andrew Donnellan  OzLabs, ADL Canberra
andrew.donnel...@au1.ibm.com  IBM Australia Limited

___
Patchwork mailing list
Patchwork@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/patchwork


Re: [PATCH v2] Avoid timezone confusion

2018-02-13 Thread Veronika Kabatova

- Original Message -
> From: "Daniel Axtens" 
> To: "Veronika Kabatova" , "Stephen Finucane" 
> 
> Cc: patchwork@lists.ozlabs.org
> Sent: Tuesday, February 13, 2018 12:36:37 PM
> Subject: Re: [PATCH v2] Avoid timezone confusion
> 
> Veronika Kabatova  writes:
> 
> > - Original Message -
> >> From: "Stephen Finucane" 
> >> To: vkaba...@redhat.com, patchwork@lists.ozlabs.org
> >> Sent: Wednesday, January 17, 2018 8:22:27 PM
> >> Subject: Re: [PATCH v2] Avoid timezone confusion
> >> 
> >> On Wed, 2018-01-17 at 19:18 +, Stephen Finucane wrote:
> >> > On Tue, 2018-01-16 at 18:58 +0100, vkaba...@redhat.com wrote:
> >> > > From: Veronika Kabatova 
> >> > > 
> >> > > Patchwork saves patches, comments etc with UTC timezone and reports
> >> > > this time when opening the patch details. However, internally
> >> > > generated
> >> > > processes such as events are reported with the instance's local
> >> > > time.
> >> > > There's nothing wrong with that and making PW timezone-aware would
> >> > > add
> >> > > useless complexity, but in a world-wide collaboration a lot of
> >> > > confusion
> >> > > may arise as the timezone is not reported at all. Instance's local
> >> > > time
> >> > > might be very different from the local time of CI integrating with
> >> > > PW,
> >> > > which is different from the local time of person dealing with it
> >> > > etc.
> >> > > 
> >> > > Use UTC everywhere by default instead of UTC for sumbissions and
> >> > > local
> >> > > timezone for internally generated events (which is not reported).
> >> > > 
> >> > > Signed-off-by: Veronika Kabatova 
> >> > 
> >> > What effect does this have on existing information in the database?
> >> > Does this mean they'll all remain UTC+12 or are they stored in UTC
> >> > format already? The Django docs [1] would lead me to suggest the
> >> > former, given that we don't have USE_TZ set to True.
> >> > 
> >> > I guess you can test that by setting up a deployment, creating
> >> > information, then switching this option over. I'd do this myself but
> >> > I'm not going to have time this week :(
> >> 
> >
> > The effect of the change for the existing data is the same as if the admin
> > decided
> > to override the default TIME_ZONE setting. In the end that's what this
> > change does,
> > move the instance time to UTC.
> >
> > If the TIME_ZONE setting is overriden (production.py) nothing changes, we
> > are only
> > changing the default.
> >
> >
> > To elaborate more, as we use timezone-naive format, the details depend on
> > database
> > backend and what underlying type from it Django uses to store datetimes.
> >
> > For mysql, they are saved as 'datetime' and this type doesn't know
> > timezones and
> > returns what you feed it. Django doesn't modify the times either, generated
> > time
> > is simply passed as a string. If you change timezone, 'datetime' is still
> > the
> > same. So submissions will have the right time, but internally triggered
> > changes
> > keep the original localized time and we can't really modify them since we
> > can't
> > retrieve the original timezone used.
> >
> > For postgres it's the exact opposite. Datetimes are stored as 'timestamp
> > with
> > time zone' in the DB, if no timezone is specified it uses the current local
> > one.
> > So it takes the times from submissions, thinking they are in local time
> > zone and
> > not UTC. OTOH, internally generated events are converted correctly exactly
> > because of the local timezone and consecutive conversion.
> >
> > I have not verified what types are used for other backends but I believe
> > they
> > are similar to the mysql example.
> >
> >
> >> Might be relevant.
> >> 
> >>   https://gist.github.com/aaugustin/2971450
> >> 
> >
> > That's something different. We need to unify the timezones used. Making PW
> > timezone-aware would not change anything, including the effect on old data
> > (exactly because the dates used are not unified, and we have no way to find
> > out the offsets the data were originally created with anyways).
> >
> Wow what a mess :( To make matters worse, Australia/Canberra osciallates
> between UTC+10 and UTC+11! So I agree we just completely give up on the
> correctness of previous data.
> 

OK, this explains the additional inconsistency I've seen and couldn't
explain myself.

> Rather than changing the TIME_ZONE setting, can we use
> datetime.datetime.utcnow as the default for date in Event in models.py,
> rather than datetime.datetime.now?
> 
> We might need to clean up some other uses - PatchChangeNotification and
> Check look like likely suspects. I'm happy to go hunting for those, I
> just want to know if there's a more fundamental reason that I've missed
> why that wouldn't work.
> 

Sure, I'll fix all the .now occurrences. I don't see a reason why it
shouldn't work either.

Veronika


> 

Re: [PATCH v2] Avoid timezone confusion

2018-02-13 Thread Daniel Axtens
Veronika Kabatova  writes:

> - Original Message -
>> From: "Stephen Finucane" 
>> To: vkaba...@redhat.com, patchwork@lists.ozlabs.org
>> Sent: Wednesday, January 17, 2018 8:22:27 PM
>> Subject: Re: [PATCH v2] Avoid timezone confusion
>> 
>> On Wed, 2018-01-17 at 19:18 +, Stephen Finucane wrote:
>> > On Tue, 2018-01-16 at 18:58 +0100, vkaba...@redhat.com wrote:
>> > > From: Veronika Kabatova 
>> > > 
>> > > Patchwork saves patches, comments etc with UTC timezone and reports
>> > > this time when opening the patch details. However, internally
>> > > generated
>> > > processes such as events are reported with the instance's local
>> > > time.
>> > > There's nothing wrong with that and making PW timezone-aware would
>> > > add
>> > > useless complexity, but in a world-wide collaboration a lot of
>> > > confusion
>> > > may arise as the timezone is not reported at all. Instance's local
>> > > time
>> > > might be very different from the local time of CI integrating with
>> > > PW,
>> > > which is different from the local time of person dealing with it
>> > > etc.
>> > > 
>> > > Use UTC everywhere by default instead of UTC for sumbissions and
>> > > local
>> > > timezone for internally generated events (which is not reported).
>> > > 
>> > > Signed-off-by: Veronika Kabatova 
>> > 
>> > What effect does this have on existing information in the database?
>> > Does this mean they'll all remain UTC+12 or are they stored in UTC
>> > format already? The Django docs [1] would lead me to suggest the
>> > former, given that we don't have USE_TZ set to True.
>> > 
>> > I guess you can test that by setting up a deployment, creating
>> > information, then switching this option over. I'd do this myself but
>> > I'm not going to have time this week :(
>> 
>
> The effect of the change for the existing data is the same as if the admin 
> decided
> to override the default TIME_ZONE setting. In the end that's what this change 
> does,
> move the instance time to UTC.
>
> If the TIME_ZONE setting is overriden (production.py) nothing changes, we are 
> only
> changing the default.
>
>
> To elaborate more, as we use timezone-naive format, the details depend on 
> database
> backend and what underlying type from it Django uses to store datetimes.
>
> For mysql, they are saved as 'datetime' and this type doesn't know timezones 
> and
> returns what you feed it. Django doesn't modify the times either, generated 
> time
> is simply passed as a string. If you change timezone, 'datetime' is still the
> same. So submissions will have the right time, but internally triggered 
> changes
> keep the original localized time and we can't really modify them since we 
> can't
> retrieve the original timezone used.
>
> For postgres it's the exact opposite. Datetimes are stored as 'timestamp with
> time zone' in the DB, if no timezone is specified it uses the current local 
> one.
> So it takes the times from submissions, thinking they are in local time zone 
> and
> not UTC. OTOH, internally generated events are converted correctly exactly
> because of the local timezone and consecutive conversion.
>
> I have not verified what types are used for other backends but I believe they
> are similar to the mysql example.
>
>
>> Might be relevant.
>> 
>>   https://gist.github.com/aaugustin/2971450
>> 
>
> That's something different. We need to unify the timezones used. Making PW
> timezone-aware would not change anything, including the effect on old data
> (exactly because the dates used are not unified, and we have no way to find
> out the offsets the data were originally created with anyways).
>
Wow what a mess :( To make matters worse, Australia/Canberra osciallates
between UTC+10 and UTC+11! So I agree we just completely give up on the
correctness of previous data.

Rather than changing the TIME_ZONE setting, can we use
datetime.datetime.utcnow as the default for date in Event in models.py,
rather than datetime.datetime.now?

We might need to clean up some other uses - PatchChangeNotification and
Check look like likely suspects. I'm happy to go hunting for those, I
just want to know if there's a more fundamental reason that I've missed
why that wouldn't work.

Regards,
Daniel


>
> Hope this explains the situation, just ask if I wasn't clear
>
> Veronika
>
>
>> Stephen
>> 
>> > Stephen
>> > 
>> > [1] https://docs.djangoproject.com/en/2.0/topics/i18n/timezones/
> ___
> Patchwork mailing list
> Patchwork@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/patchwork
___
Patchwork mailing list
Patchwork@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/patchwork


Re: [PATCH] Implement list filtering

2018-02-13 Thread Daniel Axtens
Hi Veronika,

Apologies for the delay in getting back to you, and thanks for your responses.
 
>> Consider a list like netdev. It has at least a couple of userspace
>> projects (iproute2, ethtool) that could conceivably be spun off into
>> their own lists, and which could use subject matches, but most patches
>> are for the linux kernel and do not have usable subject matches.
>> 
>> As I understand it, you would need to have 3 projects:
>>  - for iproute2, subject-match \[[^\]]*iproute[^\]]*\]
>>  - for ethtool, subject-match \[[^\]]*ethtool[^\]]*\]
>> Then how would you match the rest of the list? You could have an empty
>> subject-match, but then if that is loaded first out of the database, it
>> would match first and patches would never make it to the iproute2 or
>> ethtool projects.
>> 
>
> My original plan was that if you need both matching projects and a default
> one on top of that you can just do a negative lookahead.
>
>> I think you need some sort of default or fall-back or last-resort - the
>> name is not important: just that it will match *only* if no other
>> project matches.
>> 
>> Does that sound right?
>> 
>
> Using a default project is doable and makes sense. Would it be acceptable
> if we first checked for subject matches and in case no match was found,
> checked for a project with same list_id and empty subject_match or do you
> have something different in mind?

That sounds like a good strategy. I certainly prefer it to negative
lookahead.

>
>> I have some other minor comments which I have included throughout:
>> 
>> > diff --git a/docs/api.yaml b/docs/api.yaml
>> > index 3e79f0b..3373226 100644
>> > --- a/docs/api.yaml
>> > +++ b/docs/api.yaml
>> > @@ -374,6 +374,9 @@ definitions:
>> >list_id:
>> >  type: string
>> >  description: Mailing list identifier for project.
>> > +  subject_match:
>> > +type: string
>> > +description: Regex used for email filtering.
>> This is good.
>> 
>> > diff --git a/docs/deployment/management.rst
>> > b/docs/deployment/management.rst
>> > index c50b7b6..870d7ee 100644
>> > --- a/docs/deployment/management.rst
>> > +++ b/docs/deployment/management.rst
>> > @@ -63,7 +63,7 @@ due to, for example, an outage.
>> >  .. option:: --list-id 
>> >  
>> > mailing list ID. If not supplied, this will be extracted from the mail
>> > -   headers.
>> > +   headers. Don't use this option if you require filtering based on
>> > subjects.
>> >  
>> >  .. option:: infile
>> 
>> I don't understand why this would interfere with subject
>> filtering. I notice below that you've added a new method that matches by
>> list-id and subject and kept the old list-id matching one - does the
>> command-line option use the old method?
>> 
>> I think I would prefer that there only be one system of mapping mails to
>> projects and for this restriction to go away, but if there's a good
>> reason for it to stay I am always open to persuasion.
>> 
>
> Yes, the command line option uses the old function. The rationale is that
> if there are more projects with same list_id which differ in subject match
> fields, how do you know which one to choose if you check for valid project
> based only on list_id?
>
> If you check the parse_mail() function, it chooses either the
> find_project_by_id() or find_project_by_header() functions based on whether
> list_id option was provided, so I kept this behavior and only changed the
> helper function for find_project_by_header().
>
> It's not hard to change find_project_by_id() to respect subject matches
> too if you want. I just felt that it defeats the purpose of having the
> option to skip header matching at altogether.

Oh I see and I think I understand.

From a conceptual point of view, I see the --list-id option as just
overriding the list id; I don't think it overrides any other headers
(X-Patchwork-* for example).

However, I realised that my motivation is largely that I'd like to take
my existing archives of email, set up a new project with the same
list-id as the default project but a different subject match, and then
use --list-id=patchwork.ozlabs.org to injest my mail archive from a
different list. I realise now that this is just laziness on my part: I
should just change the list ids!

I wonder if anyone else uses the --list-id parameter (especially live)
and how best to serve their needs. I lean towards enabling subject
matching even if it is used, just in case someone feeds a non-list or
multiple lists into the one patchwork project. If anyone has thoughts
here I'd appreciate them.

(If your v2 addresses my other comments then I'll merge it either way,
we can always change it later.)

>
>> > diff --git a/patchwork/api/project.py b/patchwork/api/project.py
>> > index 446c473..597f605 100644
>> > --- a/patchwork/api/project.py
>> > +++ b/patchwork/api/project.py
>> > @@ -39,8 +39,9 @@ class ProjectSerializer(HyperlinkedModelSerializer):
>> >  class Meta:
>> >  model = Project
>> >  

Re: [PATCH 2/2] requirements: Use 'psycopg2-binary' package

2018-02-13 Thread Stephen Finucane
On Fri, 2018-02-09 at 11:52 +, Stephen Finucane wrote:
> This resolves a deprecation warning that's recently been raised:
> 
>   UserWarning: The psycopg2 wheel package will be renamed from release
>   2.8; in order to keep installing from binary please use "pip install
>   psycopg2-binary" instead. For details see:
>   .
> 
> Signed-off-by: Stephen Finucane 

Applied both of these.
___
Patchwork mailing list
Patchwork@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/patchwork