Re: [Tracker] Fwd: tracker 1.11.2

2016-12-09 Thread Philip Van Hoof
On Fri, 2016-12-09 at 00:44 +0100, Carlos Garnacho wrote:

Hello Carlos,

No criticism of the content of the releases or the technical
improvements, nor of the communication about the security bug and the
panic of certain people.

Just some remarks on versioning here.

>   * tracker-extract: Sandbox extractor threads. Filesystem and network
> access are limited to being read and local only.

As a semver fan myself, I wonder why you didn't start 1.12.0 instead of
calling this update 1.11.2?

Sandboxing sounds to me like a new feature. Not just a bugfix.

As Tracker gets used in an ever-increasing number of use cases, i.e. not
only desktops, I think our users want a clear version numbering policy.
The semver rules offer just that, and are compatible with API and ABI
changes and with packaging formats like Debian's and Red Hat's.

>   * tracker-miner-fs: Fixed high CPU use when receiving many writeback
> notifications at once.
>   * tracker-extract, libtracker-sparql, libtracker-miner: plug leaks
>   * tests: cleanups and improvements

Those two are worthy of a 1.11.2, yes. So I would have released a 1.12.0
containing the sandboxing, and a 1.11.2 with these two.

> Translations: hu

1.11.2


Now, I understand that 1.10.1 was in need of an upgrade too, and
incrementing its version number to 1.11.0 would have conflicted with
existing releases under 1.11.x.

What to do in that case, in my opinion, is to release a 1.12.0 coming
from 1.10.1 (instead of 1.10.2). And a 1.13.0 coming from 1.11.1
(instead of a 1.11.2). The release numbering just jumps over the
existing ones.
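
To make the "jump over" numbering concrete, here is a minimal sketch. The
helper name next_free_minor and the set of released minors are made up
for illustration; this is not an existing Tracker release script.

```python
def next_free_minor(base, released):
    """Return the first (major, minor) after `base` that is not taken.

    `released` is a set of (major, minor) pairs that already exist.
    """
    major, minor = base
    candidate = minor + 1
    while (major, candidate) in released:
        candidate += 1
    return (major, candidate)


# Minor series that already exist at this point in the thread.
released = {(1, 10), (1, 11)}

# A feature release on top of 1.10.x skips the taken 1.11 series.
from_1_10 = next_free_minor((1, 10), released)
released.add(from_1_10)

# A feature release on top of 1.11.x then skips the new 1.12 series.
from_1_11 = next_free_minor((1, 11), released)

print(from_1_10, from_1_11)  # (1, 12) (1, 13)
```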

I also think that, given that you are maintaining 3+ simultaneous
releases, a gitflow setup could be useful (develop, master, hotfix/*,
release/* and feature/*).


Kind regards,

Philip



signature.asc
Description: This is a digitally signed message part
___
tracker-list mailing list
tracker-list@gnome.org
https://mail.gnome.org/mailman/listinfo/tracker-list


Re: [Tracker] Fwd: tracker 1.11.2

2016-12-10 Thread Carlos Garnacho
Hi Philip!

On Fri, Dec 9, 2016 at 11:46 PM, Philip Van Hoof wrote:
> On Fri, 2016-12-09 at 00:44 +0100, Carlos Garnacho wrote:
>
> Hello Carlos,
>
> NO criticism on the content of the releases and the technical
> improvements, nor on the communication on the security bug and panic of
> certain people.

:)

>
> Just some remarks on versioning here.
>
>>   * tracker-extract: Sandbox extractor threads. Filesystem and network
>> access are limited to being read and local only.
>
> As a semver fan myself, I wonder why you didn't start 1.12.0 instead of
> calling this update 1.11.2?

Although it follows its own version numbers, Tracker has lately been
following the GNOME schedule. I came to think lately (most notably
when rolling this last bunch of tarballs) that the GNOME schedule is
indeed too fast-paced for Tracker, while we've most often respected
backwards compatibility quite thoroughly. E.g. there are distros
shipping 1.8 because that's what came out with GNOME 3.20, while 1.10
is a qualitatively better drop-in replacement.

I however think going from "odd minor number is unstable" to semver
versioning with minor!=0 is going to be confusing... I will suggest
the following plan:

- Finish this 1.11.x/1.12 cycle
- Stick to 1.12.x for as long as it's needed, while
- gearing up for a Tracker 2.x that switches to a semver approach to
communicate API/ABI changes.

If communicated properly, that'd at least allow us to have a single
last/current stable release to care about most of the time, and the
window to accumulate backwards compatible changes can be made variable
based on urgency (situations like this don't come up often, but you
never know...). I wouldn't mind if this unstable window were still made
to roughly match the 6-month GNOME schedule though, and maybe release
2.[x+1] with whatever might accumulate in that period.

How does that sound to you?

I'll also take the opportunity to introduce to the ML the "roadmap"
that's been shaping up in my head for 2.x:

- Getting as close to supporting the full SPARQL 1.1 spec as possible
in libtracker-data:
  * property paths: last weekend got halfway with this \o/
  * graph management: for DROP GRAPH I think triggers will perform
just fine, CREATE is also easy, for LOAD/MOVE/ADD it looks like we can
unroll into specific updates.
  * the VALUES clause
  * the MINUS filter
  * CONSTRUCT/ASK/DESCRIBE
  * Removing or limiting the extensions we've gained along the way that
are addressed in 1.1 (e.g. accepting "AS var" as good, property paths
greatly reduce the need for our property functions,... other
extensions like FTS must stay of course)

- Double checking ontology migration code, ensure it can handle weird
ontology changes more or less elegantly.

- Library-fying tracker-store, and separating the ontology for good, so
that e.g. an IRC client wanting to store conversation logs privately can
do:

connection_manager = tracker_open (".../.cache/my-app/private-store",
"/usr/share/ontologies/nepomuk/nmo.ontology", cancellable, &error);

And it gets a local database with just the relevant classes from the
specified ontology. This might need a hardcoded basis though,
xsd/rdf/nrl/dc/tracker at least. As long as that is properly
documented I'm fine with it.

- And of course, keep dbus-based implementations around, I guess we
can't move too far from nepomuk there, as it's already the implicit
contract between miners and all the surrounding ecosystem.

However, it would be great to have the tracker-store executable be more
generic, so you can make it claim a different dbus name, write to a
different location, construct the database using a different
ontology...

Does this sound sensible? Does it match reasonably well with your own
"Tracker as generic SPARQL endpoint" ideas?


Re: [Tracker] Fwd: tracker 1.11.2

2016-12-10 Thread Sam Thursfield
Hi!

On Sat, Dec 10, 2016 at 3:36 PM, Carlos Garnacho wrote:
> Although following its own version numbers, Tracker has been following
> lately the gnome schedule. I came to think lately (most nominally,
> when rolling this last bunch of tarballs), that the gnome schedule is
> indeed too fast paced for Tracker, while we've most often respected
> backwards compatibility quite thoroughly. eg. there's distros shipping
> 1.8 because that's what came out with gnome 3.20, while 1.10 is a
> qualitatively better drop-in replacement.
>
> I however think going from "odd minor number is unstable" to semver
> versioning with minor!=0 is going to be confusing... I will suggest
> the following plan:
>
> - Finish this 1.11.x/1.12 cycle
> - Sticking to 1.12.x for as long it's needed while
> - Gearing to a Tracker 2.x that switches to using semver approach to
> communicate API/ABI changes.

I like this plan.

> I'll also take the opportunity to introduce to the ML the "roadmap"
> that's been shaping up in my head for 2.x:
>
> - Getting as close to supporting the full sparql 1.1 spec as possible
> in libtracker-data:
> ...

Agreed

> - Double checking ontology migration code, ensure it can handle weird
> ontology changes more or less elegantly.

Sounds good, but is this actually possible? I thought we found that it
was too hard to really do this well, and it'll be quite a bit of
effort.

> - Library-fying tracker-store, and separating ontology for good 

Yes!

> - And of course, keep dbus-based implementations around, I guess we
> can't move too far from nepomuk there, as it's already the implicit
> contract between miners and all the surrounding ecosystem.
>
> However would be great to have the tracker-store executable be more
> generic, so you can make it claim a different dbus name, write to a
> different location, construct the database using a different
> ontology...

Yes.

> PS. I haven't forgotten the "big rip" thread, nor the "Resource table
> fills up with UUIDs" from further in the past, need to get back to
> those...

One other thing to throw out here, since we're on the subject of a
roadmap: I don't have a strong opinion on if/when this is adopted, but
I've been working on new build instructions using Meson:
. They're
maybe 75% complete at this point.

Sam


Re: [Tracker] Fwd: tracker 1.11.2

2016-12-10 Thread Philip Van Hoof
On Sat, 2016-12-10 at 16:36 +0100, Carlos Garnacho wrote:


[cut]

> I however think going from "odd minor number is unstable" to semver
> versioning with minor!=0 is going to be confusing...

Yes, changing it in the middle of a major release number is perhaps not
the brightest idea. True.

>  I will suggest the following plan:
> 
> - Finish this 1.11.x/1.12 cycle
> - Sticking to 1.12.x for as long it's needed while
> - Gearing to a Tracker 2.x that switches to using semver approach to
> communicate API/ABI changes.

This makes a lot of sense to me.

> If communicated properly, that'd at least allow us to have a single
> last/current stable release to care about most often, and the window
> to accumulate backwards compatible changes can be made variable based
> on urgency (situations like this don't come up often but you never
> know...),

Well, yes. Usually security bugfixes are just patch increments. But in
this case you solved the situation by adding sandboxing as a feature.

And then semver states that 'you added or changed functionality and/or
APIs but didn't break backwards API compatibility' = minor increment.
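
As a rough illustration of that rule, here is a small sketch. The bump
helper and its "breaking" / "feature" / "fix" labels are invented for
this example and are not part of Tracker or of any semver tool.

```python
def bump(version, change):
    """Return the next version string for a given kind of change.

    `change` is one of the made-up labels "breaking", "feature" or
    "fix"; the mapping follows the semver major/minor/patch rules.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":   # incompatible API change
        return f"{major + 1}.0.0"
    if change == "feature":    # new, backwards compatible functionality
        return f"{major}.{minor + 1}.0"
    if change == "fix":        # backwards compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


# Sandboxing counted as a feature gives a minor increment,
# while the leak fixes alone would only be a patch increment.
print(bump("1.11.1", "feature"))  # 1.12.0
print(bump("1.11.1", "fix"))      # 1.11.2
```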

> I wouldn't mind that this unstable window is still made to
> roughly match the 6 months gnome schedule though, and maybe release
> 2.[x+1] with whatever might accumulate in that period.

Sure. I would however not withhold from doing interim releases, and
assign one of those to the 6 month gnome release.

Not being a fan of monorepos or monolithic architectures, I also don't
think that having a cadence dictated by something like gnome is
necessarily a good idea. But that clearly is just an opinion.

I think every project should be independent.

> How does that sound to you?

But yes, makes sense.

> I'll also take the opportunity to introduce to the ML the "roadmap"
> that's been shaping up in my head for 2.x:
> 
> - Getting as close to supporting the full sparql 1.1 spec as possible
> in libtracker-data:
>   * property paths: last weekend got halfway with this \o/
>   * graph management: for DROP GRAPH I think triggers will perform

Did anything ever happen with cleaning up anonymous nodes of deleted
subjects/contexts, and/or doing reference counting on them (and clearing
them once they reach zero references)?

If not then we are still leaking those in the db afaik. We always wanted
to do something about that.

> just fine, CREATE is also easy, for LOAD/MOVE/ADD it looks like we can
> unroll into specific updates.
>   * the VALUES clause
>   * the MINUS filter
>   * CONSTRUCT/ASK/DESCRIBE
>   * Removing or limiting the extensions we've gained on the way and
> are addressed in 1.1 (eg. accepting "AS var" as good, property paths
> greatly reduce the need for our property functions,... other
> extensions like FTS must stay of course)
> 
> - Double checking ontology migration code, ensure it can handle weird
> ontology changes more or less elegantly.

You will have a lot, a lot of fun with that code :-)


> - Library-fying tracker-store, and separating ontology for good, so
> eg. an irc client wanting to store conversation logs privately can eg.
> do:

Yes! :) Want!

> connection_manager = tracker_open (".../.cache/my-app/private-store",
> "/usr/share/ontologies/nepomuk/nmo.ontology", cancellable, &error);
> 
> And it gets a local database with just the relevant classes from the
> specified ontology. This might need a hardcoded basis though,
> xsd/rdf/nrl/dc/tracker at least. As long as that is properly
> documented I'm fine with it.

nod

> - And of course, keep dbus-based implementations around, I guess we
> can't move too far from nepomuk there, as it's already the implicit
> contract between miners and all the surrounding ecosystem.

nod

> However would be great to have the tracker-store executable be more
> generic, so you can make it claim a different dbus name, write to a
> different location, construct the database using a different
> ontology...

> Does this sound sensible? matches reasonably with your own "Tracker as
> generic SPARQL endpoint" ideas?

Totally.


Philip




Re: [Tracker] Fwd: tracker 1.11.2

2016-12-10 Thread Philip Van Hoof
On Sat, 2016-12-10 at 15:52 +, Sam Thursfield wrote:

> > - Double checking ontology migration code, ensure it can handle weird
> > ontology changes more or less elegantly.
> 
> sounds good, but is this actually possible? I thought we found that it
> was too hard to really do this well, and it'll be quite a bit of
> effort

What can never be supported is, for example, converting a multi-value
property or predicate into a single-value one without loss of data.

The same goes for conversions like changing an xsd:string into an
xsd:integer, where zero loss of data can't be guaranteed, of course.

But other than that, quite a lot is possible. The difficulty is
detecting the differences and knowing what to do. Doing it (the
conversion itself) is rather simple: just create new tables, copy old
to new converting data in the process, delete old, rename new.
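
The four steps can be sketched with plain SQLite as follows. This is
only an illustration using Python's sqlite3 module: the song table and
its duration column are made up, and this is not the actual Tracker
migration code or schema.

```python
import sqlite3

# isolation_level=None: transactions are managed by hand with BEGIN/COMMIT.
con = sqlite3.connect(":memory:", isolation_level=None)

# Old layout: the (made-up) duration property is stored as text.
con.execute("CREATE TABLE song (id INTEGER PRIMARY KEY, duration TEXT)")
con.executemany("INSERT INTO song VALUES (?, ?)", [(1, "215"), (2, "187")])

con.execute("BEGIN")
try:
    # 1. create the new table with the new column type
    con.execute(
        "CREATE TABLE song_new (id INTEGER PRIMARY KEY, duration INTEGER)")
    # 2. copy old to new, converting data in the process
    con.execute(
        "INSERT INTO song_new SELECT id, CAST(duration AS INTEGER) FROM song")
    # 3. delete old
    con.execute("DROP TABLE song")
    # 4. rename new
    con.execute("ALTER TABLE song_new RENAME TO song")
    con.execute("COMMIT")
except sqlite3.Error:
    # Any failure leaves the old table untouched.
    con.execute("ROLLBACK")
    raise

rows = con.execute(
    "SELECT id, duration, typeof(duration) FROM song ORDER BY id").fetchall()
print(rows)  # [(1, 215, 'integer'), (2, 187, 'integer')]
```

Running the steps inside one transaction means a failed conversion
rolls back to the old table instead of leaving a half-migrated database.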

The current code also has some pre and post handling here and there.

All the different steps that are involved are what make it 'fun' to
understand how it works ;-). You'll soon hate me when you have to start
working on it. heh. (I hated myself often for not refactoring this into
shape earlier on - it's code that, you know, grew Darwin style. It's a
bit like a muddy pool with lots of penguin, troll and frog poop).

So please. Whoever starts working on that: don't be ashamed to throw
away and redo it completely. I'll support you completely.

Luckily this code rarely needs to run ...


But when tracker-store is more of a library, I expect more users to
flock to it and actually use it to store their metadata in. And then
this becomes increasingly important to support. Right now, upon a new
release, device makers just throw the meta.db file away and let the
miners run again to fill it up.

The tracker-control binary even has command line support for just that.

And I need to do it to my Jolla sometimes. Because that battery often
comes loose. And then no fsync. And then meta.db becomes funny.

What have we done ...

> > - Library-fying tracker-store, and separating ontology for good 
> 
> Yes!
> 
> > - And of course, keep dbus-based implementations around, I guess we
> > can't move too far from nepomuk there, as it's already the implicit
> > contract between miners and all the surrounding ecosystem.
> >
> > However would be great to have the tracker-store executable be more
> > generic, so you can make it claim a different dbus name, write to a
> > different location, construct the database using a different
> > ontology...
> 
> Yes.
> 
> > PS. I haven't forgotten the "big rip" thread, nor the "Resource table
> > fills up with UUIDs" from further in the past, need to get back to
> > those...
> 
> One other thing to throw out here since we're on the subject of a
> roadmap, I don't have strong opinion on if/when this is adopted but
> I've been working on new build instructions using Meson:
> . They're
> maybe 75% complete at this point.
> 
> Sam





Re: [Tracker] Fwd: tracker 1.11.2

2016-12-11 Thread Carlos Garnacho
Hey Philip,

On Sat, Dec 10, 2016 at 4:57 PM, Philip Van Hoof wrote:
> On Sat, 2016-12-10 at 16:36 +0100, Carlos Garnacho wrote:
>
>
> [cut]
>
>> I however think going from "odd minor number is unstable" to semver
>> versioning with minor!=0 is going to be confusing...
>
> Yes, changing it in the middle of a major release number is perhaps not
> the brightest idea. True.
>
>>  I will suggest the following plan:
>>
>> - Finish this 1.11.x/1.12 cycle
>> - Sticking to 1.12.x for as long it's needed while
>> - Gearing to a Tracker 2.x that switches to using semver approach to
>> communicate API/ABI changes.
>
> This makes a lot of sense to me.

Cool :), seeing that you/Sam agree, sounds like a plan.

>
>> If communicated properly, that'd at least allow us to have a single
>> last/current stable release to care about most often, and the window
>> to accumulate backwards compatible changes can be made variable based
>> on urgency (situations like this don't come up often but you never
>> know...),
>
> Well, yes. Usually are security bugfixes just patch increments. But in
> this case you solved the situation by adding sandboxing as a feature.
>
> And then semver states that 'you added or changed functionality and or
> APIs but didn't break backwards API compatibility' = minor increment.

Right, this would indeed be a reason for breaking the schedule and
releasing, or data loss situations, etc. I however hope those don't
appear often :).

>
>> I wouldn't mind that this unstable window is still made to
>> roughly match the 6 months gnome schedule though, and maybe release
>> 2.[x+1] with whatever might accumulate in that period.
>
> Sure. I would however not withhold from doing interim releases, and
> assign one of those to the 6 month gnome release.
>
> Not being a monorepo or monolithic architecture liker, I also don't
> think that having a cadans dictated by something like gnome is
> necessarily a good idea. But that clearly is just an opinion.
>
> I think every project should be independent.

It is mostly down to convenience. I remember, e.g., when gtk+ had these
9 to 12 month cycles (in the 2.x days), which was quite messy for
applications: some new features would go virtually unused until the
next gnome cycle, and by the time they actually were used (and bugs
were reported) it was already close to the gtk+ freezes. This lack of
coordination ended up being unpleasant on both sides.

I get your point, Tracker is not a project used only by gnome, but
gnome is the community it has always worked closest with, and it's also
a community that has "invested" a lot in it (as in making Tracker a
pillar for several core apps). Its 6-month schedule might not fit
everyone, but it's IMHO reliable and short enough.

>
>> How does that sound to you?
>
> But yes, makes sense.
>
>> I'll also take the opportunity to introduce to the ML the "roadmap"
>> that's been shaping up in my head for 2.x:
>>
>> - Getting as close to supporting the full sparql 1.1 spec as possible
>> in libtracker-data:
>>   * property paths: last weekend got halfway with this \o/
>>   * graph management: for DROP GRAPH I think triggers will perform
>
> Did ever something happen to cleaning up anonymous nodes of deleted
> subjects/context, and or do reference counting on them (and clear them
> once they reach zero references)?
>
> If not then we are still leaking those in the db afaik. We always wanted
> to do something about that.

Yeah, we still do leak those... I remember you/Jürg suggested setting
a foreign key with an ON DELETE RESTRICT action from the various tables'
ID columns to the Resource table, so that cleaning up the Resource table
would fail for the still-referenced nodes. I got as far as seeing
that:

- Performance would be just fine for the common ops: the trigger would
only run when trying to delete the parent key in the Resource table,
which is the once-in-a-while operation, while modifications on the
columns setting the foreign key would be just as fast as they are now.

- On its own, however, it wouldn't fix the other issue I saw happening
before the revert (graph URNs being deleted). I had a patch around that
added a Graph table, so IDs in the Resource table were ensured to be in
either rdfs:Resource or the Graph table. That already helped with
identifying and not deleting the graph URNs during garbage collection,
and seems useful for graph management, but I think I can't just add
the same RESTRICT action, as CLEAR/DROP GRAPH will want pretty much the
opposite.

- I also wondered if it's more desirable, or allowed by the SPARQL
spec, that we actually garbage collect the inconsistent nodes. IMO not
leaving this type of data coherence up to miners/apps being educated
when deleting would be a win, but I've only seen mentions in the
SPARQL spec about impls being free to drop empty graphs, nothing about
triples with no longer bound elements.
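
For reference, the ON DELETE RESTRICT idea can be tried out in
isolation with the sketch below. It uses Python's sqlite3 module, and
the Resource/Property layout is a simplified, made-up stand-in, not
Tracker's real schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
con.execute("CREATE TABLE Resource (ID INTEGER PRIMARY KEY, Uri TEXT)")
con.execute(
    "CREATE TABLE Property ("
    " ID INTEGER REFERENCES Resource(ID) ON DELETE RESTRICT,"
    " Val TEXT)"
)
con.execute("INSERT INTO Resource VALUES (1, 'urn:still-referenced')")
con.execute("INSERT INTO Resource VALUES (2, 'urn:orphan')")
con.execute("INSERT INTO Property VALUES (1, 'some value')")


def collect_garbage(con):
    """Try to delete every Resource row; referenced rows survive."""
    deleted = []
    ids = [row[0] for row in con.execute("SELECT ID FROM Resource ORDER BY ID")]
    for rid in ids:
        try:
            con.execute("DELETE FROM Resource WHERE ID = ?", (rid,))
            deleted.append(rid)
        except sqlite3.IntegrityError:
            pass  # RESTRICT fired: another table still references this ID
    return deleted


deleted = collect_garbage(con)
remaining = [row[0] for row in con.execute("SELECT ID FROM Resource ORDER BY ID")]
print(deleted, remaining)  # [2] [1]
```

Only the unreferenced row is removed; the delete of the referenced row
fails and leaves it in place.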

>
>> just fine, CREATE is also easy, for LOAD/MOVE/ADD it looks like we can
>> unroll into specific updates.
>>   * the VALUES clause
>>   * the MINUS filter
>

Re: [Tracker] Fwd: tracker 1.11.2

2016-12-11 Thread Philip Van Hoof
On Sun, 2016-12-11 at 17:07 +0100, Carlos Garnacho wrote:

[cut]

> >> - Getting as close to supporting the full sparql 1.1 spec as possible
> >> in libtracker-data:
> >>   * property paths: last weekend got halfway with this \o/
> >>   * graph management: for DROP GRAPH I think triggers will perform
> >
> > Did ever something happen to cleaning up anonymous nodes of deleted
> > subjects/context, and or do reference counting on them (and clear them
> > once they reach zero references)?
> >
> > If not then we are still leaking those in the db afaik. We always wanted
> > to do something about that.
> 
> Yeah, we still do leak those... I remember you/Jürg suggested setting
> a foreign key with ON DELETE RESTRICT action from the various tables'
> ID rows to the Resource table, so the cleaning up the Resource table
> would fail for the still referenced nodes. I got as far as seeing
> that:

> - Performance would be just fine for the common ops, the trigger would
> only run when trying to delete the parent key in the Resource table,
> which is the once-in-a-while operation, modifications on the cols
> setting the foreign key would be just as fast as they're now.

nod

> - It however wouldn't fix alone the other issue I saw happening before
> the revert (graph URNs being deleted). I had a patch around that added
> a Graph table, so IDs in the Resource table were ensured to be in
> either rdfs:Resource or the Graph table. That already helped with
> identifying and not deleting the graph URNs during garbage collection,
> and seems useful for graph management, but I think I can't just add
> the same RESTRICT action as CLEAR/DROP GRAPH will want pretty much the
> opposite.

Personally, I think some sort of reference counting will be needed for
anonymous nodes referenced by different graphs.
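
A toy sketch of what such reference counting could look like follows.
The NodeStore class and its in-memory layout are purely hypothetical
and only meant to show the bookkeeping, not how libtracker-data stores
triples.

```python
from collections import Counter


class NodeStore:
    """Toy store that reports anonymous nodes once nothing references them."""

    def __init__(self):
        self.refs = Counter()   # node id -> number of referencing triples
        self.triples = set()    # (graph, subject, predicate, node)

    def add(self, graph, subject, predicate, node):
        triple = (graph, subject, predicate, node)
        if triple not in self.triples:
            self.triples.add(triple)
            self.refs[node] += 1

    def drop_graph(self, graph):
        """Remove all triples of a graph; return nodes that became orphans."""
        orphans = []
        for triple in [t for t in self.triples if t[0] == graph]:
            self.triples.remove(triple)
            node = triple[3]
            self.refs[node] -= 1
            if self.refs[node] == 0:
                del self.refs[node]
                orphans.append(node)
        return orphans


store = NodeStore()
store.add("g1", "urn:a", "p", "_:b1")
store.add("g2", "urn:c", "p", "_:b1")   # _:b1 is shared by two graphs
store.add("g1", "urn:a", "q", "_:b2")

first = store.drop_graph("g1")    # _:b1 is still referenced from g2
second = store.drop_graph("g2")   # now _:b1 has no references left
print(first, second)  # ['_:b2'] ['_:b1']
```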

> - I also wondered if it's more desirable, or allowed by the sparql
> spec, that we actually garbage collect the inconsistent nodes. IMO not
> leaving this type of data coherence up to miners/apps being educated
> when deleting would be a win, but I've only seen mentions in the
> sparql spec about impls being free to drop empty graphs, nothing about
> triples with no longer bound elements.

I also don't think it's a problem to rid ourselves of orphaned anonymous
nodes.

Without a graph to be owned by, they can't be referenced other than by
their uuid anyway.

[cut]

> >> - Double checking ontology migration code, ensure it can handle weird
> >> ontology changes more or less elegantly.
> >
> > You will have a lot, a lot of fun with that code :-)
> 
> Already visited it briefly :P.

:-)

> Kinda replying here to your other email, there will be indeed
> situations of precision or data loss, as long as 1) what is supported
> and what is not is properly documented, 2) we do our best to ensure
> the resulting database represents the current ontology or 3) error out
> and rollback the ongoing changes, I think should be fine.

I guess we could write the old ontology in TTL format alongside the
unconverted data in a TTL file, to allow the user to process it manually
later.

That would allow a distribution, device maker and/or application
developer to provide data conversion tooling.

> I think this will be mainly useful for apps using private
> databases/ontologies, if they are in control of both the ontology
> and the data, they can also look for ways to preserve or re-extract
> what's interesting while changing their ontology.

Yes, indeed.

> >
> >
> >> - Library-fying tracker-store, and separating ontology for good, so
> >> eg. an irc client wanting to store conversation logs privately can eg.
> >> do:
> >
> > Yes! :) Want!
> 
> Cool :), I think this will be a win for apps considering their data
> precious, not all data is equally disposable. I already feel shivers
> each time I have to recommend tracker reset -r ...


Exactly.


Kind regards,

Philip


