Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Olivier Goffart

On 23.01.19 23:15, André Pönitz wrote:
> On Wed, Jan 23, 2019 at 05:40:33PM +0300, Konstantin Tokarev wrote:
> > 23.01.2019, 16:55, "Edward Welbourne" :
> > > All of this discussion ignores a major elephant: QString's indexing is
> > > by 16-bit UTF-16 tokens, not by Unicode characters. We've had Unicode
> > > for a couple of decades now.
> > >
> > > We *should* have a string type (I don't care what you call it) that acts
> > > on strings indexed by Unicode characters, not in terms of a
> > > representation. Whether that string type internally uses UTF-16 or
> > > UTF-8 should be invisible to its user. Ideally it would be capable of
> > > carrying its data internally in either form (so as to avoid needless
> > > conversion when both producer and consumer use the same form) and of
> > > converting between the two (e.g. so as to append efficiently) as needed.
> >
> > I think this is excessive. Most common operations with strings in application
> > code are:
> >
> > * Pass the string around or compare as an opaque token
> > * Draw the string on screen e.g. with QPainter (while technically it
> >   falls in the previous category, I think it's important enough to
> >   deserve separate item)
> > * Find substring or pattern (regex) inside the string
> > * Split the string by character, pattern, or index boundaries found by means
> >   of previous item
> >
> > I think the only common cases when dealing with Unicode grapheme clusters
> > is required are
> >
> > * Handling of text cursor movement
> > * Implementation of text shaping, i.e. what Harfbuzz is doing
> >
> > I think having special iterator would be quite enough for cursor case. Such
> > iterator could abstract away underlying encoding, instead of forcing everyone
> > to convert to UTF-16 first.
>
> All of that is scarily close to my opinion on the topic.


Same here. I think Konstantin is spot on.

Another example of good string design, I think, is Rust's String. Their
string is encoded in valid UTF-8, indexed by bytes, and splitting the string in
the middle of a code point is a programmer error.


As already mentioned before, UTF-16 would be quite a bad choice if it weren't
for legacy.


The argument that developers wrongly using indexes causes more problems with
UTF-8 than with UTF-16 ("it would happen for a lot more characters") actually
means that developers will see and fix their bugs quickly.
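
To put numbers on this, here is a minimal sketch that counts how many storage
units a few sample characters need in each encoding. It is plain Python with
no Qt involved; the helper names utf8_units and utf16_units are made up for
this illustration.

```python
# Count storage units per character in UTF-8 and UTF-16.
# Illustrative only: plain Python, no Qt API is used or implied.

def utf8_units(text: str) -> int:
    """Number of 8-bit code units (bytes) in the UTF-8 encoding."""
    return len(text.encode("utf-8"))


def utf16_units(text: str) -> int:
    """Number of 16-bit code units in the UTF-16 encoding (no BOM)."""
    return len(text.encode("utf-16-le")) // 2


samples = {
    "ASCII letter": "a",
    "Latin-1 letter": "\u00e9",        # e with acute accent
    "CJK ideograph": "\u4e2d",
    "emoji (non-BMP)": "\U0001F600",
}

for label, ch in samples.items():
    print(f"{label:16} utf8={utf8_units(ch)} utf16={utf16_units(ch)}")
```

Code that assumes one unit per character is wrong for every non-ASCII
character under UTF-8, but only for characters outside the BMP under UTF-16,
so with UTF-8 the mistake is far more likely to show up in ordinary testing.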


I understand changing QString to UTF-8 is a difficult task if we want to do it 
in a compatible way. However, I think there is a way:

In Qt 5.x:
 - Introduce some iterator that iterates over Unicode code points.
 - Deprecate utf16() and other APIs that assume that QString is UTF-16.
 - Replace them with a toUtf16 which returns a QVector. I believe that
   it is possible to make the content implicitly shared with the QString,
   avoiding copies (since it is just a QTypedArrayData internally).


Then in Qt6 one can simply change the representation without breaking 
compatibility with non-deprecated functions.
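
As a rough sketch of what a code point iterator has to do on top of UTF-16
storage (plain Python, not a proposed Qt API; the function names are made up
and the handling of unpaired surrogates is an assumption of this sketch):

```python
from typing import Iterable, Iterator, List


def utf16_code_units(text: str) -> List[int]:
    """Return the 16-bit code units of text, as a UTF-16 string stores them."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def iter_code_points(units: Iterable[int]) -> Iterator[int]:
    """Yield Unicode code points from a sequence of UTF-16 code units.

    Assumption of this sketch: unpaired surrogates are yielded unchanged
    rather than replaced or reported as errors.
    """
    pending_high = None
    for unit in units:
        if pending_high is not None:
            if 0xDC00 <= unit <= 0xDFFF:
                # A valid surrogate pair combines into one code point.
                yield 0x10000 + ((pending_high - 0xD800) << 10) + (unit - 0xDC00)
                pending_high = None
                continue
            yield pending_high  # lone high surrogate
            pending_high = None
        if 0xD800 <= unit <= 0xDBFF:
            pending_high = unit
        else:
            yield unit
    if pending_high is not None:
        yield pending_high
```

Callers that only use such an iterator never see whether the storage holds
8-bit or 16-bit units, which is what would allow the representation to change
later.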


--
Olivier

Woboq - Qt services and support - https://woboq.com - https://code.woboq.org




___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] Proposal: New branch model

2019-01-23 Thread André Pönitz
On Wed, Jan 23, 2019 at 06:36:43PM +, Alex Blasche wrote:
> At the end of the day each cherry-pick is a merge too and they can
> conflict too. The conflict resolution process is still the same. if
> everything is conflict free then a git merge would be no more
> difficult than a cherry-pick.

Correct. But with a cherry-pick based model it can be decided
case by case on how far "merging" backwards makes sense, based
on costs and benefits for each individual hop.

I think this has the potential to ease the overall pain compared
to having to decide on a target branch without knowing the cost
of the then-obligatory merge forward.

Andre'


Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread André Pönitz
On Wed, Jan 23, 2019 at 05:40:33PM +0300, Konstantin Tokarev wrote:
> 23.01.2019, 16:55, "Edward Welbourne" :
> > All of this discussion ignores a major elephant: QString's indexing is
> > by 16-bit UTF-16 tokens, not by Unicode characters. We've had Unicode
> > for a couple of decades now.
> >
> > We *should* have a string type (I don't care what you call it) that acts
> > on strings indexed by Unicode characters, not in terms of a
> > representation. Whether that string type internally uses UTF-16 or
> > UTF-8 should be invisible to its user. Ideally it would be capable of
> > carrying its data internally in either form (so as to avoid needless
> > conversion when both producer and consumer use the same form) and of
> > converting between the two (e.g. so as to append efficiently) as needed.
> 
> I think this is excessive. Most common operations with strings in application
> code are:
>
> * Pass the string around or compare as an opaque token
> * Draw the string on screen e.g. with QPainter (while technically it
>   falls in the previous category, I think it's important enough to
>   deserve separate item)
> * Find substring or pattern (regex) inside the string
> * Split the string by character, pattern, or index boundaries found by means
>   of previous item
> 
> I think the only common cases when dealing with Unicode grapheme clusters
> is required are
>
> * Handling of text cursor movement
> * Implementation of text shaping, i.e. what Harfbuzz is doing
> 
> I think having special iterator would be quite enough for cursor case. Such
> iterator could abstract away underlying encoding, instead of forcing everyone
> to convert to UTF-16 first.

All of that is scarily close to my opinion on the topic.

Andre'


Re: [Development] Proposal: New branch model

2019-01-23 Thread Allan Sandfeld Jensen
On Wednesday, 23 January 2019 19:43:06 CET Konstantin Tokarev wrote:
> 23.01.2019, 21:38, "Alex Blasche" :
> >> 
> >> From: Martin Smith
> >> If you make all patches in dev and then cherrypick them back to earlier
> >> versions that need them, why would you ever do a merge?
> >
> > At the end of the day each cherry-pick is a merge too and they can
> > conflict too. The conflict resolution process is still the same. if
> > everything is conflict free then a git merge would be no more difficult
> > than a cherry-pick.
> And when conflicts are present, cherry-picking N patches may result in N
> times more work than merge in worst case (and same amount of work in the
> best case)

More than that. Once you have been cherry-pick only for a while, git will be
unable to find useful common ancestors for the changes and unable to do smart
three-way merging of cherry-picks, increasing the number of conflicts that
need to be resolved manually while decreasing the useful information git can
give you (no more useful three-way diff).

'Allan






Re: [Development] Proposal: New branch model

2019-01-23 Thread Allan Sandfeld Jensen
On Wednesday, 23 January 2019 16:51:10 CET Jedrzej Nowacki wrote:
>   Proposal in short: let's use cherry-pick mode everywhere.
> 
>   All(**) changes would go to dev. From which they would be automatically 
> cherry-picked by a bot to other branches. The decision to which branch
> cherry-pick, would be taken based on a tag in the commit message. We
> could add a footer that marks the change risk level as in quip-5
> (http://quips-qt-io.herokuapp.com/quip-0005.html), so for example "dev",
> "stable", "LTS". By default everything would be cherry-picked to qt6 branch
> unless "no-future" tag would be given. Of course we can bike-shed about the
> tag names.
> 
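
For reference, the footer-driven selection described in the quoted proposal
could be sketched roughly as below. Everything specific here is an assumption
made for illustration: the footer key "Risk-Level:", the mapping from level to
release branches, and the function name are not taken from the proposal, which
only mentions the levels "dev", "stable" and "LTS", the qt6 default, and a
"no-future" tag.

```python
import re
from typing import List

# Hypothetical mapping from risk level to the release branches that would
# receive the cherry-pick; the real mapping was left open in the proposal.
LEVEL_TO_BRANCHES = {
    "dev": [],
    "stable": ["5.13"],
    "LTS": ["5.13", "5.12"],
}

# Hypothetical footer line, e.g. "Risk-Level: LTS" or "Risk-Level: stable no-future".
FOOTER_RE = re.compile(r"^Risk-Level:[ \t]*(.*)$", re.MULTILINE)


def target_branches(commit_message: str) -> List[str]:
    """Return the branches a bot would cherry-pick this commit to."""
    match = FOOTER_RE.search(commit_message)
    tokens = match.group(1).split() if match else []
    # Without a recognised level the change stays at the lowest level, "dev".
    level = next((t for t in tokens if t in LEVEL_TO_BRANCHES), "dev")
    targets = list(LEVEL_TO_BRANCHES[level])
    # By default everything also goes to qt6 unless tagged "no-future".
    if "no-future" not in tokens:
        targets.insert(0, "qt6")
    return targets
```

With this sketch, a message carrying "Risk-Level: LTS" yields qt6, 5.13 and
5.12, while a message with no footer yields only qt6.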
I don't see any advantage to this whatsoever. The same amount of work and
refactoring needs to be done; all you have done is make development more prone
to human error, make fixes less likely to reach their intended target, and make
getting point releases out on time harder, as they need to go through more
steps before they have all their patches in.

'Allan




Re: [Development] Proposal: New branch model

2019-01-23 Thread Allan Sandfeld Jensen
On Wednesday, 23 January 2019 21:42:35 CET Edward Welbourne wrote:
> Jedrzej Nowacki wrote:
> >>  Advantages:
> >>  - no waiting for merges, a fix can be used right away after
> >>    integration
> >>  - faster propagation of fixes targeting all branches, as there are
> >>    no merges of merges
> 
> Alex Blasche (23 January 2019 18:09)
> 
> > This is pure speculation because you potentially triple (or worse) the
> > amount of individual patches requiring a merge in gerrit when you
> > consider that you want to at least merge to 5.9, 5.12, dev and qt6. I
> > don't see this prediction come true.
> 
> Well, as soon as it hits dev, the patch is cherry-picked to every branch
> that its footer says it belongs in.  Those branches where all goes well
> see it one integration later.  Each branch that has a conflict needs
> that resolved before we get to that second integration.  Contrast this
> with a 5.12 -> 5.13 -> dev chain of merges, where dev doesn't get the
> change that landed in 5.12 (even if that change could land in dev
> without conflict) until
>  * there's been some delay between their change being accepted in 5.12
>    and the next merge-bot run
>  * everyone who made change to 5.12 that conflicted on merging to 5.13
>    has advised Liang on how to resolve their conflicts
>  * we've got the result through integration into 5.13
>  * everyone who's made changes to 5.13 or (as possibly just amended in
>    merging) 5.12 that conflicts with anything in dev has again advised
>    how to resolve their conflicts
>  * and we've got the result through a second integration, into dev.
> 
> When nothing but the change being considered has a conflict along
> the way, that's twice as long; and any change to an upstream branch,
> that does have a conflict, introduces delay for all the other changes
> that landed in that branch, even if they don't have conflicts.  In the
> middle of summer, when lots of folk are away on holiday, getting help
> with resolving conflicts isn't easy - the folk who know those commits
> won't be back for a month - and all changes, no matter how urgent, get
> stuck behind any conflict we can't work out how to resolve.
> 
> So no, Jedrzej's claim is not *pure* speculation; it's at least quite a
> lot adulterated with reasons to believe that many changes would, under
> his scheme, propagate to all branches they're destined for sooner than
> happens with our present scheme.
> 
No, it is speculation, and it is optimizing the least important case: bug-fixes
in dev. Dev is the branch that can wait the longest to get a bug-fix; the
stable branch is the one that needs it most urgently. And fixing a bug in
5.12 now means you first fix it where you need it (5.12), then rewrite it for
dev, then resolve the inevitable conflict back so it can be merged, all while
waiting for bots and release teams to stumble into the issues and delaying the
next 5.12.x release.

'Allan




Re: [Development] Proposal: New branch model

2019-01-23 Thread Edward Welbourne
On Wednesday, 23 January 2019 11:01:53 PST Volker Hilsheimer wrote:
>> I think that’s fine. What’s much worse is having a fix in 5.12 and not
>> knowing how to deal with the merge conflicts into dev. That means dev might
>> regress, unless whoever authored the change is willing to spend time on
>> making it work. In the end, if contributors can’t own their changes for all
>> various branches of Qt, then I much prefer for them to own the changes at
>> least for dev. And with Qt 6, this will become a much bigger problem.

Thiago Macieira (23 January 2019 20:10)
> The problem is I can turn this around and say that we introduce regressions
> into the older branches due to an improper cherry-pick that didn't conflict.

and *that* is a concern that does bother me.
Of course, it's got to pass integration, as well as not conflict,
but that doesn't guarantee it hasn't broken something.

Eddy.


Re: [Development] Proposal: New branch model

2019-01-23 Thread Edward Welbourne
Jedrzej Nowacki wrote:
>>  Advantages:
>>  - no waiting for merges, a fix can be used right away after
>>integration
>>  - faster propagation of fixes targeting all branches, as there are
>>no merges of merges

Alex Blasche (23 January 2019 18:09)
> This is pure speculation because you potentially triple (or worse) the
> amount of individual patches requiring a merge in gerrit when you
> consider that you want to at least merge to 5.9, 5.12, dev and qt6. I
> don't see this prediction come true.

Well, as soon as it hits dev, the patch is cherry-picked to every branch
that its footer says it belongs in.  Those branches where all goes well
see it one integration later.  Each branch that has a conflict needs
that resolved before we get to that second integration.  Contrast this
with a 5.12 -> 5.13 -> dev chain of merges, where dev doesn't get the
change that landed in 5.12 (even if that change could land in dev
without conflict) until
 * there's been some delay between their change being accepted in 5.12
   and the next merge-bot run
 * everyone who made change to 5.12 that conflicted on merging to 5.13
   has advised Liang on how to resolve their conflicts
 * we've got the result through integration into 5.13
 * everyone who's made changes to 5.13 or (as possibly just amended in
   merging) 5.12 that conflicts with anything in dev has again advised
   how to resolve their conflicts
 * and we've got the result through a second integration, into dev.

When nothing but the change being considered has a conflict along
the way, that's twice as long; and any change to an upstream branch,
that does have a conflict, introduces delay for all the other changes
that landed in that branch, even if they don't have conflicts.  In the
middle of summer, when lots of folk are away on holiday, getting help
with resolving conflicts isn't easy - the folk who know those commits
won't be back for a month - and all changes, no matter how urgent, get
stuck behind any conflict we can't work out how to resolve.

So no, Jedrzej's claim is not *pure* speculation; it's at least quite a
lot adulterated with reasons to believe that many changes would, under
his scheme, propagate to all branches they're destined for sooner than
happens with our present scheme.

Of course, the originator of the commit has to resolve any conflicts in
all branches it lands in - but can do so immediately, not one today,
another in mumble days time when the next merge happens, etc. - and has
to whack that Stage button repeatedly on each of those target branches
after review until they get accepted - but these happen in parallel.  So
the individual patches requiring a merge, of which you note there may be
thrice as many, can indeed happen; but they happen only to each patch
separately; if the author of that patch or its reviewers just went on
holiday, they don't delay anyone else's patch's progress.  They can also
be handled in parallel - I can fix all my conflict resolutions at the
same time, my favourite reviewer on the other side of the planet can
review them before I come in the next day, and I can play whack-a-mole
with them all in parallel after that.

Each branch is only one step away from dev.  My change can realistically
hit them all the day after it lands in dev; this is in stark contrast to
a change I've just put into 5.12 (only *one* step away from dev at
present) and I don't know how long I'll have to wait until it reaches
dev and I can follow up with the related fixes that are dev-only
material.  So I have to watch merges until that happens.

My main concern is that all those conflict resolutions are going to need
reviewing (which they do at present, so that's not changed) and they
won't be part of a collective rate-limiting step.  Those helping Liang
resolve conflicts are encouraged to do so by the knowledge that the
merge is holding up lots of other things.  Those reviewing an amended
cherry-pick won't have the same urgency.  Then again, this'll happen
right after they +2ed the original that was cherry-picked.  It should
all be fresh in their minds (except if they went on holiday in between;
but then only that one change suffers, not everything).

> BTW how does this work for change files? I hope we don't have to look
> in release tags to find the change log for a particular release.

Change files happen on non-dev branches, for sure.
As Jedrzej's (**) pointed out, branch-specific changes can still happen.
There just won't be any cherry-picking from them.
And I guess we'll still merge 5.x.y, once it releases, into 5.x.

>>  - simpler for new contributors, always push to dev

> Really? Me being the new guy wanting to fix a bug in 5.12 need to
> magically know that I have to push to dev and know about a magic
> cherry-pick logic and a magic tag in the commit log.

Indeed, the proposed model isn't ideal for new contributors.
However, reviewers can explain it to them.
That's one part of the job of review, after all -
indoctrinating the 

Re: [Development] Proposal: New branch model

2019-01-23 Thread Thiago Macieira
On Wednesday, 23 January 2019 11:01:53 PST Volker Hilsheimer wrote:
> I think that’s fine. What’s much worse is having a fix in 5.12 and not
> knowing how to deal with the merge conflicts into dev. That means dev might
> regress, unless whoever authored the change is willing to spend time on
> making it work. In the end, if contributors can’t own their changes for all
> various branches of Qt, then I much prefer for them to own the changes at
> least for dev. And with Qt 6, this will become a much bigger problem.

The problem is I can turn this around and say that we introduce regressions 
into the older branches due to an improper cherry-pick that didn't conflict.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center





Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Thiago Macieira
On Wednesday, 23 January 2019 06:07:37 PST Marco Bubke wrote:
> Would it be not better to use a simple container and then functions on top
> which use a view, so we could use them with any container

If only we had a class that found boundaries in text...

http://doc.qt.io/qt-5/qtextboundaryfinder.html

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center





Re: [Development] Proposal: New branch model

2019-01-23 Thread Volker Hilsheimer
> On 23 Jan 2019, at 18:09, Alex Blasche  wrote:
> 
> 
>> 
>> From: Development  on behalf of Jedrzej 
>> Nowacki 
>> Advantages:
>> - no waiting for merges, a fix can be used right away after integration
>> - faster propagation of fixes targeting all branches, as there are no merges
>> of merges
> 
> This is pure speculation because you potentially triple (or worse) the amount 
> of individual patches requiring a merge in gerrit when you consider that you 
> want to at least merge to 5.9, 5.12, dev and qt6. I don't see this prediction
> come true.

Yes, there will be a change in those branches in which you want your change to 
land. That’s not different from today.

Each of these changes will be tested as part of a small set of changes, rather 
than as part of a bulk-merge that includes potentially a ton of changes. This 
is an improvement, as it makes smaller, more frequent changes, with a lower 
overall risk of failure.


> BTW how does this work for change files? I hope we don't have to look in 
> release tags to find the change log for a particular release.

That’s a good point.

What I have experienced in other projects is that the changes files are managed 
in the relevant branch, and then bulk-updated in the master branch once the 
ship has sailed.


>> - simpler for new contributors, always push to dev
> 
> Really? Me being the new guy wanting to fix a bug in 5.12 need to magically 
> know that I have to push to dev and know about a magic cherry-pick logic and 
> a magic tag in the commit log. Right now I need to know I want to fix in 5.12
> and push to it. Also the current model does not bother the new guy with 
> myriads of potentially following cherry-picks which may require a larger 
> commitment than he is willing to give. The entire bot logic section below is 
> another non-implicit logic.


The new or casual contributors that want to make small changes can work on dev. 
If they can’t invest the time or energy to make their fixes or changes work in 
more stable branches, then that’s cool; the fix will just not be in any of 
those, unless someone else picks up the slack.

I think that’s fine. What’s much worse is having a fix in 5.12 and not knowing 
how to deal with the merge conflicts into dev. That means dev might regress, 
unless whoever authored the change is willing to spend time on making it work. 
In the end, if contributors can’t own their changes for all various branches of 
Qt, then I much prefer for them to own the changes at least for dev. And with 
Qt 6, this will become a much bigger problem.

As for the learning curve and cognitive load introduced by the cherry-pick 
automation:

The current branching model is not easy to learn or understand, and it is not
just “fire and forget”, especially if my change causes merge conflicts that
have to be dealt with by someone else who knows nothing about my change, and
perhaps nothing about the code I worked on.

Learning how I need to tag my change if I want it to be applied to more stable 
versions of Qt, supported by an augmented commit template, is probably easy.


>> - in most cases a reviewer does no need to know what the current version
>> number is
> 
> But he does, because you require the right commit tag in the log, don't you?


Reviewers review changes against dev. Of course the reviewers should confirm 
that this change is indeed suitable for cherry-picking into more stable 
branches, as requested by the tagging.

If cherry-picking works without conflicts and without regressions, then the 
cherry-picked change doesn’t need another review. Just as today the merges are 
not code-reviewed.


>> - conflict resolution is distributed and affects mostly the author of the
>> change
> 
> The merge problems we have today don't disappear. The merge problem is 
> shifted from full time developers to once in a life time contributors. The 
> once in a lifetime contributor might not care anymore. To me this sounds like 
> a way to shift the hard problem to somebody who has the least knowledge, 
> commitment and time. The loser will be the Qt stable/lts branches. Add the 
> explosion of changes towards the CI.


See above; if you can’t own your change in multiple branches, then make it in 
dev and accept that the bug won’t be fixed in an LTS series unless someone else 
does the work.

Right now, off-loading the merge-conflict resolution to “someone else’s 
problem” is not effective, even if that someone else gets paid for it.

The explosion of changes towards the CI is a problem of scaling the CI; that 
problem needs to be solved anyway, but is IMHO a poor excuse for favouring 
infrequent, large merges with lots of changes over frequently applying a small 
set of changes.


> True, the distribution is a plus. can we possibly apply this thought to the 
> current approach? How about we distribute the current merge coordination 
> among more experienced developers. If you combine this with simpler 
> dependency management were we 

Re: [Development] Proposal: New branch model

2019-01-23 Thread Martin Smith
I understand cherrypicking can result in conflicts, but surely changing to this 
new model would also require changing the rules.

I would expect that for a particular patch to dev, a decision would be made to 
determine which released versions the patch should be cherrypicked to.

Then the person who patches dev will cherrypick his change to those versions
himself and fix the conflicts.


From: Development  on behalf of Konstantin 
Tokarev 
Sent: Wednesday, January 23, 2019 7:43:06 PM
To: Alex Blasche; development@qt-project.org
Subject: Re: [Development] Proposal: New branch model



23.01.2019, 21:38, "Alex Blasche" :
>> 
>> From: Martin Smith
>> If you make all patches in dev and then cherrypick them back to earlier 
>> versions that need them, why would you ever do a merge?
>
> At the end of the day each cherry-pick is a merge too and they can conflict 
> too. The conflict resolution process is still the same. if everything is 
> conflict free then a git merge would be no more difficult than a cherry-pick.

And when conflicts are present, cherry-picking N patches may result in N times
more work than merge in worst case (and same amount of work in the best case)

--
Regards,
Konstantin



Re: [Development] Proposal: New branch model

2019-01-23 Thread Konstantin Tokarev


23.01.2019, 21:38, "Alex Blasche" :
>> 
>> From: Martin Smith
>> If you make all patches in dev and then cherrypick them back to earlier 
>> versions that need them, why would you ever do a merge?
>
> At the end of the day each cherry-pick is a merge too and they can conflict 
> too. The conflict resolution process is still the same. if everything is 
> conflict free then a git merge would be no more difficult than a cherry-pick.

And when conflicts are present, cherry-picking N patches may result in N times
more work than merge in worst case (and same amount of work in the best case)

-- 
Regards,
Konstantin



Re: [Development] Proposal: New branch model

2019-01-23 Thread Alex Blasche

>
>From: Martin Smith
>If you make all patches in dev and then cherrypick them back to earlier 
>versions that need them, why would you ever do a merge?

At the end of the day each cherry-pick is a merge too and they can conflict
too. The conflict resolution process is still the same. If everything is
conflict free then a git merge would be no more difficult than a cherry-pick.

--
Alex



Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Edward Welbourne
Marco Bubke (23 January 2019 15:07) wrote
> Would it be not better to use a simple container and then functions on
> top which use a view, so we could use them with any container.

That sounds just fine to me.

Indeed, in separating the "Unicode text" nature from its encoding, I'm
fine with the *storage* being the encoding and the text being a view of
that storage - just as long as we get an API that lets us deal with
every form of storage (and encoding) consistently in terms of Unicode,
when the code accessing it wants to do that.
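
A minimal sketch of that separation, assuming a read-only view class over raw
encoded storage (the class name UnicodeView and its method are hypothetical,
and decoding the whole buffer on access is a shortcut taken for brevity; a
real view would decode incrementally):

```python
from typing import Iterator


class UnicodeView:
    """Read-only view that presents encoded storage as Unicode code points."""

    SUPPORTED = ("utf-8", "utf-16-le")

    def __init__(self, storage: bytes, encoding: str) -> None:
        if encoding not in self.SUPPORTED:
            raise ValueError(f"unsupported encoding: {encoding}")
        self._storage = storage    # the container: just bytes
        self._encoding = encoding  # how those bytes are to be interpreted

    def code_points(self) -> Iterator[int]:
        """Yield code points, independent of how the storage is encoded."""
        for ch in self._storage.decode(self._encoding):
            yield ord(ch)


# The same text held in two different storages looks identical through the view.
text = "\u00e9\U0001F600"
as_utf8 = UnicodeView(text.encode("utf-8"), "utf-8")
as_utf16 = UnicodeView(text.encode("utf-16-le"), "utf-16-le")
print(list(as_utf8.code_points()) == list(as_utf16.code_points()))
```

Code written against the view only ever handles code points, so the producer
is free to keep whichever encoding it already has.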

Eddy.


Re: [Development] Proposal: New branch model

2019-01-23 Thread Thiago Macieira
On Wednesday, 23 January 2019 08:37:57 PST Konstantin Tokarev wrote:
> >   Disadvantages:
> >   - git history would be a bit wilder, "git branch --contains" would not
> >   work
> >   - commit messages in some branches would have kind of ugly footer as an
> >     effect of "cherry-pick -x"
> 
> Gerrit's Change-Id can be used to track presence of patch in branches of
> interest

Yes, but not as easily, since the git branch --contains and git tag --contains 
are pure DAG operations. The search you're talking about is a text search 
(usually implemented by a regexp search) on the commit message, with no DAG 
boundaries. You have to scan all valid branches for a given string.

And then you still have to run git branch --contains on each entry you found 
to figure out which branches contain those commits.

So this is scriptable. It's going to be something like 100x slower than today, 
but it should still finish within 10 seconds, even on slow machines.

Test in qtbase (Linux):
Run: git rev-list --grep=5d809703aa2d2a08ae7e9610fd42025b081d3d0c --all
Output: bd38ff3c5456b1f2fc03e4899e73d650ad5f858a
Runtime: 0,52s user 0,03s system 99% cpu 0,555 total

Same test on Windows: real 0m0.800s, user 0m0.593s, sys 0m0.171s

(Both machines are Intel Skylakes, and both have SSDs and 16 GB of RAM)
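
A sketch of how the two steps could be scripted, assuming git is on PATH and
the script runs against a local clone. The function names are made up for
this illustration; only the git rev-list --grep ... --all and
git branch --contains invocations come from the discussion above.

```python
import subprocess
from typing import Dict, List


def rev_list_command(change_id: str) -> List[str]:
    """Build the text search over all refs for a Change-Id (or any string)."""
    return ["git", "rev-list", f"--grep={change_id}", "--all"]


def parse_branch_output(output: str) -> List[str]:
    """Strip the two-character status prefix git puts before each branch name.

    Symbolic entries such as 'remotes/origin/HEAD -> origin/dev' are kept
    as printed; filtering them is left out of this sketch.
    """
    return [line[2:].strip() for line in output.splitlines() if line.strip()]


def branches_with_change(change_id: str, repo: str = ".") -> Dict[str, List[str]]:
    """Map each commit mentioning change_id to the branches that contain it."""
    found = subprocess.run(
        rev_list_command(change_id), cwd=repo, check=True,
        capture_output=True, text=True,
    ).stdout.split()
    result: Dict[str, List[str]] = {}
    for sha in found:
        # Second step: a pure DAG query per commit found by the text search.
        listing = subprocess.run(
            ["git", "branch", "--all", "--contains", sha], cwd=repo, check=True,
            capture_output=True, text=True,
        ).stdout
        result[sha] = parse_branch_output(listing)
    return result
```

Each cherry-pick of a change is a distinct commit, so the result typically has
one entry per branch family the change was picked into.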

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center





Re: [Development] Proposal: New branch model

2019-01-23 Thread Alex Blasche

>
>From: Development  on behalf of Jedrzej 
>Nowacki 
>  Advantages:
>  - no waiting for merges, a fix can be used right away after integration
>  - faster propagation of fixes targeting all branches, as there are no merges
>  of merges

This is pure speculation because you potentially triple (or worse) the amount 
of individual patches requiring a merge in gerrit when you consider that you 
want to at least merge to 5.9, 5.12, dev and qt6. I don't see this prediction
come true.

BTW how does this work for change files? I hope we don't have to look in 
release tags to find the change log for a particular release.

>  - simpler for new contributors, always push to dev

Really? Me being the new guy wanting to fix a bug in 5.12 need to magically 
know that I have to push to dev and know about a magic cherry-pick logic and a 
magic tag in the commit log. Right now I need to know I want to fix in 5.12 and
push to it. Also the current model does not bother the new guy with myriads of 
potentially following cherry-picks which may require a larger commitment than 
he is willing to give. The entire bot logic section below is another 
non-implicit logic.

>  - in most cases a reviewer does no need to know what the current version
>number is

But he does, because you require the right commit tag in the log, don't you?

>  - conflict resolution is distributed and affects mostly the author of the
>change

The merge problems we have today don't disappear. The merge problem is shifted 
from full time developers to once in a life time contributors. The once in a 
lifetime contributor might not care anymore. To me this sounds like a way to 
shift the hard problem to somebody who has the least knowledge, commitment and 
time. The loser will be the Qt stable/lts branches. Add the explosion of 
changes towards the CI.

True, the distribution is a plus. Can we possibly apply this thought to the
current approach? How about we distribute the current merge coordination among
more experienced developers? If you combine this with simpler dependency
management where we don't have to wait for qt5.git to merge, the merge task
distribution becomes easier.


>  - documents a change intent, which may be useful for people keeping own
>forks
>  - over time with increased amount of conflicts old branches, in natural way,
>stay untouched

>  Disadvantages:
>  - git history would be a bit wilder, "git branch --contains" would not work
>  - commit messages in some branches would have kind of ugly footer as an
>effect of "cherry-pick -x"
>  - there is a chance, that some cherry-picked commits may be left forgotten
>in gerrit after a failed integration

I see this as a serious problem: not losing changes was one of the biggest
advantages, if not *the* advantage, of the current system, and the most
experienced devs were responsible for ensuring it.


--
Alex
___
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development


Re: [Development] Proposal: New branch model

2019-01-23 Thread Volker Hilsheimer
On 23 Jan 2019, at 17:20, Mitch Curtis <mitch.cur...@qt.io> wrote:

[…snip…]

>> Advantages:
>> - no waiting for merges, a fix can be used right away after integration
>
> Sounds nice!
>
>> - faster propagation of fixes targeting all branches, as there are no merges
>> of merges
>> - simpler for new contributors, always push to dev
>> - in most cases a reviewer does not need to know what the current version
>> number is
>> - conflict resolution is distributed and affects mostly the author of the
>> change
>
> Do I understand this correctly: conflicts would happen when the initial patch
> is pushed to dev and on each subsequent cherry-pick that the bot pushes?

Potentially, yes.

The real advantage is that I, as the developer responsible for the change, can
decide how to make it work in more stable branches. Anything from “don’t care”
to “I know it works in LTS” to “make a completely different fix with separate
code review for the LTS” - my decision, which I can verify locally.

I.e. distributed conflict resolution, vs “the one person watching over the
integration has to deal with all the conflicts”.


>> - documents a change intent, which may be useful for people keeping own
>> forks
>> - over time with increased amount of conflicts old branches, in natural way,
>> stay untouched
>
> What does this mean?


It becomes less likely that we try to make a fix that works in dev and e.g. 5.12
also work in 5.9 if the change doesn’t apply cleanly. That means fewer risky
changes in older branches.

It also means fewer fixes in older branches, but the decision what to fix or 
not for a certain release is somewhat orthogonal. If we decide that a fix is 
needed for 5.9.x, then this model doesn’t stop us from investing the time on 
making the change.



>> Disadvantages:
>> - git history would be a bit wilder, "git branch --contains" would not work
>
> This would be a bit of a pain for me personally, as I use this command often to
> e.g. know if I have a certain fix.
>
> Would there be an alternative command?


It’s rather trivial to create a script for this, as long as we use
cherry-pick -x.

Besides, since we do use cherry-picking already today for some branches, git
branch --contains already produces incomplete/incorrect results.
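
A minimal sketch of what such a script could look like, assuming every
backport is created with cherry-pick -x, so that the picked commit carries the
"(cherry picked from commit <sha>)" line that git appends. The function names
are hypothetical, and the full 40-character SHA of the original commit is
expected, since that is what -x records.

```python
import subprocess


def picked_from(message):
    """Return the original SHAs recorded by `git cherry-pick -x` in a commit message."""
    prefix = "(cherry picked from commit "
    shas = []
    for line in message.splitlines():
        line = line.strip()
        if line.startswith(prefix) and line.endswith(")"):
            shas.append(line[len(prefix):-1])
    return shas


def branch_has_fix(branch, sha, repo="."):
    """True if `branch` contains `sha` itself or a cherry-pick -x copy of it."""
    # Direct containment: exit code 0 means sha is an ancestor of the branch tip.
    direct = subprocess.run(
        ["git", "-C", repo, "merge-base", "--is-ancestor", sha, branch],
        capture_output=True)
    if direct.returncode == 0:
        return True
    # Otherwise search the branch history for a commit that names sha as its origin.
    log = subprocess.run(
        ["git", "-C", repo, "log", branch, "-F",
         "--grep=cherry picked from commit " + sha, "--format=%H"],
        capture_output=True, text=True, check=True)
    return bool(log.stdout.strip())
```

Something like branch_has_fix("5.12", sha) would then stand in for "git branch
--contains" on branches that receive cherry-picks rather than merges.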

[…snap…]

>> What if we need a release blocker fix, right now!
>> -
>>
>> The setup prioritizes dev. So a release branch would get a fix with a delay of
>> one dev integration.
>
> So in the end, in terms of number of integrations, it's kinda similar to the
> current approach where the change has to go through two CI integrations? E.g.
> the original change in qtdeclarative.git and then the qt5 submodule update
> before that code can be used in a dependent module like qtquickcontrols2.git?


This proposal doesn’t change anything about that.

Volker



Re: [Development] Proposal: New branch model

2019-01-23 Thread Volker Hilsheimer
On 23 Jan 2019, at 17:08, Jaroslaw Kobus <jaroslaw.ko...@qt.io> wrote:

"All(**) changes would go to dev. From which they would be automatically
cherry-picked by a bot to other branches. The decision to which branch
cherry-pick, would be taken based on a tag in the commit message. We could add a
footer that marks the change risk level as in quip-5"

Not sure I understand the above correctly. Let's say in the dev branch some
source file got refactored completely, so that not a single line matches the old
version anymore, e.g. Qt 5.9. Now you need to fix the old code, which is in the
5.9 branch - how in this case would you push your fix to dev?

Jarek


That’s where the “with exception of branch specific changes” clause (which the 
** points at) kicks in.

Is the fix needed in dev (or is the bug fixed by the refactoring)?

If yes, then fix it in dev, and then make a separate fix in the relevant LTS 
branches (perhaps starting with the cherry-pick’ed change). Or just accept that 
this bug won’t/can’t be fixed in the pre-refactoring codebase.

If no, then push the fix for the newest branch where it’s needed, from where it 
can be cherry picked further; don’t do anything in dev (including “don’t expect 
someone that knows nothing about your change to deal with the merge conflict”).


Volker






Re: [Development] Proposal: New branch model

2019-01-23 Thread Mitch Curtis
> -Original Message-
> From: Development  On Behalf Of
> Jedrzej Nowacki
> Sent: Wednesday, 23 January 2019 4:51 PM
> To: development@qt-project.org
> Subject: [Development] Proposal: New branch model
> 
> Hi,
> 
>   It is time to rethink our branch model. We are approaching Qt6
> development and I'm worried that what we have now, will simply not scale.
> As you know, our branch model is mainly(*) based on merging from stable up
> to development branches. In general, it is a very good model, which make
> sure that release branches are not getting obsolete too quickly, that most of
> the fixes are in the right places, that every commit is only once in the git
> history. It is a very clean model. It is also a very slow model, with a single
> point of failure.
> 
>   It is hard to maintain:
>   My impression is that the current model works great for a single release
> branch, but we have: 5.6 5.9 5.12 and soon we will have 5.13, that is without
> counting temporary patch level branches. Merging through them is hard, we
> already had to use "cherry-pick mode" in some places. When we add to the
> picture dev and qt6 branches, merging will be very, very hard. It is
> practically a full time job to update qt5 repository and coordinate all the
> merges now (thank you Liang!), shortly after qt6 branch opening amount of
> work will be much bigger.
>   It is slow:
>   The merges take time. We produce a lot of code, we have a lot of tests that
> needs to pass. Even single failure delays merge propagation by at least one
> day. If by bad luck, the merge contains some API incompatible changes an
> intermediate jump through Qt5 integration is required, that adds at least 3
> days of delay.
> 
>   --
> 
>   Proposal in short: let's use cherry-pick mode everywhere.
> 
>   All(**) changes would go to dev. From which they would be automatically
> cherry-picked by a bot to other branches. The decision to which branch
> cherry-pick, would be taken based on a tag in the commit message. We
> could add a footer that marks the change risk level as in quip-5
> (http://quips-qt-io.herokuapp.com/quip-0005.html), so for example "dev",
> "stable", "LTS". By default everything would be cherry-picked to qt6 branch
> unless "no-future" tag would be given. Of course we can bike-shed about the
> tag names.
> 
>   Advantages:
>   - no waiting for merges, a fix can be used right away after integration

Sounds nice!

>   - faster propagation of fixes targeting all branches, as there are no merges
> of merges
>   - simpler for new contributors, always push to dev
>   - in most cases a reviewer does no need to know what the current version
> number is
>   - conflict resolution is distributed and affects mostly the author of the
> change

Do I understand this correctly: conflicts would happen when the initial patch 
is pushed to dev and on each subsequent cherry-pick that the bot pushes?

>   - documents a change intent, which may be useful for people keeping own
> forks
>   - over time with increased amount of conflicts old branches, in natural way,
> stay untouched

What does this mean?

>   Disadvantages:
>   - git history would be a bit wilder, "git branch --contains" would not work

This would be a bit of a pain for me personally, as I use this command often to 
e.g. know if I have a certain fix.

Would there be an alternative command?

>   - commit messages in some branches would have kind of ugly footer as an
> effect of "cherry-pick -x"
>   - there is a chance, that some cherry-picked commits may be left forgotten
> in gerrit after a failed integration
> 
>   Bot details:
> 
>   The bot would listen only for changes in dev, in some unusual cases one
> could target an other branch directly, but bot would not trigger automatic
> cherry-pick(***). The bot would wait for a successful dev integration before
> creating cherry-picked changes. The bot would use cherry-pick -x to annotate
> the origin patch. After the cherry-pick creation, it would push it to gerrit,
> +2 and stage once. It would be up to the author to re-stage in case of
> flakiness. In case of a cherry-pick conflict it should push unresolved
> conflict to gerrit and add all reviewers and author to handle the issue.
> 
> The list below shows branch targets for automatic cherry-pick based on given
> tag.
> 
> dev (qt6)
> stable (qt6, 5.13)
> stable, no-future (5.13)
> LTS (qt6, 5.13, 5.12)
> LTS-strict (qt6, 5.13, 5.12, 5.9)
> LTS, no-future (5.13, 5.12)
> 
> That is assuming that we have branches: qt6  dev  5.13  5.13.q  5.12  5.12.w
> 5.9  5.9.e  5.6  5.6.r
> I think we should assume that patch level branches, as well LTS in very strict
> state, are handled manually.
> 
> 
>   Creation of new branches:
> We would branch more or less as usual. The only difference from the current
> system is that we would not need the back merge / soft branching anymore,
> but of course we could keep it.
> 
> 

Re: [Development] Proposal: New branch model

2019-01-23 Thread Jaroslaw Kobus
"All(**) changes would go to dev. From which they would be automatically
cherry-picked by a bot to other branches. The decision to which branch
cherry-pick, would be taken based on a tag in the commit message. We could add a
footer that marks the change risk level as in quip-5"


Not sure I understand the above correctly. Let's say in the dev branch some
source file got refactored completely, so that not a single line matches the old
version anymore, e.g. Qt 5.9. Now you need to fix the old code, which is in the
5.9 branch - how in this case would you push your fix to dev?


Jarek



Re: [Development] HEADS-UP: Branching from '5.12' to '5.12.1' started

2019-01-23 Thread Kai Koehne
> -Original Message-
> From: Development  On Behalf Of ekke
> Sent: Wednesday, January 23, 2019 9:45 AM
> To: development@qt-project.org
> Subject: Re: [Development] HEADS-UP: Branching from '5.12' to '5.12.1' started
> 
> seems this time it's a rocky path to release Qt 5.12.1

We're still struggling with a change in configure that made system library 
paths absolute - good for local installations, but problematic for the binary 
packages, which are supposed to work in different setups:

 https://bugreports.qt.io/browse/QTBUG-72903

Kai



Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Thiago Macieira
On Wednesday, 23 January 2019 07:25:44 PST Jason H wrote:
> > From: "Arnaud Clère" 
> > 
> > > And I don't want to add QUtf8String until SG16's char8_t gets settled.
> > > It'll probably be settled by C++20, which means we can probably work on
> > > this during Qt 6 lifetime, possibly even 6.1 or 6.2.
> >
> > It makes sense to avoid future incompatibilities with the standard but
> > fortunately Qt sometimes chooses to solve real problems ahead in time ;-)
> Well C++20 is really how many months away? Qt6 won't be released until when?

Give me the exact answers and I'll tell you if we can have this in Qt 6.0.

The fact you can't is the problem: they're too much in flux and too close to 
each other for us to be able to accept char8_t as an established functionality 
that won't change by a later paper and design a solution for Qt 6.0. If we're 
lucky, we can do it. More likely, we'll have to wait a bit, possibly even for 
a compiler to implement it.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center





[Development] Proposal: New branch model

2019-01-23 Thread Jedrzej Nowacki
Hi,

  It is time to rethink our branch model. We are approaching Qt6 development
and I'm worried that what we have now will simply not scale. As you know, our
branch model is mainly(*) based on merging from stable up to development
branches. In general, it is a very good model, which makes sure that release
branches do not become obsolete too quickly, that most of the fixes are in
the right places, and that every commit appears only once in the git history.
It is a very clean model. It is also a very slow model, with a single point of
failure.

  It is hard to maintain:
  My impression is that the current model works great for a single release
branch, but we have 5.6, 5.9, 5.12 and soon 5.13, and that is without
counting temporary patch-level branches. Merging through them is hard; we
already had to use "cherry-pick mode" in some places. When we add the dev and
qt6 branches to the picture, merging will be very, very hard. It is already
practically a full-time job to update the qt5 repository and coordinate all the
merges (thank you Liang!); shortly after the qt6 branch opens, the amount of
work will be much bigger.
  It is slow:
  The merges take time. We produce a lot of code, and we have a lot of tests
that need to pass. Even a single failure delays merge propagation by at least
one day. If, by bad luck, the merge contains some API-incompatible changes, an
intermediate jump through Qt5 integration is required, which adds at least 3
days of delay.

  --

  Proposal in short: let's use cherry-pick mode everywhere.

  All(**) changes would go to dev, from which they would be automatically
cherry-picked by a bot to other branches. The decision about which branches to
cherry-pick to would be based on a tag in the commit message. We could add a
footer that marks the change risk level as in quip-5
(http://quips-qt-io.herokuapp.com/quip-0005.html), for example "dev",
"stable", "LTS". By default everything would be cherry-picked to the qt6 branch
unless a "no-future" tag is given. Of course we can bike-shed about the tag
names.

  Advantages:
  - no waiting for merges, a fix can be used right away after integration
  - faster propagation of fixes targeting all branches, as there are no merges 
of merges
  - simpler for new contributors, always push to dev
  - in most cases a reviewer does not need to know what the current version
number is
  - conflict resolution is distributed and affects mostly the author of the
change
  - documents the intent of a change, which may be useful for people keeping
their own forks
  - over time, as the number of conflicts grows, old branches naturally
stay untouched

  Disadvantages:
  - git history would be a bit wilder, "git branch --contains" would not work
  - commit messages in some branches would have a somewhat ugly footer as a
result of "cherry-pick -x"
  - there is a chance that some cherry-picked commits are left forgotten
in gerrit after a failed integration

  Bot details:

  The bot would listen only for changes in dev. In some unusual cases one
could target another branch directly, but then the bot would not trigger an
automatic cherry-pick(***). The bot would wait for a successful dev integration
before creating cherry-picked changes. The bot would use cherry-pick -x to
annotate the original patch. After creating the cherry-pick, it would push it
to gerrit, +2 it and stage it once. It would be up to the author to re-stage in
case of flakiness. In case of a cherry-pick conflict, it should push the
unresolved conflict to gerrit and add all reviewers and the author to handle
the issue.
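
The flow described above can be written down as a small decision function.
This is only a sketch of the steps as stated in this mail, not an existing bot;
the function name and the step strings are hypothetical.

```python
def plan_cherry_pick(dev_integrated, applies_cleanly):
    """Return the bot's steps for one target branch, in order."""
    if not dev_integrated:
        # Nothing is picked before the change has integrated in dev.
        return ["wait for successful dev integration"]
    steps = ["git cherry-pick -x <sha>"]
    if applies_cleanly:
        # Happy path: the bot approves and stages exactly once;
        # re-staging after a flaky run is left to the author.
        steps += ["push to gerrit", "+2", "stage once"]
    else:
        # Conflict: the unresolved change goes to gerrit for humans to fix.
        steps += ["push unresolved conflict to gerrit",
                  "add all reviewers and author"]
    return steps
```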

The list below shows branch targets for automatic cherry-pick based on given 
tag.

dev (qt6)
stable (qt6, 5.13)
stable, no-future (5.13)
LTS (qt6, 5.13, 5.12)
LTS-strict (qt6, 5.13, 5.12, 5.9)
LTS, no-future (5.13, 5.12)

That is assuming that we have branches: qt6  dev  5.13  5.13.q  5.12  5.12.w
5.9  5.9.e  5.6  5.6.r
I think we should assume that patch-level branches, as well as LTS branches in
the very strict state, are handled manually.
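
As a sketch, the tag list above boils down to a lookup table plus one modifier.
The names are hypothetical; the branch lists are copied from the list above,
and the handling of "dev" combined with "no-future" (no cherry-pick at all) is
an assumption, since that combination is not listed.

```python
# Risk tag -> branches the bot would cherry-pick to (change lands in dev first).
TARGETS = {
    "dev": ["qt6"],
    "stable": ["qt6", "5.13"],
    "LTS": ["qt6", "5.13", "5.12"],
    "LTS-strict": ["qt6", "5.13", "5.12", "5.9"],
}


def cherry_pick_targets(risk, no_future=False):
    """Return the target branches for a risk tag, honouring "no-future"."""
    branches = list(TARGETS[risk])  # copy, so the table itself stays untouched
    if no_future:
        branches.remove("qt6")      # "no-future" only drops the qt6 pick
    return branches
```

For example, cherry_pick_targets("LTS", no_future=True) gives 5.13 and 5.12,
matching the "LTS, no-future" row.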


  Creation of new branches:
We would branch more or less as usual. The only difference from the current 
system is that we would not need the back merge / soft branching anymore, but 
of course we could keep it.



  Why not X instead?
  --

  - GitFlow, GitHub <= both are based on feature branches, which don't work
well with gerrit.
  - Stay with the current solution <= the merge effort is too big and qt6 is 
expected to cause conflicts that really should not be solved by one person

  What to do around Qt6 release?
  --

  Replace dev branch with qt6 branch content, do not use "no-future" tag 
anymore. From a random contributor perspective nothing changes.

  Can we use annotate instead of cherry-pick -x?
  --

  No, but we should use it in addition. 

Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Thiago Macieira
On Wednesday, 23 January 2019 05:53:00 PST Edward Welbourne wrote:
> What are our chances of getting this right in Qt 6 ?

Not bad. But what you described is what SG16 is working on for std::text. So 
let's not do something different from them. We can prototype it and be first, 
though.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center





Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Jason H
> From: "Arnaud Clère" 
> > And I don't want to add QUtf8String until SG16's char8_t gets settled. 
> > It'll probably be settled by C++20, which means we can probably work on 
> > this during Qt 6 lifetime, possibly even 6.1 or 6.2.
> 
> It makes sense to avoid future incompatibilities with the standard but 
> fortunately Qt sometimes chooses to solve real problems ahead in time  ;-)

Well C++20 is really how many months away? Qt6 won't be released until when? It 
seems like both of these might land at the same time, except that the "by 
C++20" is (AFAICT) speculation. Uptake will also be slow. But by Qt being first 
we can get experience with the nature of the solution which might help inform 
the standard, or vice-versa. There's a risk we do something that conflicts with 
the standard in a useful way that people like, then we have fragmentation. 

Far smarter people than I have worked on this, so again burn this with fire, 
but my current thinking is: 
I think the problem is how all these things are implemented - they are 
basically escape codes, so it's impossible to say where the current character
ends and the next begins. This of course kills speed, but that's what we get 
for having more than one language on the planet plus emojis. It seems to me 
that the only real solution to keep it all fast is to progressively upgrade 
from bytes to the widest character and use that. This will have a scanning cost 
when it enters the address space if not denoted to the compiler or by the load 
method.  If memory is a concern, the only alternative I see is to create a 
complex string: "strings" are now arrays of character arrays of uniform width, 
and hope that it is only ever one:
"Ground control to Major Tom" - single sequence of 8 bit chars, len 27 size 27
"niños." encoded as 3 "strings", total length 6, size 7:
+ "ni" - "ni" (8 bit char sequence of 2 char)
+ "ñ" -  00F1 (UTF16 16 bit char sequence of 1 char)
+ "os." - "os." (8 bit char sequence of 3 char)
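
A rough sketch of that segmentation, to make the numbers above concrete. The
names are hypothetical, and it assumes "8 bit" means plain ASCII, which is what
makes "ñ" land in a 16-bit run as in the example.

```python
def width(ch):
    """Bytes per character for the run this character belongs to."""
    cp = ord(ch)
    if cp < 0x80:
        return 1   # ASCII fits an 8-bit run
    if cp < 0x10000:
        return 2   # rest of the BMP fits a 16-bit run
    return 4       # everything else needs 32 bits


def segment(text):
    """Split text into (substring, width) runs of uniform character width."""
    runs = []
    for ch in text:
        w = width(ch)
        if runs and runs[-1][1] == w:
            runs[-1] = (runs[-1][0] + ch, w)   # extend the current run
        else:
            runs.append((ch, w))               # width changed: start a new run
    return runs


def size_in_bytes(runs):
    """Total storage for the character data, ignoring per-run bookkeeping."""
    return sum(len(s) * w for s, w in runs)
```

Indexing by character then means walking the runs, so it is only cheap when
there are few of them, which is the "hope that it is only ever one" part.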

In the old days (over 20 years ago) I read, in Dr. Dobb's or some other print
medium, that BASIC (I forget which one) stored strings as a linked list of
characters; I'm adapting that idea. There are many tradeoffs, but until we're
OK with 32-bit characters, there will be tradeoffs on a multi-language planet.

I just don't think escape codes should ever be stored in memory. Disk is fine. 

"Better to remain silent and be thought a fool than to speak and to remove all 
doubt." - (Disputed). I think I may have broken that rule here. "Please, be 
gentle." - Peter Venkman



Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Konstantin Tokarev


23.01.2019, 16:55, "Edward Welbourne" :
> All of this discussion ignores a major elephant: QString's indexing is
> by 16-bit UTF-16 tokens, not by Unicode characters. We've had Unicode
> for a couple of decades now.
>
> We *should* have a string type (I don't care what you call it) that acts
> on strings indexed by Unicode characters, not in terms of a
> representation. Whether that string type internally uses UTF-16 or
> UTF-8 should be invisible to its user. Ideally it would be capable of
> carrying its data internally in either form (so as to avoid needless
> conversion when both producer and consumer use the same form) and of
> converting between the two (e.g. so as to append efficiently) as needed.

I think this is excessive. Most common operations with strings in application
code are:
* Pass the string around or compare as an opaque token
* Draw the string on screen e.g. with QPainter (while technically it falls in
the previous category, I think it's important enough to deserve separate item)
* Find substring or pattern (regex) inside the string
* Split the string by character, pattern, or index boundaries found by means
of previous item

I think the only common cases where dealing with Unicode grapheme clusters
is required are
* Handling of text cursor movement
* Implementation of text shaping, i.e. what Harfbuzz is doing

I think having a special iterator would be quite enough for the cursor case.
Such an iterator could abstract away the underlying encoding, instead of
forcing everyone to convert to UTF-16 first.
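
To illustrate what I mean, here is a sketch of two iterators that yield the
same (offset, code point) stream from either storage form. It is deliberately
simplified: it stops at code points (a cursor iterator would add grapheme
cluster segmentation on top), it assumes the UTF-8 input is valid, and the
function names are made up.

```python
def code_points_utf16(units):
    """Yield (index, code point) from a sequence of 16-bit code units."""
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u < 0xDC00
        if is_high and i + 1 < len(units) and 0xDC00 <= units[i + 1] < 0xE000:
            # Surrogate pair: combine both halves into one code point.
            yield i, 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            yield i, u   # BMP character (a lone surrogate is passed through)
            i += 1


def code_points_utf8(data):
    """Yield (index, code point) from valid UTF-8 bytes (no validation done)."""
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            n, cp = 1, b
        elif b < 0xE0:
            n, cp = 2, b & 0x1F
        elif b < 0xF0:
            n, cp = 3, b & 0x0F
        else:
            n, cp = 4, b & 0x07
        for c in data[i + 1:i + n]:
            cp = (cp << 6) | (c & 0x3F)   # append 6 payload bits per byte
        yield i, cp
        i += n
```

Code written against either iterator sees identical code points; only the
offsets differ, and those stay meaningful for the storage they came from.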

>
> Meanwhile, buffers of data (whether 8-bit, 16-bit or of other sizes) are
> types we do need in diverse places - but they should be described
> differently from the string type (call it a "text" type, if hysterical
> reasons oblige us to use "string" for its encoding). They can be
> interpreted as strings, hence can serve as backing-store for a string,
> provided they respect the relevant rules of a relevant encoding.
>
> If blob[index] always returns a Unicode *character*, then blob is a
> string; if it can sometimes return one half of a UTF-16 surrogate pair
> (as is the case with QString today) or one byte of a multi-byte UTF-8
> chunk, then blob is not really a string, it's just the storage for an
> encoding of a string.
>
> What are our chances of getting this right in Qt 6 ?
> It's the 21st century - way past time we did this,
>
> Eddy.

-- 
Regards,
Konstantin



Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Marco Bubke
I am not sure it would be a good idea, because a glyph can still be composed of
more than one code point, which is language dependent. Sometimes you want
characters, sometimes code points and sometimes glyphs etc. Would it not be
better to use a simple container and then functions on top which use a view, so
that we could use them with any container? That way we would avoid any
allocations for transforming characters from one container to the other. But
anyway, I think there are so many uses for strings that one class to tackle all
these problems is not enough.


From: Development  on behalf of Edward 
Welbourne 
Sent: Wednesday, January 23, 2019 2:53:00 PM
To: Arnaud Clère; Thiago Macieira
Cc: development@qt-project.org
Subject: Re: [Development] Qt6: Adding UTF-8 storage support to QString

All of this discussion ignores a major elephant: QString's indexing is
by 16-bit UTF-16 tokens, not by Unicode characters.  We've had Unicode
for a couple of decades now.

We *should* have a string type (I don't care what you call it) that acts
on strings indexed by Unicode characters, not in terms of a
representation.  Whether that string type internally uses UTF-16 or
UTF-8 should be invisible to its user.  Ideally it would be capable of
carrying its data internally in either form (so as to avoid needless
conversion when both producer and consumer use the same form) and of
converting between the two (e.g. so as to append efficiently) as needed.

Meanwhile, buffers of data (whether 8-bit, 16-bit or of other sizes) are
types we do need in diverse places - but they should be described
differently from the string type (call it a "text" type, if hysterical
reasons oblige us to use "string" for its encoding).  They can be
interpreted as strings, hence can serve as backing-store for a string,
provided they respect the relevant rules of a relevant encoding.

If blob[index] always returns a Unicode *character*, then blob is a
string; if it can sometimes return one half of a UTF-16 surrogate pair
(as is the case with QString today) or one byte of a multi-byte UTF-8
chunk, then blob is not really a string, it's just the storage for an
encoding of a string.

What are our chances of getting this right in Qt 6 ?
It's the 21st century - way past time we did this,

Eddy.


Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Edward Welbourne
All of this discussion ignores a major elephant: QString's indexing is
by 16-bit UTF-16 tokens, not by Unicode characters.  We've had Unicode
for a couple of decades now.

We *should* have a string type (I don't care what you call it) that acts
on strings indexed by Unicode characters, not in terms of a
representation.  Whether that string type internally uses UTF-16 or
UTF-8 should be invisible to its user.  Ideally it would be capable of
carrying its data internally in either form (so as to avoid needless
conversion when both producer and consumer use the same form) and of
converting between the two (e.g. so as to append efficiently) as needed.

Meanwhile, buffers of data (whether 8-bit, 16-bit or of other sizes) are
types we do need in diverse places - but they should be described
differently from the string type (call it a "text" type, if hysterical
reasons oblige us to use "string" for its encoding).  They can be
interpreted as strings, hence can serve as backing-store for a string,
provided they respect the relevant rules of a relevant encoding.

If blob[index] always returns a Unicode *character*, then blob is a
string; if it can sometimes return one half of a UTF-16 surrogate pair
(as is the case with QString today) or one byte of a multi-byte UTF-8
chunk, then blob is not really a string, it's just the storage for an
encoding of a string.
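
To make the distinction concrete, here is a minimal sketch in standard C++
(std::u16string_view stands in for a UTF-16 buffer; this is not QString code).
The helper code_point_at is hypothetical. It shows that indexing by code unit
can land on half of a surrogate pair, while indexing by code point needs a
linear walk over the buffer.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

constexpr bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
constexpr bool is_low_surrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Hypothetical helper: return the index-th code point (not code unit),
// or nullopt if index is past the end. This is an O(n) walk, not O(1).
inline std::optional<char32_t> code_point_at(std::u16string_view s, std::size_t index) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t cp = s[i];
        if (is_high_surrogate(s[i]) && i + 1 < s.size() && is_low_surrogate(s[i + 1])) {
            // Combine the pair into one code point and skip the low surrogate.
            cp = 0x10000 + ((char32_t(s[i]) - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            ++i;
        }
        if (index-- == 0) return cp;
    }
    return std::nullopt;
}
```

For the text u"x\U0001F600y" the buffer holds four code units, and s[1] is a
lone high surrogate, whereas code_point_at(s, 1) yields U+1F600 and
code_point_at(s, 2) yields 'y'.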

What are our chances of getting this right in Qt 6 ?
It's the 21st century - way past time we did this,

Eddy.


Re: [Development] Qt6: Adding UTF-8 storage support to QString

2019-01-23 Thread Arnaud Clère
> -----Original Message-----
> From: Thiago Macieira  
>
> On Tuesday, 22 January 2019 09:01:16 PST Arnaud Clère wrote:
> > QByteArray is the official way to deal with utf8 strings but:
> > 1. This discussion shows it is not as well known as it should be, and I
> >    argue the name does not help
> > 2. Dealing with binary data and all kinds of string encodings in a single
> >    class is error-prone
>
> And yet that's what we used to have in Qt 3 (remember QCString?). We went 
> away from it for a reason.

Sorry no, I never used Qt3. I just googled it looking for problems and only 
found ones that should be solved now by QByteArray:
- explicit sharing
- bad performance due to append() being O(length()) since it scans for a null 
terminator

> And 3: some character-mutating operations in QByteArray (toUpper, etc.) are 
> Latin1, not UTF-8.

A QUtf8String could override toUpper() and toLower(), whose Latin-1 behaviour
is unfortunate if QByteArray really is the official way to deal with UTF-8
strings...
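
To illustrate why a Latin-1 case mapping is a poor fit for UTF-8 data, here is
a minimal sketch in standard C++. The function latin1_to_upper is a
hypothetical byte-wise mapping written for this example; it is not QByteArray's
implementation. It is correct for Latin-1 input, but on UTF-8 input it either
leaves non-ASCII letters untouched or rewrites a lead byte and corrupts the
sequence.

```cpp
#include <string>

// Hypothetical byte-wise Latin-1 uppercasing (illustration only, not Qt code).
inline std::string latin1_to_upper(std::string bytes) {
    for (char& c : bytes) {
        const unsigned char u = static_cast<unsigned char>(c);
        const bool ascii_lower = u >= 'a' && u <= 'z';
        // Latin-1 lowercase letters 0xE0..0xFE, except 0xF7 (division sign).
        const bool latin1_lower = u >= 0xE0 && u <= 0xFE && u != 0xF7;
        if (ascii_lower || latin1_lower)
            c = static_cast<char>(u - 0x20);
    }
    return bytes;
}
```

The Latin-1 byte 0xE9 (e with acute) becomes 0xC9 as intended. The UTF-8
encoding of the same letter, 0xC3 0xA9, comes back unchanged, and the UTF-8
sequence 0xE3 0x81 0x82 (U+3042) has its lead byte rewritten to 0xC3, which no
longer decodes to the original character.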

> > Hence my suggestion of adding a QUtf8String deriving from QByteArray...
> Not likely to happen. If we add a QUtf8String, it will be like QLatin1String, 
> which in turn was meant to be similar to QStringView, not like QString. That 
> means no mutation and no owning memory.

The use case I am talking about is really a mutable utf8 container, even though 
it could provide a QUtf8StringLiteral macro similar to QByteArrayLiteral. I do 
not understand why a QUtf8String should necessarily be like a QLatin1String.

OTOH, I would love to be able to manipulate QLatin1String/QUtf8String with a 
QStringView when dealing with possibly non-ASCII content. But QStringView seems 
to require knowing the number of remaining Unicode characters in constant time, 
so I guess it is out of the question...
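
The cost in question can be sketched in standard C++ (the helper name
utf8_code_point_count is hypothetical, and well-formed input is assumed). The
length of a UTF-8 buffer in bytes is known in constant time, but the number of
code points is only known after scanning every byte.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: count code points in well-formed UTF-8 by counting
// every byte that is not a continuation byte (10xxxxxx). This is O(n).
inline std::size_t utf8_code_point_count(std::string_view s) {
    std::size_t n = 0;
    for (unsigned char b : s)
        if ((b & 0xC0) != 0x80) ++n;
    return n;
}
```

For the eight bytes encoding 'a', U+20AC and U+1F600, size() reports 8
immediately, while utf8_code_point_count() has to visit all eight bytes to
report 3.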

> And I don't want to add QUtf8String until SG16's char8_t gets settled. It'll 
> probably be settled by C++20, which means we can probably work on this during 
> Qt 6 lifetime, possibly even 6.1 or 6.2.

It makes sense to avoid future incompatibilities with the standard, but 
fortunately Qt sometimes chooses to solve real problems ahead of time ;-)





[Development] Qt infra "weekly" report

2019-01-23 Thread Tony Sarajärvi
Hi

State of the CI

  *   Currently all OK again. We had issues with a Coin update breaking the
code that re-runs flaky tests, but that was caught and fixed. Google also
changed the way Android’s SDK manager is used, so we had to create a patch for
all branches. At the time of writing, 5.12 is fixed and dev is being built.
  *   The SSDs were installed in the Compellent, so we can proceed with a few
minor changes we had planned. We should no longer have to use the excuse that
we are running out of disk space, as long as we grow the disks on the servers.
  *   Last evening we moved the third big set of infra over to the new
firewall. After a few hours of downtime the new firewall picked up the load.
For users, the visible effect is that our public IP address changed for those
behind the new firewall.

Cheers!
-Tony


Re: [Development] HEADS-UP: Branching from '5.12' to '5.12.1' started

2019-01-23 Thread ekke
seems this time it's a rocky path to release Qt 5.12.1

On 08.01.19 at 08:32, Jani Heikkinen wrote:
> Hi all,
>
> Branching from '5.12' -> '5.12.1' is finally (almost) done; qtbase and
> qttools are still ongoing: there were some conflicts, so a manual merge +
> normal integration is needed there. Work is ongoing and should be ready
> today.
>
> So from now on '5.12.1' is for the upcoming Qt 5.12.1 release, and staging
> there is restricted to the release team only. '5.12' is now for Qt 5.12.2.
>
> The target is to get Qt 5.12.1 out as soon as possible. We will create the
> initial changes files soon, so please finalize those immediately so that we
> can proceed with the release. The release blocker list is here:
> https://bugreports.qt.io/issues/?filter=20430 Please make sure all release
> blockers are visible in the list!
>
> We will release the first snapshot for Qt 5.12.1 this week as well.
>
> br,
> Jani
> 
> From: Development  on behalf of Jani 
> Heikkinen 
> Sent: Monday, January 7, 2019 7:37 AM
> To: Aleksey Kontsevich; ekke; developm...@lists.qt-project.org
> Subject: Re: [Development] HEADS-UP: Branching from '5.12' to '5.12.1'
> started
>
>> -----Original Message-----
>> From: Development  On Behalf Of
>> Aleksey Kontsevich
>> Sent: Saturday, 5 January 2019 12:23
>> To: ekke ; developm...@lists.qt-project.org
>> Subject: Re: [Development] HEADS-UP: Branching from '5.12' to '5.12.1'
>> started
>>
>> Hi,
>>
>> What about QTBUG-72687 and QTBUG-72227?
>>
> Neither QTBUG-72687 nor QTBUG-72227 is really a release blocker, sorry.
>
> br,
> Jani
