Re: [Development] New QUrl implementation

2012-01-23 Thread lars.knoll
On 1/23/12 6:47 PM, "ext David Faure"  wrote:

>On Wednesday 28 December 2011 16:43:10 David Faure wrote:
>> On Tuesday 27 December 2011 12:24:23 Thiago Macieira wrote:
>> > On Tuesday, 27 de December de 2011 13.42.10, David Faure wrote:
>> > > On Friday 23 December 2011 17:22:38 Thiago Macieira wrote:
>> > > > QUrl: Always lowercase the scheme
>> > > > Change-Id: I8d467014d22384f1be15fdd746e20b1153a82a4e
>> > > > 
>> > > > Do we want this? I think so.
>> > > 
>> > > I would say yes, too. This simplifies the checks in application
>>code.
>> > > 
>> > > PS: great work on QUrl!
>> > > 
>> > > Isn't this going to go to gerrit for review before merging? I'm
>> > > surprised
>> > > that this is in gitorious branches for now, but I suppose that's
>>because
>> > > topic branches don't really work there iirc.
>> > 
>> > I'm not allowed to submit to Gerrit. I can only publish as open
>>source.
>> 
>> Hmm, so what now? How do we get this into Qt?
>> 
>> (I suspect that this is a touchy legal issue I probably shouldn't dig
>>into,
>> but this has a huge effect on kde frameworks 5, so I need to know if
>>this is
>> going to be solved...)
>
>After more discussion with Thiago, it turns out that Intel still hasn't
>signed 
>the CLA that would allow him to contribute to Qt, which is why this work
>is 
>blocked at the moment.
>
>It looks like this might take a while still, but we really need these
>changes 
>into Qt. If this is delayed too much, we won't be able to get it in at
>all, 
>after the SC freeze.
>
>So I have a proposal to make: how about I make the API and behavior
>incompatible changes now, while there's still time, so that later on
>Thiago 
>can still commit his work even after the api freeze?
>
>In concrete terms, the plan would be that I change the non-encoded
>methods, 
>like toString and the QUrl(QString) constructor, so that they work on
>partially-encoded data.
>This would fix the long-standing issue that toString() is basically
>unusable, 
>except for displaying to the user in non-editable format. If a file
>contains a 
>'#' in the name, toString doesn't escape it, so if that string is given
>back 
>to a QUrl, it won't be parsed correctly.
>Thiago's code rewrites the whole qurl internals, but as a short-term
>solution 
>to get the API and behavior in place, I could just drop the QString
>members in 
>the current Private, and pass the encoded form through some "pretty
>decoding" 
>code. This would be slower, but the behavior would be fixed, and we can
>make it 
>faster later.
>
>After that we can deprecate all the "encoded" methods, but that in itself
>doesn't need to be done before the freeze, does it? It's SC/BC
>technically, 
>like any deprecation that will happen for 5.1 or later.
>
>I would prefer not to have to do this redundant work, but it seems like a
>better solution than having to live with Qt4's QUrl for the whole life
>time of 
>Qt 5 (which also means keeping a KDE layer on top, to fix these issues,
>but I 
>really want to get rid of that).

Sounds ok to me, even though I hate seeing the duplicated work happen.

Lars

___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2012-01-23 Thread David Faure
On Wednesday 28 December 2011 16:43:10 David Faure wrote:
> On Tuesday 27 December 2011 12:24:23 Thiago Macieira wrote:
> > On Tuesday, 27 de December de 2011 13.42.10, David Faure wrote:
> > > On Friday 23 December 2011 17:22:38 Thiago Macieira wrote:
> > > > QUrl: Always lowercase the scheme
> > > > Change-Id: I8d467014d22384f1be15fdd746e20b1153a82a4e
> > > > 
> > > > Do we want this? I think so.
> > > 
> > > I would say yes, too. This simplifies the checks in application code.
> > > 
> > > PS: great work on QUrl!
> > > 
> > > Isn't this going to go to gerrit for review before merging? I'm
> > > surprised
> > > that this is in gitorious branches for now, but I suppose that's because
> > > topic branches don't really work there iirc.
> > 
> > I'm not allowed to submit to Gerrit. I can only publish as open source.
> 
> Hmm, so what now? How do we get this into Qt?
> 
> (I suspect that this is a touchy legal issue I probably shouldn't dig into,
> but this has a huge effect on kde frameworks 5, so I need to know if this is
> going to be solved...)

After more discussion with Thiago, it turns out that Intel still hasn't signed 
the CLA that would allow him to contribute to Qt, which is why this work is 
blocked at the moment.

It looks like this might take a while still, but we really need these changes 
into Qt. If this is delayed too much, we won't be able to get it in at all, 
after the SC freeze.

So I have a proposal to make: how about I make the API and behavior 
incompatible changes now, while there's still time, so that later on Thiago 
can still commit his work even after the api freeze?

In concrete terms, the plan would be that I change the non-encoded methods, 
like toString and the QUrl(QString) constructor, so that they work on 
partially-encoded data.
This would fix the long-standing issue that toString() is basically unusable, 
except for displaying to the user in non-editable format. If a file contains a 
'#' in the name, toString doesn't escape it, so if that string is given back 
to a QUrl, it won't be parsed correctly.
Thiago's code rewrites the whole qurl internals, but as a short-term solution 
to get the API and behavior in place, I could just drop the QString members in 
the current Private, and pass the encoded form through some "pretty decoding" 
code. This would be slower, but the behavior would be fixed, and we can make it 
faster later.

After that we can deprecate all the "encoded" methods, but that in itself 
doesn't need to be done before the freeze, does it? It's SC/BC technically, 
like any deprecation that will happen for 5.1 or later.

I would prefer not to have to do this redundant work, but it seems like a 
better solution than having to live with Qt4's QUrl for the whole life time of 
Qt 5 (which also means keeping a KDE layer on top, to fix these issues, but I 
really want to get rid of that).

-- 
David Faure, fa...@kde.org, http://www.davidfaure.fr
Sponsored by Nokia to work on KDE, incl. KDE Frameworks 5

___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2012-01-04 Thread Thiago Macieira
On Wednesday, 4 de January de 2012 19.10.39, Giuseppe D'Angelo wrote:
> On 4 January 2012 18:49,   wrote:
> > On 04/01/2012 10:48, ext Thiago Macieira wrote:
> >> Highlights:
> >>
> >> * EVERYTHING tested is faster
> >>(though unfortunately nothing reaches 1 robe)
> >
> > For the uninitiated, 1 robe = 10x performance increase, since Roberto
> > has the uncanny ability to speed up his parser performance by 10x every
> > time he does a rewrite :)
>
> Is it linear or logarithmic? :)

I advocated for a logarithmic scale, so that 2 robe equals to 100x faster, but
others have used it linearly.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden


signature.asc
Description: This is a digitally signed message part.
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2012-01-04 Thread Giuseppe D'Angelo
On 4 January 2012 18:49,   wrote:
> On 04/01/2012 10:48, ext Thiago Macieira wrote:
>> Highlights:
>>
>> * EVERYTHING tested is faster
>>    (though unfortunately nothing reaches 1 robe)
>
> For the uninitiated, 1 robe = 10x performance increase, since Roberto
> has the uncanny ability to speed up his parser performance by 10x every
> time he does a rewrite :)

Is it linear or logarithmic? :)

-- 
Giuseppe D'Angelo
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2012-01-04 Thread marius.storm-olsen
On 04/01/2012 10:48, ext Thiago Macieira wrote:
> Highlights:
>
> * EVERYTHING tested is faster
>(though unfortunately nothing reaches 1 robe)

For the uninitiated, 1 robe = 10x performance increase, since Roberto 
has the uncanny ability to speed up his parser performance by 10x every 
time he does a rewrite :)

-- 
.marius
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2012-01-04 Thread Thiago Macieira
On Wednesday, 4 de January de 2012 01.53.44, martin.jo...@nokia.com wrote:
> >From the QtDeclarative point of view, apart from having our tests pass, I'd
> >like to see some comparisons in QML load time, using photoviewer as a test
> >case, for example.
> Looks like performance was an important consideration in the new
> implementation, but I'd like to see confirmation :-)

I've run all the benchmarks I could on the photoviewer example. I managed to
crash my system twice while doing hardware event collection.

The impact of the QUrl changes in the example is: inconclusive

QUrl's contribution to the example is negligible in both cases. The number of
samples collected in QUrl functions is way too small to enable any kind of
insight. I cannot even conclude which functions were *called*.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden

signature.asc
Description: This is a digitally signed message part.
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2012-01-03 Thread martin.jones
>From the QtDeclarative point of view, apart from having our tests pass, I'd 
>like to see some comparisons in QML load time, using photoviewer as a test 
>case, for example.

Looks like performance was an important consideration in the new 
implementation, but I'd like to see confirmation :-)

Br,
Martin.

> -Original Message-
> From: development-bounces+martin.jones=nokia@qt-project.org
> [mailto:development-bounces+martin.jones=nokia@qt-project.org]
> On Behalf Of ext Thiago Macieira
> Sent: Wednesday, December 28, 2011 12:24 AM
> To: development@qt-project.org
> Subject: Re: [Development] New QUrl implementation
> 
> On Tuesday, 27 de December de 2011 13.42.10, David Faure wrote:
> > On Friday 23 December 2011 17:22:38 Thiago Macieira wrote:
> > > QUrl: Always lowercase the scheme
> > > Change-Id: I8d467014d22384f1be15fdd746e20b1153a82a4e
> > >
> > > Do we want this? I think so.
> >
> > I would say yes, too. This simplifies the checks in application code.
> >
> > PS: great work on QUrl!
> >
> > Isn't this going to go to gerrit for review before merging? I'm
> > surprised that this is in gitorious branches for now, but I suppose
> > that's because topic branches don't really work there iirc.
> 
> I'm not allowed to submit to Gerrit. I can only publish as open source.
> 
> > What I would like to do with your new QUrl (after this week's
> > vacations) is to import it into kdelibs-frameworks, to port kurltest
> > to it and see what breaks. Or did you do that already locally? You
> > have my permission to import any tests from kurltests that I wrote,
> > into the qurl auto tests, of course.
> 
> No, I haven't. I've only run the QUrl-related tests in qtbase, so not even the
> QtDeclarative tests.
> 
> --
> Thiago Macieira - thiago.macieira (AT) intel.com
>   Software Architect - Intel Open Source Technology Center
>  Intel Sweden AB - Registration Number: 556189-6027
>  Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2011-12-28 Thread David Faure
On Tuesday 27 December 2011 12:24:23 Thiago Macieira wrote:
> On Tuesday, 27 de December de 2011 13.42.10, David Faure wrote:
> > On Friday 23 December 2011 17:22:38 Thiago Macieira wrote:
> > > QUrl: Always lowercase the scheme
> > > Change-Id: I8d467014d22384f1be15fdd746e20b1153a82a4e
> > > 
> > > Do we want this? I think so.
> > 
> > I would say yes, too. This simplifies the checks in application code.
> > 
> > PS: great work on QUrl!
> > 
> > Isn't this going to go to gerrit for review before merging? I'm surprised
> > that this is in gitorious branches for now, but I suppose that's because
> > topic branches don't really work there iirc.
> 
> I'm not allowed to submit to Gerrit. I can only publish as open source.

Hmm, so what now? How do we get this into Qt?

(I suspect that this is a touchy legal issue I probably shouldn't dig into, 
but this has a huge effect on kde frameworks 5, so I need to know if this is 
going to be solved...)

-- 
David Faure | david.fa...@kdab.com | KDE/Qt Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, Sweden (HQ) +46-563-540090
KDAB - Qt Experts - Platform-independent software solutions

___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2011-12-27 Thread Thiago Macieira
On Tuesday, 27 de December de 2011 13.42.10, David Faure wrote:
> On Friday 23 December 2011 17:22:38 Thiago Macieira wrote:
> > QUrl: Always lowercase the scheme
> > Change-Id: I8d467014d22384f1be15fdd746e20b1153a82a4e
> >
> > Do we want this? I think so.
>
> I would say yes, too. This simplifies the checks in application code.
>
> PS: great work on QUrl!
>
> Isn't this going to go to gerrit for review before merging? I'm surprised
> that this is in gitorious branches for now, but I suppose that's because
> topic branches don't really work there iirc.

I'm not allowed to submit to Gerrit. I can only publish as open source.

> What I would like to do with your new QUrl (after this week's vacations) is
> to import it into kdelibs-frameworks, to port kurltest to it and see what
> breaks. Or did you do that already locally? You have my permission to
> import any tests from kurltests that I wrote, into the qurl auto tests, of
> course.

No, I haven't. I've only run the QUrl-related tests in qtbase, so not even the
QtDeclarative tests.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden


signature.asc
Description: This is a digitally signed message part.
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2011-12-27 Thread David Faure
On Friday 23 December 2011 17:22:38 Thiago Macieira wrote:
> QUrl: Always lowercase the scheme
> Change-Id: I8d467014d22384f1be15fdd746e20b1153a82a4e
> 
> Do we want this? I think so.

I would say yes, too. This simplifies the checks in application code.

PS: great work on QUrl!

Isn't this going to go to gerrit for review before merging? I'm surprised that 
this is in gitorious branches for now, but I suppose that's because topic 
branches don't really work there iirc.

What I would like to do with your new QUrl (after this week's vacations) is to 
import it into kdelibs-frameworks, to port kurltest to it and see what breaks. 
Or did you do that already locally? You have my permission to import any tests 
from kurltests that I wrote, into the qurl auto tests, of course.

-- 
David Faure | david.fa...@kdab.com | KDE/Qt Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, Sweden (HQ) +46-563-540090
KDAB - Qt Experts - Platform-independent software solutions

___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2011-12-23 Thread Thiago Macieira
On Friday, 23 de December de 2011 11.29.34, Thiago Macieira wrote:
> https://qt.gitorious.org/~thiago-intel/qt/thiago-intels-qtbase/commits/new-
> qurl

Convert tst_qurl.cpp from Latin-1 to UTF-8
Change-Id: If21658bd5e4af0fdcc403b054fc1b8f46b5fcfb0

This was a very, very old file, encoded in Latin 1. It was created before we
came to our senses. By the way, loading it in Qt Creator in Latin 1 mode
exposes a bug with dead keys on X11.

Move the QByteArray-based percent-encoding activities to QByteArray
Change-Id: I6639ea566d82dabeb91280177a854e89e18f6f8d

The new URL recoder does not operate on 8-bit, but on 16-bit. So all the 8-bit
percent-encoding functions and tests are moved to QByteArray and its unit
test. Previously, this commit deprecated QUrl::fromPercentEncoded and
QUrl::toPercentEncoded. Based on our discussion on the mailing list, I've un-
deprecated them.

Mark QUrl::{to,from}Punycode as deprecated since 5.0
Change-Id: I9b95b4bd07b4425344a5c6ef5cce7cfcb9846d3e

These functions are now aliases to {to,from}Ace, which are usually
what you want. The original functions from Qt 4.0 had the wrong
semantics and wrong name. The new ones from Qt 4.2 execute the ACE
processing from IDNA (specifically, the ToASCII and ToUnicode
operations described in the RFC).

Add the code that recodes URLs.
Change-Id: I997eccfd2f9ad8487305670b18d6c806f4cf6717
and
Remove the tolerant parsing function and make the recoder tolerant
Change-Id: I68a0c9fda6765de05914cbd6ba7d3cea560a7cd6

This is the core of the new QUrl functionality, which is the URL recoder. I
call it "recoder" because it doesn't encode or decode, it simply transforms
from different styles of encoding. As I expressed many moons ago, there is no
more "fully-decoded" mode.

It's in two commits because the first implementation was strict, with a
tolerant front-end. The new implementation is tolerant, which is the 99% use-
case. The strict mode, which I still haven't implemented, will be done by
doing checks on the data.

Long live QUrlQuery
Change-Id: Ia61096fe5060b486196ffb8532e7494eff58fec1

The new QUrlQuery class, which deals with the query in a URL. This allows QUrl
to be simpler and handle the query as one single string.

The next three commits deal with updating the recoder so that it's more
efficient with the use-case intended.

Then we have:
Change QUrlQuery to keep the order
Change-Id: Id4ca02130fd6235b32fb971a8b448211e609986a

As discussed previously, keeping the order is important. This commit
implements that.

I can squash the commit series above so we have only one adding the recoder
and only one adding QUrlQuery.

Then the work in QUrl starts:

Move some of qurl.cpp ino other files for ease of maintenance
Change-Id: I0b32c0bf0ee6c2f08dc3200c44af3c9d1504a3df

The parsing code is now in qurlparser.cpp, whereas the IDNA related
code is in qurlidna.cpp.

Remove the methods dealing with the break-down of the query from QUrl
Change-Id: I34820b3101424593d0715841a2057ac3f74d74f0

Long live the new QUrl implementation.
Change-Id: I6d340b19c1a11b98a48145152513ffec58fb3fe3

The main implementation

QUrl: Always lowercase the scheme
Change-Id: I8d467014d22384f1be15fdd746e20b1153a82a4e

Do we want this? I think so.

Then there are a couple more commits.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden


signature.asc
Description: This is a digitally signed message part.
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2011-12-23 Thread Thiago Macieira
On Friday, 23 de December de 2011 11.29.34, Thiago Macieira wrote:
> https://qt.gitorious.org/~thiago-intel/qt/thiago-intels-qtbase/commits/new-
> qurl-preparation

Move the UTF-8 data into a separate .cpp so I can use later
Icb909d08b7f8d0e1ffbc28e01a0ba0c1fa9dccf0

tst_utf8.cpp contains some UTF-8 compliancy tests. This commit only moves them
to a separate file I can include in QUrl unit tests. The reason is that the new
QUrl recoder does UTF-8 on its own, so I wanted to make sure I wasn't
introducing new security issues.

Disable QUrl support in QVariant in bootstrapped mode
Ie156b0817d119b2ba5d3dcb9712a9fea2ee7d4a1

None of the bootstrapped tools require QUrl, so disable it from
libbootstrap.a. This reduces the size of those tools because the IDN Unicode
tables and QUrl parsing code are not necessary inside.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden


signature.asc
Description: This is a digitally signed message part.
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] New QUrl implementation

2011-12-23 Thread Thiago Macieira
On Friday, 23 de December de 2011 11.29.34, Thiago Macieira wrote:
> https://qt.gitorious.org/~thiago-intel/qt/thiago-intels-
> qtbase/commits/ipaddress-parser

This series of commits does the following:

24976c8 Add a function to parse IPv4 addresses in QtCore

There are two kinds of IPv4 address parsers in the wild:
inet_aton-style
inet_pton-style

The inet_aton is the oldest implementation, which allows IP addresses like
127.1 or 10.0.257. Those types of addresses are not usually considered valid
and inet_pton will not accept them. The old implementation we had in
qhostaddress.cpp was compliant with inet_pton. And we had a different
implementation in qurl.cpp following the grammar set forth in RFC 3986.

The new implementation I wrote is actually more lax and accepts IP addresses
like inet_aton. The unit test will compare the results between the two to
ensure we're following it. The reason I did so was because URL parsing very
often deals with very broken URLs. The unit tests in WebKit in fact do attempt
such broken addresses.

013a083 Add support for IPv6 parsing and reconstructing the address

Unlike the IPv4 implementation, the IPv6 one is strict. It's in fact stricter
than the previous one in QHostAddress because it doesn't accept a double-colon
(::) where none is necessary. The following IP address was previously accepted
but no longer is:

::::::::

It parses v4-compat and v4-mapped addresses but requires strict IPv4 addresses
(inet_pton style, not inet_aton style). It also reconstructs those mapped
addresses properly, which the old QUrl didn't.

c5c1650 Port QHostAddress to use the new IP utilities in QtCore

Replace the old IP-address parsing in QHostAddress with the one in QtCore,
which is way more efficient. I didn't benchmark by how much, but anyone doing a
cursory comparison can tell.

63e0c52 Optimise QHostAddress a little

Some extra work to make it faster, avoiding parsing of constant strings.
Instead, simply use the known binary form.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden


signature.asc
Description: This is a digitally signed message part.
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


[Development] New QUrl implementation

2011-12-23 Thread Thiago Macieira
I've uploaded it at:

https://qt.gitorious.org/~thiago-intel/qt/thiago-intels-qtbase/commits/new-
qurl

It requires two more branches:

https://qt.gitorious.org/~thiago-intel/qt/thiago-intels-qtbase/commits/new-
qurl-preparation

https://qt.gitorious.org/~thiago-intel/qt/thiago-intels-
qtbase/commits/ipaddress-parser
--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden


signature.asc
Description: This is a digitally signed message part.
___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development