Re: [GSoC] IUri Implementations

2010-04-08 Thread Jacek Caban

Hi Thomas,

On 4/8/10 3:43 AM, Thomas Mullaly wrote:


>> In general, the idea looks right; that's probably how it should be
>> implemented. This is an implementation detail, though. The bigger and
>> more important problem is URI parsing and canonicalization. That's
>> where most of the work needs to be done. In this case tests will also
>> be very important. You don't know how it should work until you have a
>> test. The first step would be to write a test infrastructure and some
>> tests (adding a new test shouldn't be harder than filling a table with
>> more data). Once that's done, you'll be able to decide on the best way
>> to implement the parser and the IUri interface. The project should
>> result in many tests and good support for at least the more useful
>> flags and IUri functions.


> Hi Jacek,
>
> Sorry for my delayed response. Thank you for your suggestions.
>
> For the testing infrastructure, I was thinking about writing a few
> Windows programs that use Microsoft's IUri implementation to generate
> the results that my testing infrastructure would use to make sure my
> implementation is working correctly. Is this the right approach or
> would you recommend doing it another way?


Tests should be integrated with the Wine tests. See dlls/shlwapi/tests/url.c
and dlls/wininet/tests/url.c for an idea of how it should be done.


> Also, I have finished a rough draft of my proposal and I was wondering
> if it would be appropriate to post it to the mailing list in order to
> receive feedback from you and others.


If you have specific questions, feel free to ask here. The proposal itself
should be posted to the GSoC app. It supports editing proposals and
getting feedback.



Jacek



Re: [GSoC] IUri Implementations

2010-04-08 Thread Thomas Mullaly




Hi Jacek,
> Tests should be integrated with the Wine tests. See
> dlls/shlwapi/tests/url.c and dlls/wininet/tests/url.c for an idea of
> how it should be done.

Thank you, shlwapi's url test was very helpful and has given me some
good ideas for testing.

> If you have specific questions, feel free to ask here. The proposal
> itself should be posted to the GSoC app. It supports editing proposals
> and getting feedback.


Here's the link to my proposal:
http://socghop.appspot.com/gsoc/student_proposal/show/google/gsoc2010/thomas_mullaly/t127066435448
I would greatly appreciate it if anybody would check it out and leave
feedback or suggestions.







Re: [GSoC] IUri Implementations

2010-04-07 Thread Thomas Mullaly

> In general, the idea looks right; that's probably how it should be
> implemented. This is an implementation detail, though. The bigger and
> more important problem is URI parsing and canonicalization. That's where
> most of the work needs to be done. In this case tests will also be very
> important. You don't know how it should work until you have a test. The
> first step would be to write a test infrastructure and some tests (adding
> a new test shouldn't be harder than filling a table with more data). Once
> that's done, you'll be able to decide on the best way to implement the
> parser and the IUri interface. The project should result in many tests
> and good support for at least the more useful flags and IUri functions.


Hi Jacek,

Sorry for my delayed response. Thank you for your suggestions.

For the testing infrastructure, I was thinking about writing a few Windows
programs that use Microsoft's IUri implementation to generate the results
that my testing infrastructure would use to make sure my implementation is
working correctly. Is this the right approach or would you recommend doing
it another way?

Also, I have finished a rough draft of my proposal and I was wondering if it
would be appropriate to post it to the mailing list in order to receive
feedback from you and others.



Re: [GSoC] IUri Implementations

2010-03-31 Thread Jacek Caban
Hi Thomas,

On 03/31/2010 04:15 AM, Thomas Mullaly wrote:

>> You could use a dynamic array for that, or a list with a Uri_PROPERTY
>> value as the key, for example, and the offset and length as the data.
>> Another way is to compute each property offset only when it's
>> requested and store it. An obvious bad case for that is a long URI.
>> So a one-pass property computation while building the IUri instance
>> is probably not bad.

> I like the idea of making a lightweight data structure which stores
> the offset and length for each component property. I'd imagine it
> would look something like this:
>
> typedef struct {
>     DWORD offset;
>     DWORD length;
> } UriComponent;
>
> How to store the UriComponents is a little trickier, but I have a few
> ideas, and suggestions are welcome.
>
> I do like the idea of using an array inside the Uri struct to store
> the UriComponents, but not all of the values in the Uri_PROPERTY enum
> actually mean anything (at least that's what I have gathered from
> reading the MSDN docs). For example, Uri_PROPERTY_STRING_START and
> Uri_PROPERTY_STRING_LAST are just there to say that all the enum values
> between >= START and <= LAST correspond to the string components of
> the URI.
>
> So I'm thinking the Uri struct should have a constant size array of
> UriComponents of length Uri_PROPERTY_STRING_LAST (which would be 15..
> correct me if I'm wrong).
>
> So it would look something like...
>
> typedef struct {
>     /** The other stuff */
>
>     BSTR *uri;
>     UriComponent components[15];
> } Uri;
>
> and then for the GetPropertyBSTR(BSTR *component, Uri_PROPERTY prop)
> function you could just have something like:
>
> if(prop >= Uri_PROPERTY_STRING_START && prop <= Uri_PROPERTY_STRING_LAST) {
>     UriComponent comp;
>     comp = uri->components[prop];
>
>     /** Parse the component out */
> }
>
> And that should get you the necessary offsets and lengths for the
> component you need.
>
> I also like the earlier suggestion of using a one-pass solution to
> find everything when the Uri is constructed.
>
>
> Thank you for the quick responses and suggestions, I hope to have a
> proposal ready in the next few days.

In general, the idea looks right; that's probably how it should be
implemented. This is an implementation detail, though. The bigger and
more important problem is URI parsing and canonicalization. That's where
most of the work needs to be done. In this case tests will also be very
important. You don't know how it should work until you have a test. The
first step would be to write a test infrastructure and some tests (adding
a new test shouldn't be harder than filling a table with more data). Once
that's done, you'll be able to decide on the best way to implement the
parser and the IUri interface. The project should result in many tests
and good support for at least the more useful flags and IUri functions.
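
To make the "filling a table" idea concrete, here is a minimal sketch in plain C. Everything in it is hypothetical: the uri_test struct, the extract_scheme stand-in and run_uri_tests are illustration only. A real Wine test would call the actual IUri implementation and report results through the test framework instead of counting failures by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One row per test case: adding a test means adding a row. */
struct uri_test {
    const char *uri;             /* input URI */
    const char *expected_scheme; /* expected scheme, "" if there is none */
};

static const struct uri_test uri_tests[] = {
    { "http://www.winehq.org/",    "http" },
    { "ftp://ftp.example.org/pub", "ftp"  },
    { "/relative/path",            ""     },
};

/* Hypothetical stand-in for the code under test: copies the text before
 * the first ':' into out, or an empty string if there is no ':'. */
static void extract_scheme(const char *uri, char *out, size_t out_size)
{
    const char *colon = strchr(uri, ':');
    size_t len = colon ? (size_t)(colon - uri) : 0;

    assert(out_size > 0);
    if (len >= out_size) len = out_size - 1;
    memcpy(out, uri, len);
    out[len] = '\0';
}

/* Runs every row of the table and returns the number of failures. */
static int run_uri_tests(void)
{
    int failures = 0;
    size_t i;

    for (i = 0; i < sizeof(uri_tests) / sizeof(uri_tests[0]); i++) {
        char scheme[32];

        extract_scheme(uri_tests[i].uri, scheme, sizeof(scheme));
        if (strcmp(scheme, uri_tests[i].expected_scheme)) {
            printf("test %u: expected \"%s\", got \"%s\"\n", (unsigned)i,
                   uri_tests[i].expected_scheme, scheme);
            failures++;
        }
    }
    return failures;
}
```

The point of the layout is that the loop never changes; new expectations (host, path, flags and so on) become new columns, and new cases become new rows.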

Thanks,
Jacek




[GSoC] IUri Implementations

2010-03-30 Thread Thomas Mullaly
Hi, my name is Thomas Mullaly and I am an undergraduate in the Computer
Science department at Kent State University, and I would very much like
to participate in this year's GSoC. I saw on your project ideas page
that the IUri API still needs to be implemented and I thought that this
would be a good project for me, but before I submit a proposal on it I
have a few questions about the project itself.


Firstly, the project page says that the main goal is to have the
IUri interface and CreateUri function implemented, but MSDN also
documents functions and interfaces for creating/manipulating
IUriBuilders, and I was wondering if these were also part of the project
goals. If not, can they be, or would this be too ambitious to finish
by the end of the summer?



Secondly (more of a design question), I see that the Uri structure and
functions are already stubbed out in the dlls/urlmon/uri.c file. For my
implementation I was thinking of adding another BSTR* member to the Uri
struct, which will point to the encoded version of the URI (generated
during the CreateUri() call). Since most of the functions that interact
with the IUri return components of the URI (e.g. scheme, host, query,
etc.), I was also thinking about adding more data members to the Uri
struct which store the location in the encoded URI string where each
component exists (or -1 if it does not exist). This would reduce the
runtimes of the IUri functions, since each function will already know
where to look inside the encoded string for the component it needs. A
drawback to this design is that each Uri struct will be bloated with a
decent number of ints which may or may not be used, depending on the
type of URI that the IUri represents.

The second approach I was thinking of is to not store any locations
inside the Uri struct and to compute them on the fly every time the IUri
is queried for one of its components. This would result in a smaller
memory footprint for the Uri structure but would increase the runtimes
of all the functions that access the URI. I was wondering if anyone has
suggestions for which way might be better.


Any input will be greatly appreciated!


-Thomas Mullaly




Re: [GSoC] IUri Implementations

2010-03-30 Thread Nikolay Sivov

On 3/31/2010 02:57, Thomas Mullaly wrote:

> Hi, my name is Thomas Mullaly and I am an undergraduate in the Computer
> Science department at Kent State University, and I would very much like
> to participate in this year's GSoC. I saw on your project ideas page
> that the IUri API still needs to be implemented and I thought that this
> would be a good project for me, but before I submit a proposal on it
> I have a few questions about the project itself.

Hi, Thomas, and welcome.


> Firstly, the project page says that the main goal is to have the
> IUri interface and CreateUri function implemented, but MSDN also
> documents functions and interfaces for creating/manipulating
> IUriBuilders, and I was wondering if these were also part of the
> project goals. If not, can they be, or would this be too ambitious to
> finish by the end of the summer?
Right, a complete IUri with corresponding tests will be enough for a
summer project, I think. After a brief look at IUriBuilder, I don't think
it depends much on IUri implementation details. For IUriBuilder, one way
I see is to track changed properties and store only the new data, using
the unchanged properties from the supplied IUri, but this needs some tests
(for example, whether it keeps a reference to the IUri or not).
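
A rough sketch of that change-tracking idea, in plain C rather than COM. The UriBuilderSketch type, the B_* indices and both functions are hypothetical names for illustration; this says nothing about how the real IUriBuilder behaves, which is exactly what the tests would have to establish.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical property indices; a real builder would cover more. */
enum { B_SCHEME, B_HOST, B_PATH, B_COUNT };

typedef struct {
    const char *base[B_COUNT];      /* values taken from the supplied URI */
    const char *override[B_COUNT];  /* new values, meaningful only if the bit is set */
    unsigned    modified;           /* bit i set => property i was changed */
} UriBuilderSketch;

/* Records a new value and marks the property as changed.
 * The pointer is stored as-is; a real builder would copy the string. */
static void builder_set(UriBuilderSketch *b, int prop, const char *value)
{
    assert(b && prop >= 0 && prop < B_COUNT);
    b->override[prop] = value;
    b->modified |= 1u << prop;
}

/* Returns the changed value if there is one, else the base value. */
static const char *builder_get(const UriBuilderSketch *b, int prop)
{
    assert(b && prop >= 0 && prop < B_COUNT);
    return (b->modified & (1u << prop)) ? b->override[prop] : b->base[prop];
}
```

Only the changed properties cost extra storage, and the bitmask makes it cheap to tell whether the builder differs from the URI it started with.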



> Secondly (more of a design question), I see that the Uri structure and
> functions are already stubbed out in the dlls/urlmon/uri.c file. For my
> implementation I was thinking of adding another BSTR* member to the Uri
> struct, which will point to the encoded version of the URI (generated
> during the CreateUri() call). Since most of the functions that interact
> with the IUri return components of the URI (e.g. scheme, host, query,
> etc.), I was also thinking about adding more data members to the Uri
> struct which store the location in the encoded URI string where each
> component exists (or -1 if it does not exist). This would reduce the
> runtimes of the IUri functions, since each function will already know
> where to look inside the encoded string for the component it needs. A
> drawback to this design is that each Uri struct will be bloated with a
> decent number of ints which may or may not be used, depending on the
> type of URI that the IUri represents.
>
> The second approach I was thinking of is to not store any locations
> inside the Uri struct and to compute them on the fly every time the
> IUri is queried for one of its components. This would result in a
> smaller memory footprint for the Uri structure but would increase the
> runtimes of all the functions that access the URI. I was wondering if
> anyone has suggestions for which way might be better.
You could use a dynamic array for that, or a list with a Uri_PROPERTY value
as the key, for example, and the offset and length as the data. Another way
is to compute each property offset only when it's requested and store it.
An obvious bad case for that is a long URI. So a one-pass property
computation while building the IUri instance is probably not bad.
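
For comparison, the compute-on-request-and-store variant might look roughly like this for a single property. This is a sketch in plain C with hypothetical names (LazyUri, lazy_get_query); the scans counter exists only to show that the string is searched once per property, which is where the cost on a long URI comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *uri;
    int      query_cached;   /* nonzero once the lookup has run */
    int      query_present;  /* nonzero if the URI has a query */
    size_t   query_offset;   /* first character after '?' */
    size_t   query_length;   /* characters up to '#' or the end */
    unsigned scans;          /* illustration only: number of searches done */
} LazyUri;

/* Finds the query on first use and remembers the result.
 * Returns nonzero if a query is present. */
static int lazy_get_query(LazyUri *u, size_t *offset, size_t *length)
{
    assert(u && offset && length);

    if (!u->query_cached) {
        const char *q = strchr(u->uri, '?');
        const char *hash = strchr(u->uri, '#');

        u->scans++;
        /* A '?' that appears after '#' belongs to the fragment. */
        if (q && (!hash || q < hash)) {
            u->query_present = 1;
            u->query_offset = (size_t)(q - u->uri) + 1;
            u->query_length = strcspn(q + 1, "#");
        }
        u->query_cached = 1;
    }

    *offset = u->query_offset;
    *length = u->query_length;
    return u->query_present;
}
```

Each property would need its own search the first time it is asked for, so a caller that touches several properties of a long URI pays for several scans instead of one.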


Waiting for Jacek's comments.


> Any input will be greatly appreciated!
>
>
> -Thomas Mullaly









Re: [GSoC] IUri Implementations

2010-03-30 Thread Thomas Mullaly


> You could use a dynamic array for that, or a list with a Uri_PROPERTY
> value as the key, for example, and the offset and length as the data.
> Another way is to compute each property offset only when it's requested
> and store it. An obvious bad case for that is a long URI. So a one-pass
> property computation while building the IUri instance is probably not
> bad.


I like the idea of making a lightweight data structure which stores the 
offset and length for each component property. I'd imagine it would look 
something like this:


typedef struct  {
   DWORD offset;
   DWORD length;
} UriComponent;

How to store the UriComponents is a little trickier, but I have a few
ideas, and suggestions are welcome.


I do like the idea of using an array inside the Uri struct to store the
UriComponents, but not all of the values in the Uri_PROPERTY enum
actually mean anything (at least that's what I have gathered from reading
the MSDN docs). For example, Uri_PROPERTY_STRING_START and
Uri_PROPERTY_STRING_LAST are just there to say that all the enum values
between >= START and <= LAST correspond to the string components of the
URI.


So I'm thinking the Uri struct should have a constant size array of 
UriComponents of length Uri_PROPERTY_STRING_LAST (which would be 15.. 
correct me if I'm wrong).


So it would look something like...

typedef struct {
   /** The other stuff */

   BSTR *uri;
   UriComponent components[15];
} Uri;

and then for the GetPropertyBSTR(BSTR *component, Uri_PROPERTY prop)
function you could just have something like:


if(prop >= Uri_PROPERTY_STRING_START && prop <= Uri_PROPERTY_STRING_LAST) {
   UriComponent comp;
   comp = uri->components[prop];

   /** Parse the component out */
}

And that should get you the necessary offsets and lengths for the 
component you need.
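
To flesh out the "parse the component out" step, here is a rough, portable sketch. It uses plain C strings and size_t instead of BSTR and DWORD so it compiles anywhere, and the PROP_* names, NO_COMPONENT and get_property_string are hypothetical stand-ins. As far as I can tell from urlmon.h, Uri_PROPERTY_STRING_START is 0 and Uri_PROPERTY_STRING_LAST is 14, so covering every string property takes 15 slots; the sketch only models three.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the string range of Uri_PROPERTY. */
enum { PROP_SCHEME, PROP_HOST, PROP_PATH, PROP_STRING_COUNT };

#define NO_COMPONENT ((size_t)-1)   /* offset used for an absent component */

typedef struct {
    size_t offset;  /* start of the component inside uri */
    size_t length;  /* number of characters in the component */
} UriComponent;

typedef struct {
    const char  *uri;                            /* canonicalized URI text */
    UriComponent components[PROP_STRING_COUNT];  /* filled in at creation */
} Uri;

/* Returns a newly allocated copy of the requested component, an empty
 * string if the component is absent, or NULL if prop is out of range or
 * allocation fails. The caller frees the result. */
static char *get_property_string(const Uri *uri, int prop)
{
    const UriComponent *comp;
    size_t len;
    char *ret;

    assert(uri != NULL);
    if (prop < 0 || prop >= PROP_STRING_COUNT)
        return NULL;

    comp = &uri->components[prop];
    len = comp->offset == NO_COMPONENT ? 0 : comp->length;

    ret = malloc(len + 1);
    if (!ret) return NULL;
    if (len) memcpy(ret, uri->uri + comp->offset, len);
    ret[len] = '\0';
    return ret;
}
```

With the offsets stored up front, a property lookup is a bounds check plus one copy, regardless of how long the URI is.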


I also like the earlier suggestion of using a one-pass solution to find
everything when the Uri is constructed.
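
A rough sketch of what such a pass could look like, assuming only URIs shaped like scheme://host/path?query#fragment. It ignores user info, ports and canonicalization entirely, and Span, the C_* indices and parse_spans are hypothetical names, so treat it as an illustration of the bookkeeping rather than a parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { C_SCHEME, C_HOST, C_PATH, C_QUERY, C_FRAGMENT, C_COUNT };

typedef struct {
    size_t offset;   /* start of the component in the URI string */
    size_t length;   /* number of characters in the component */
    int    present;  /* nonzero if the component was found */
} Span;

static void set_span(Span *s, size_t offset, size_t length)
{
    s->offset = offset;
    s->length = length;
    s->present = 1;
}

/* Walks the string left to right, recording where each component sits.
 * The only backtrack is to the start when no scheme is found. */
static void parse_spans(const char *uri, Span spans[C_COUNT])
{
    size_t i = 0, start;

    assert(uri && spans);
    memset(spans, 0, sizeof(Span) * C_COUNT);

    /* Scheme: characters before ':' with no '/', '?' or '#' in between. */
    while (uri[i] && uri[i] != ':' && uri[i] != '/' && uri[i] != '?' && uri[i] != '#')
        i++;
    if (uri[i] == ':' && i > 0) {
        set_span(&spans[C_SCHEME], 0, i);
        i++;                      /* skip ':' */
    } else {
        i = 0;                    /* no scheme: start over at the beginning */
    }

    /* Host: follows "//" and runs to the next '/', '?' or '#'. */
    if (uri[i] == '/' && uri[i + 1] == '/') {
        i += 2;
        start = i;
        while (uri[i] && uri[i] != '/' && uri[i] != '?' && uri[i] != '#')
            i++;
        set_span(&spans[C_HOST], start, i - start);
    }

    /* Path: runs to '?' or '#'. */
    start = i;
    while (uri[i] && uri[i] != '?' && uri[i] != '#')
        i++;
    if (i > start)
        set_span(&spans[C_PATH], start, i - start);

    /* Query: after '?', runs to '#'. */
    if (uri[i] == '?') {
        start = ++i;
        while (uri[i] && uri[i] != '#')
            i++;
        set_span(&spans[C_QUERY], start, i - start);
    }

    /* Fragment: after '#', runs to the end of the string. */
    if (uri[i] == '#') {
        start = ++i;
        set_span(&spans[C_FRAGMENT], start, strlen(uri + start));
    }
}
```

The present flag plays the role of the -1 marker from my first mail, which avoids reserving a magic offset value.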



Thank you for the quick responses and suggestions, I hope to have a 
proposal ready in the next few days.