Note that this email is sent to the Pharo mailing list. It is actually more appropriate since the moose mailing list members are also part of Pharo.

I have an update I will push in a day or two that extends namespace support to attributes and also preserves the order of attributes. Additionally, the tokenizer has been sped up somewhat. I have also renamed some of the SAX handlers, but if you have a subclass that implements them, they will still be invoked but with a warning that they have been deprecated.

Using deprecation is good

I am going back on using Symbols to identify elements and attributes. The performance of such a Symbol-based system is quite erratic; running benchmark1 several times (at a much higher interval), I repeatedly got results in the ~5800 to ~6200 range in a clean image with XML-Support loaded into it from Monticello. Saving the image and running it again, I got results in the range of ~4400 to ~4700. There seems to be a performance hit associated with using symbols in this fashion that does not abate until you save the image at least once. Even then my tests seem to indicate that a String- based system would be slightly faster. This is probably due to the initial overhead of interning Symbols and the subsequent overhead of looking them up.

Really strange. What make you think it is related to Symbol?

Cheers,
Alexandre


---- On Thu, 04 Mar 2010 12:35:39 -0800 Alexandre Bergel <alexan...@bergel.eu > wrote ----

Hi,

I exchanged a number of emails with Jaayer and Norbert regarding some
improvements of XMLSupport and its port to Gemstone.
It may be a bit difficult for people to follow this, but I think it is
important to not discuss privately.

I already changed

XMLTokenizer>>nextName
....
^ self fastStreamStringContents: nameBuffer

to

XMLTokenizer>>nextName
....
^ (self fastStreamStringContents: nameBuffer) asSymbol

in the gemstone parser to be more consistent.

Have you noticed any slow down for this?

No I didn't do any tests. But if internally all names are symbols
than I guess converting it while reading is the best way to do.

I added benchmark1 in XMLParserTest. Really simple way to measure
progress (or slowdown).
On my machine, I have:
XMLParserTest new benchmark1
=> 2097

Adding "(self fastStreamStringContents: nameBuffer) asSymbol"
increase the bench to 2273

I don't believe this ;) you read them as string from the stream. If
they are managed as symbols somehow they need to be converted. If
not at this place then on some other. I would suspect that there are
doubled calls to asSymbol. Could you check the sources?

There is indeed a slowdown. I am not sure where it comes from however.
Executing twice "XMLParserTest new benchmark1" does not return the
same result. Actually, it increases at each execution! I thought that
a garbage collect before running the bench would help, does apparently
it does not.

Calling asSymbol on a symbol should not be perceptible I believe.

Cheers,
Alexandre


anElement attributes class (I wrote species but that will fail in
gemstone too I guess. Hell!)

I just committed with species. Let me know. This is easy to adjust.

Ok.

The gemstone XML Parser decides somehow to use
IdentityDictionary internally. I think this should be allowed. I
just changed Dictionary to IdentityDictionary so the = test
reflects the right type. If you change it it is clear that it
fails. Because pharo uses Dictionary internally. So you might
see that it is not a question of using Dictionary or
IdentityDictionary but a question of the wrongness of using =

How the XMLNodeTest should look like to accommodate your situation?

I need to recheck this. The problem is really that the XML Parser
creatios instances of class Association but { #key->'value' }
creates an instance of class SymbolAssociation. That means I would
know how to fix the test but I want to understand the implications
of all of this. I'll get to you if I know anything new.

Ok.

My mail to the gemstone list led to a ticket about removing class
checks from Association. That would be easing the handling a lot.

Norbert


Alexandre


Here there is an assumption about the allAttributes collection
while using = as comparsion. But there is also an assumption
about the order of the content. I changed this to

self assert: (firstPerson allAttributes includesAllOf:
#(#'first-name' #'employee-number' #'family-name')).
self assert: (firstPerson allAttributeAssociations asArray
includesAllOf: {(#'first-name'->'Bob'). (#'employee-number'-
'A0000'). (#'family-name'->'Gates')}).

Very right. My mistake. But wouldn't an asSortedCollection do
the thing? Do you not test the size of the array.

This is not the best way to do because the check is only in
one direction but for this test it is ok. Somehow the second
assert fails and I have to check what is going on here.

Yeah, my mistake. Sorry. The elements may be differently
ordered. Would a asSortedCollection help?

I have now granted you an access to the repository. You should
be able to directly commit in it.

Jaayer, what is your Squeaksource account?

Cheers,
Alexandre




On 28.02.2010, at 02:01, Alexandre Bergel wrote:

thanks for now. I did a first merge attempt. It will be
quite a bit of work. For me the xml parser is an important
component. With the newest changes it became biased towards
pharo. There are things like ClassTestCase, Unicode
CharacterSet. These are for sure improvements/changes in
pharo you like to use. But they make porting a lot more
difficult. I would be glad if we could find some way to
lower the porting barrier. The necessary class I could put
in the squeak compat package in gemstone. But then the xml
parser will depend on the squeak package which I don't like.


Hi Norbert,

XMLParser effectively depends on Squeak specific classes. I
wrote a small script that identify the squeak classes used in
XML-Support. Here is the list: LanguageEnvironment, Unicode,
LocaleID, CharacterSet

I guess that porting the whole multilingual support may not
be that easy. The tag xml:lang is used to select the proper
support. It should be easy for you to ignore it I guess.

CharacterSet seems to be one that has to be ported. It is not
a big class. It depends on WideCharacterSet. I am not sure
whether this is useful in your case however.

Cheers,
Alexandre

--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.







--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.


--
_,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
Alexandre Bergel  http://www.bergel.eu
^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.






_______________________________________________
Pharo-project mailing list
Pharo-project@lists.gforge.inria.fr
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

Reply via email to